Nesting queries - sql

My query from the attached schema is asking me to look for the same location of where the people who tested positive went and were in the same people as the untested people. (Untested means the people not there in the testing table.
--find the same locations of where the positive people and the untested people went
select checkin.LocID, checkin.PersonID
from checkin join testing on checkin.personid = testing.personid
where results = 'Positive'
and (select CheckIn.PersonID
from checkin join testing on checkin.PersonID = testing.PersonID where CheckIn.PersonID
not in (select testing.PersonID from testing));
In my view the query is stating the following
To select a location and person from joining the checking and testing table and the result is positive and to select a person from the check in table who is not there in the testing table.
Since the answer I am getting is zero and I know manually there are people. What am I doing wrong?
I hope this makes sense.

You can get the people tested 'Positive' with this query:
select personid from testing where results = 'Positive'
and the untested people with:
select p.personid
from person p left join testing t
on t.personid = p.personid
where t.testingid is null
You must join to each of these queries a copy of checkin and these copies joined together:
select l.*
from (select personid from testing where results = 'Positive') p
inner join checkin cp on cp.personid = p.personid
inner join checkin cu on cu.lid = cp.lid
inner join (
select p.personid
from person p left join testing t
on t.personid = p.personid
where t.testingid is null
) pu on pu.personid = cu.personid
inner join location l on l.locationid = cu.lid

If what you want are the positive people who are at a location where there is also someone who is not tested, you might consider:
select ch.LocID,
group_concat(case when t.results = 'positive' then ch.PersonID end) as positive_persons
from checkin ch left join
testing t
on ch.personid = t.personid
group by ch.LocId
having sum(case when t.results = 'positive' then 1 else 0 end) > 0 and
count(*) <> count(t.personid); -- at least one person not tested
With this structure, you can get the untested people using:
group_concat(case when t.personid is null then ch.personid)

You have several mistakes (missing exists, independent subquery in exists). I believe that this should do the work
select ch1.LocID, ch1.PersonID
from checkin ch1
join testing t1 on ch1.personid = t1.personid
where results = 'Positive'
and exists (
select 1
from checkin ch2
where ch1.LocID = ch2.LocID and ch2.PersonID not in (
select testing.PersonID
from testing
)
);

Related

Time overlaps from Nesting queries

Based on the current schema I have been asked to find
-- people who were untested and exposed to some one infectious
-- Do not list anyone twice and do not list known sick people
-- Exposed = at the same place, and overlap in time (No overlap time needed for simplicity)
From the query below I find my answer except I cannot remove the people who are 'postive' because the second part my query i.e the time lapse depends on the first part i.e the time the positive people went to the same locations.
select * from (
select DISTINCT person.PersonID, Register.LocID, Register.Checkin, Register.CheckOut
from person
join Register on Person.PersonID = Register.PersonID
join testing on person.PersonID = testing.PersonID
where testing.Results is 'Positive' ) a
join (
SELECT DISTINCT Person.PersonID, Register.LocID , Register.Checkin, Register.CheckOut
from person join Register on Person.PersonID = Register.PersonID
where person.PersonID
not in (SELECT DISTINCT testing.PersonID from testing)) b on a.LocID = b.LocID
and b.checkin >= a.CheckIn and b.CheckIn <= a.CheckOut
So my question is, What modification does this query need to show the results of the results of the second part only?
I consider the first part to be
select * from (
select DISTINCT person.PersonID, Register.LocID, Register.Checkin, Register.CheckOut
from person
join Register on Person.PersonID = Register.PersonID
join testing on person.PersonID = testing.PersonID
where testing.Results is 'Positive' ) a
And the second part to be
join (
SELECT DISTINCT Person.PersonID, Register.LocID , Register.Checkin, Register.CheckOut
from person join Register on Person.PersonID = Register.PersonID
where person.PersonID
not in (SELECT DISTINCT testing.PersonID from testing)) b on a.LocID = b.LocID
and b.checkin >= a.CheckIn and b.CheckIn <= a.CheckOut
For readability you can create CTEs like this:
with
-- returns all the untested persons
untested as (select p.* from person p left join testing t on t.personid = p.personid where t.testingid is null),
-- returns all the infected persons
infected as (select * from testing where results = 'Positive'),
-- returns all the locids that infected persons visited and the start and dates of these visits
loc_positive as (
select r.locid, i.timestamp startdate, r.checkout enddate
from register r inner join infected i
on i.personid = r.personid and i.timestamp between r.checkin and r.checkout
)
-- returns the distinct untested persons that visited the same locids with persons tested positive at the same time after they were tested
select distinct u.*
from untested u
inner join register r on r.personid = u.personid
inner join loc_positive lp on lp.locid = r.locid
where lp.startdate <= r.checkout and lp.enddate >= r.checkin
This is a complicated query. Because you do not want duplicates, I am going to suggest exists with the outer query just using persons.
The idea to get people in the same place at the same time is a self-join on register using both location and time overlaps. I think that is the most complex part of the query. The rest is checking if a person is or is not positive:
select p.*
from person p
where not exists (select 1
from testing t
where t.personid = p.personId and t.results = 'positive'
) and
exists (select 1
from register r1 join
register r2
on r1.locid = r2.locid and
r1.checkin < r2.checkout and
r2.checkout > r1.checkin join
testing t2
on r2.personid = t2.personid and
t2.results = 'positive' and
t2.timestamp < r2.checkout
where r1.personid = p.personid
);
The timing is a little tricky, but I think the timing makes sense. Someone needs to test positive before they are in the same place. Of course, you can remove the t2.timestamp < r2.checkout if there is no timing constraint.
The solution to this answer was to add a distinct and a column name to the star in the first line.
select DISTINCT unt.PersonID from (
select person.PersonID, Register.LocID, Register.Checkin, Register.CheckOut
from person join Register on Person.PersonID = Register.PersonID join testing on person.PersonID = testing.PersonID
where testing.Results is 'Positive' ) pos
join (
SELECT Person.PersonID, Register.LocID , Register.Checkin, Register.CheckOut
from person join Register on Person.PersonID = Register.PersonID where person.PersonID
not in (SELECT testing.PersonID from testing)) unt on pos.LocID = unt.LocID
and unt.checkin >= pos.CheckIn and unt.CheckIn <= pos.CheckOut;

Access Subquery On mulitple conditions

This SQL query needs to be done in ACCESS.
I am trying to do a subquery on the total sales, but I want to link the sale to the province AND to product. The below query will work with one or the other: (po.product_name = allp.all_products) AND (p.province = allp.all_province); -- but it will no take both.
I will be including every month into this query, once I can figure out the subquery on with two criteria.
Select
p.province as [Province],
po.product_name as [Product],
all_price
FROM
(purchase_order po
INNER JOIN person p
on p.person_id = po.person_id)
left join
(
select
po1.product_name AS [all_products],
sum(pp1.price) AS [all_price],
p1.province AS [all_province]
from (purchase_order po1
INNER JOIN product pp1
on po1.product_name = pp1.product_name)
INNER JOIN person p1
on po1.person_id = p1.person_id
group by po1.product_name, pp1.price, p1.province
)
as allp
on (po.product_name = allp.all_products) AND (p.province = allp.all_province);
Make the first select sql into a table by giving it an alias and join table 1 to table 2. I don't have your table structure or data to test it but I think this will lead you down the right path:
select table1.*, table2.*
from
(Select
p.province as [Province],
po.product_name as [Product]
--removed this ,all_price
FROM
(purchase_order po
INNER JOIN person p
on p.person_id = po.person_id) table1
left join
(
select
po1.product_name AS [all_products],
sum(pp1.price) AS [all_price],
p1.province AS [all_province]
from (purchase_order po1
INNER JOIN product pp1
on po1.product_name = pp1.product_name)
INNER JOIN person p1
on po1.person_id = p1.person_id
group by po1.product_name, pp1.price, p1.province --check your group by, I dont think you want pp1.price here if you want to aggregate
) as table2 --changed from allp
on (table1.product = table2.all_products) AND (table1.province = table2.all_province);

sql subquery join group by

I am trying to get a list of our users from our database along with the number of people from the same cohort as them - which in this case is defined as being from the same medical school at the same time.
medical_school_id is stored in the doctor_record table
graduation_dt is stored in the doctor_record table as well.
I have managed to write this query out using a subquery which does a select statement counting the number of others for each row but this takes forever. My logic is telling me that I ought to run a simple GROUP BY query once first and then somehow JOIN the medical_school_id on to that.
The group by query is as follows
select count(ca.id) , cdr.medical_school_id, cdr.graduation_dt
from account ca
LEFT JOIN doctor cd on ca.id = cd.account_id
LEFT JOIN doctor_record cdr on cd.gmc_number = cdr.gmc_number
GROUP BY cdr.medical_school_id, cdr.graduation_dt
The long select query is
select a.id, a.email , dr.medical_school_id,
(select count(ba.id) from account ba
LEFT JOIN doctor bd on ba.id = bd.account_id
LEFT JOIN doctor_record bdr on bd.gmc_number = bdr.gmc_number
WHERE bdr.medical_school_id = dr.medical_school_id AND bdr.graduation_dt = dr.graduation_dt) AS med_count,
from account a
LEFT JOIN doctor d on a.id = d.account_id
LEFT JOIN doctor_record dr on d.gmc_number = dr.gmc_number
If you could push me in the right direction that would be amazing
I think you just want window functions:
select a.id, a.email, dr.medical_school_id, dr.graduation_dt,
count(*) over (partition by dr.medical_school_id, dr.graduation_dt) as cohort_size
from account a left join
doctor d
on a.id = d.account_id left join
doctor_record dr
on d.gmc_number = dr.gmc_number;
Using your same code for group by:
SELECT * FROM (
(
SELECT acc.[id]
, acc.[email]
FROM
account acc
LEFT JOIN
doctor doc
ON
acc.id = doc.account_id
LEFT JOIN
doctor_record doc_rec
ON
doc.gmc_number = doc_rec.gmc_number
) label
LEFT JOIN
(
SELECT count(acco.id)
, doc_reco.medical_school_id
, doc_reco.graduation_dt
FROM
account acco
LEFT JOIN
doctor doct
ON
acco.id = doct.account_id
LEFT JOIN
doctor_record doc_reco
ON
doct.gmc_number = doc_reco.gmc_number
GROUP BY
doc_reco.medical_school_id,
doc_reco.graduation_dt
) count
ON
count.[medical_school_id]=label.[medical_school_id]
AND
count.[graduation_dt]=label.[graduation_date]
)
how about something like this?
select a.doctor_id
, count(*) - 1
from doctor_record a
left join doctor_record b on a.medical_school_id = b.medical_school_id
and a.graduation_dt = b.graduation_dt
group by a.doctor_id
Subtract 1 from the count so that you're not counting the doctor in the "other folks in same cohort" number
I'm defining "same cohort" as "same medical school & graduation date".
I'm unclear on what GMC number is and how it is related. Is it something to do with cohort?

Return rows where a customer bought things on same day

Can someone help me with the rest of my Query.
This query gives me Customer, AdressNr, Date, Employee, Article, ActivityNr
from all the sales in my Company.
SELECT ad.Name + ' ' + ad.Vorname AS Customer,
pa.Kunde AS CustomerNr,
CONVERT(VARCHAR(10),p.datum,126) AS Date,
(SELECT a.name + ' ' + a.Vorname AS Name FROM PRO_Mitarbeiter m LEFT JOIN ADR_Adressen a ON a.AdressNrADR=m.AdressNrADR WHERE m.MitNrPRO = l.MitNrPRO) as Employee,
p.Artikel_1 AS Article,
l.AufgabenNrCRM AS OrderNr
FROM ZUS_Therapie_Positionen p
INNER JOIN CRM_AufgabenLink l ON l.AufgabenNrCRM = p.Id_Aktivitaet
INNER JOIN CRM_Aufgaben ab ON ab.AufgabenNrCRM = p.Id_Aktivitaet
INNER JOIN PRO_Auftraege pa ON pa.AuftragNrPRO = ab.AuftragNrPRO
INNER JOIN ADR_Adressen ad ON ad.AdressNrADR = pa.Kunde
INNER JOIN ADR_GruppenLink gl ON gl.AdressNrADR = ad.AdressNrADR
INNER JOIN ADR_Gruppen g ON g.GruppeADR = gl.GruppeADR
WHERE l.MitNrPRO != 0
GROUP BY l.AufgabenNrCRM,ad.Name,ad.Vorname,pa.Kunde,p.datum,p.Artikel_1,l.MitNrPRO
ORDER BY pa.Kunde,p.datum,l.AufgabenNrCRM
My goal is to filter this so i get only rows back where the customer has bought more then 1 Thing on the same day. It doesn't matter if a customer bought the same Article twice on the same day. I want too see this also.
It's to complicated to write some SQL Fiddle for you but in this Picture you can see what my goal is. I want to take away all rows with an X on the left side and thoose with a Circle i want to Keep.
As I don't speak German, I won't target this specifically to your SQL. But see the following quasi-code for a similar example that you should be able to apply to your own script.
SELECT C.CustomerName, O.OrderDate, O.OrderNumber
FROM CUSTOMER C
JOIN ORDERS O ON O.Customer_ID = C.Customer_ID
JOIN
(SELECT Customer_ID, OrderDate
FROM ORDERS
GROUP BY Customer_ID, OrderDate
HAVING COUNT(*) > 1) SRC
ON SRC.Customer_ID = O.Customer_ID AND SRC.OrderDate = O.OrderDate
In the script above, the last query (a subquery) would only return results where a customer had more than one order in a given day. By joining that to your main query, you would effectively produce the result asked in the OP.
Edit 1:
Regarding your comment below, I really recommend just going over your datamodel, trying to understand what's happening here, and fixing it on your own. But there is an easy - albeit hardly optimal solution to this by just using your own script above. Note, while this is not disastrous performance-wise, it's obviously not the cleanest, most effective method either. But it should work:
;WITH CTE AS (SELECT ad.Name + ' ' + ad.Vorname AS Customer,
pa.Kunde AS CustomerNr,
CONVERT(VARCHAR(10),p.datum,126) AS [Date],
(SELECT a.name + ' ' + a.Vorname AS Name FROM PRO_Mitarbeiter m LEFT JOIN ADR_Adressen a ON a.AdressNrADR=m.AdressNrADR WHERE m.MitNrPRO = l.MitNrPRO) as Employee,
p.Artikel_1 AS Article,
l.AufgabenNrCRM AS OrderNr
FROM ZUS_Therapie_Positionen p
INNER JOIN CRM_AufgabenLink l ON l.AufgabenNrCRM = p.Id_Aktivitaet
INNER JOIN CRM_Aufgaben ab ON ab.AufgabenNrCRM = p.Id_Aktivitaet
INNER JOIN PRO_Auftraege pa ON pa.AuftragNrPRO = ab.AuftragNrPRO
INNER JOIN ADR_Adressen ad ON ad.AdressNrADR = pa.Kunde
INNER JOIN ADR_GruppenLink gl ON gl.AdressNrADR = ad.AdressNrADR
INNER JOIN ADR_Gruppen g ON g.GruppeADR = gl.GruppeADR
WHERE l.MitNrPRO != 0
GROUP BY l.AufgabenNrCRM,ad.Name,ad.Vorname,pa.Kunde,p.datum,p.Artikel_1,l.MitNrPRO
ORDER BY pa.Kunde,p.datum,l.AufgabenNrCRM)
SELECT C.*
FROM CTE C
JOIN (Select CustomerNr, [Date]
FROM CTE B
GROUP BY CustomerNr, [Date]
HAVING COUNT(*) > 1) SRC
ON SRC.CustomerNr = C.CustomerNr AND SRC.[Date] = C.[Date]
This should work directly. But as I said, this is an ugly workaround where we're basically all but fetching the whole set twice, as opposed to just limiting the sub query to just the bare minimum of necessary tables. Your choice. :)
Tried that also and it didnt work. I also made a new query trying to Keep it so simple as possible and it doesnt work either. It still give me Single values back..
SELECT p.Datum,a.AufgabenNrCRM,auf.Kunde FROM CRM_Aufgaben a
LEFT JOIN ZUS_Therapie_Positionen p ON p.Id_Aktivitaet = a.AufgabenNrCRM
LEFT JOIN PRO_Auftraege auf ON auf.AuftragNrPRO = a.AuftragNrPRO
LEFT JOIN
(SELECT pa.Datum,au.Kunde FROM CRM_Aufgaben aa
LEFT JOIN ZUS_Therapie_Positionen pa ON pa.Id_Aktivitaet = aa.AufgabenNrCRM
LEFT JOIN PRO_Auftraege au ON au.AuftragNrPRO = aa.AuftragNrPRO
GROUP BY pa.Datum,au.Kunde
HAVING COUNT(*) > 1) SRC
ON SRC.Kunde = auf.Kunde
WHERE p.datum IS NOT NULL
GROUP BY p.Datum,a.AufgabenNrCRM,auf.Kunde
ORDER BY auf.Kunde,p.Datum

Inner join that ignore singlets

I have to do an self join on a table. I am trying to return a list of several columns to see how many of each type of drug test was performed on same day (MM/DD/YYYY) in which there were at least two tests done and at least one of which resulted in a result code of 'UN'.
I am joining other tables to get the information as below. The problem is I do not quite understand how to exclude someone who has a single result row in which they did have a 'UN' result on a day but did not have any other tests that day.
Query Results (Columns)
County, DrugTestID, ID, Name, CollectionDate, DrugTestType, Results, Count(DrugTestType)
I have several rows for ID 12345 which are correct. But ID 12346 is a single row of which is showing they had a row result of count (1). They had a result of 'UN' on this day but they did not have any other tests that day. I want to exclude this.
I tried the following query
select
c.desc as 'County',
dt.pid as 'PID',
dt.id as 'DrugTestID',
p.id as 'ID',
bio.FullName as 'Participant',
CONVERT(varchar, dt.CollectionDate, 101) as 'CollectionDate',
dtt.desc as 'Drug Test Type',
dt.result as Result,
COUNT(dt.dru_drug_test_type) as 'Count Of Test Type'
from
dbo.Test as dt with (nolock)
join dbo.History as h on dt.pid = h.id
join dbo.Participant as p on h.pid = p.id
join BioData as bio on bio.id = p.id
join County as c with (nolock) on p.CountyCode = c.code
join DrugTestType as dtt with (nolock) on dt.DrugTestType = dtt.code
inner join
(
select distinct
dt2.pid,
CONVERT(varchar, dt2.CollectionDate, 101) as 'CollectionDate'
from
dbo.DrugTest as dt2 with (nolock)
join dbo.History as h2 on dt2.pid = h2.id
join dbo.Participant as p2 on h2.pid = p2.id
where
dt2.result = 'UN'
and dt2.CollectionDate between '11-01-2011' and '10-31-2012'
and p2.DrugCourtType = 'AD'
) as derived
on dt.pid = derived.pid
and convert(varchar, dt.CollectionDate, 101) = convert(varchar, derived.CollectionDate, 101)
group by
c.desc, dt.pid, p.id, dt.id, bio.fullname, dt.CollectionDate, dtt.desc, dt.result
order by
c.desc ASC, Participant ASC, dt.CollectionDate ASC
This is a little complicated because the your query has a separate row for each test. You need to use window/analytic functions to get the information you want. These allow you to do calculate aggregation functions, but to put the values on each line.
The following query starts with your query. It then calculates the number of UN results on each date for each participant and the total number of tests. It applies the appropriate filter to get what you want:
with base as (<your query here>)
select b.*
from (select b.*,
sum(isUN) over (partition by Participant, CollectionDate) as NumUNs,
count(*) over (partition by Partitipant, CollectionDate) as NumTests
from (select b.*,
(case when result = 'UN' then 1 else 0 end) as IsUN
from base
) b
) b
where NumUNs <> 1 or NumTests <> 1
Without the with clause or window functions, you can create a particularly ugly query to do the same thing:
select b.*
from (<your query>) b join
(select Participant, CollectionDate, count(*) as NumTests,
sum(case when result = 'UN' then 1 else 0 end) as NumUNs
from (<your query>) b
group by Participant, CollectionDate
) bsum
on b.Participant = bsum.Participant and
b.CollectionDate = bsum.CollectionDate
where NumUNs <> 1 or NumTests <> 1
If I understand the problem, the basic pattern for this sort of query is simply to include negating or exclusionary conditions in your join. I.E., self-join where columnA matches, but columns B and C do not:
select
[columns]
from
table t1
join table t2 on (
t1.NonPkId = t2.NonPkId
and t1.PkId != t2.PkId
and t1.category != t2.category
)
Put the conditions in the WHERE clause if it benchmarks better:
select
[columns]
from
table t1
join table t2 on (
t1.NonPkId = t2.NonPkId
)
where
t1.PkId != t2.PkId
and t1.category != t2.category
And it's often easiest to start with the self-join, treating it as a "base table" on which to join all related information:
select
[columns]
from
(select
[columns]
from
table t1
join table t2 on (
t1.NonPkId = t2.NonPkId
)
where
t1.PkId != t2.PkId
and t1.category != t2.category
) bt
join [othertable] on (<whatever>)
join [othertable] on (<whatever>)
join [othertable] on (<whatever>)
This can allow you to focus on getting that self-join right, without interference from other tables.