High performance TSQL to retrieve data - sql

I have two tables with below structure
Person(ID, Name, ...)
Action(ID, FirstPersonId, SecondPersonId, Date)
I wanna retrieve this data for each person:
Number of action that a person be on second person from last action
that be on first person
Current query
Select Result.Id ,
(Select Count(*)
From Action
Where SecondPersonId = Result.Id
AND Date > Result.LastAction)
From
(Select ID ,
(
Select Top 1 Date
From Action
Where Action.FirstPersonId = Person.Id
) as LastAction
From Person ) As Result
this query has bad performance and i need very better one.

with lastActionPerson as -- last action for every first person
(select FirstPersonId , max([Date]) as LastActionDate
from Action
)
select a.SecondPersonId ,count(*)
from lastActionPerson lap
join Action a
on a.SecondPersonId = lap.FirstPersonId -- be on second person
and a.[Date] > lap.lastActionDate
-- you could continue to right join person table to show the person without actions
group by a.SecondPersonId

Related

Oracle SQL: GROUP BY HAVING multiple criteria

I have the following query:
SELECT PERSON_ID
FROM TABLE
WHERE YEAR > 2013
AND ACTION = 'TERM'
GROUP BY PERSON_ID
HAVING COUNT(ACTION) = 1
This query is returning PERSON_IDs with one TERM action among other actions.
How can I modify my HAVING clause to have the query return PERSON_IDs with one TERM action and no other actions? I tried moving the AND ACTION = 'TERM' below the HAVING line, but, as there is no GROUP BY operation in that line, I am getting an error.
Here is one method:
SELECT PERSON_ID
FROM TABLE
WHERE YEAR > 2013
GROUP BY PERSON_ID
HAVING COUNT(ACTION) = 1 AND MIN(ACTION) = 'TERM'

SQL: multiple counts from same table

I am having a real problem trying to get a query with the data I need. I have tried a few methods without success. I can get the data with 4 separate queries, just can't get hem into 1 query. All data comes from 1 table. I will list as much info as I can.
My data looks like this. I have a customerID and 3 columns that record who has worked on the record for that customer as well as the assigned acct manager
RecID_Customer___CreatedBy____LastUser____AcctMan
1-------1374----------Bob Jones--------Mary Willis------Bob Jones
2-------1375----------Mary Willis------Bob Jones--------Bob Jones
3-------1376----------Jay Scott--------Mary Willis-------Mary Willis
4-------1377----------Jay Scott--------Mary Willis------Jay Scott
5-------1378----------Bob Jones--------Jay Scott--------Jay Scott
I want the query to return the following data. See below for a description of how each is obtained.
Employee___Created__Modified__Mod Own__Created Own
Bob Jones--------2-----------1---------------1----------------1
Mary Willis------1-----------2---------------1----------------0
Jay Scott--------2-----------1---------------1----------------1
Created = Counts the number of records created by each Employee
Modified = Number of records where the Employee is listed as Last User
(except where they created the record)
Mod Own = Number of records for each where the LastUser = Acctman
(account manager)
Created Own = Number of Records created by the employee where they are
the account manager for that customer
I can get each of these from a query, just need to somehow combine them:
Select CreatedBy, COUNT(CreatedBy) as Created
FROM [dbo].[Cust_REc] GROUP By CreatedBy
Select LastUser, COUNT(LastUser) as Modified
FROM [dbo].[Cust_REc] Where LastUser != CreatedBy GROUP By LastUser
Select AcctMan, COUNT(AcctMan) as CreatePort
FROM [dbo].[Cust_REc] Where AcctMan = CreatedBy GROUP By AcctMan
Select AcctMan, COUNT(AcctMan) as ModPort
FROM [dbo].[Cust_REc] Where AcctMan = LastUser AND NOT AcctMan = CreatedBy GROUP By AcctMan
Can someone see a way to do this? I may have to join the table to itself, but my attempts have not given me the correct data.
The following will give you the results you're looking for.
select
e.employee,
create_count=(select count(*) from customers c where c.createdby=e.employee),
mod_count=(select count(*) from customers c where c.lastmodifiedby=e.employee),
create_own_count=(select count(*) from customers c where c.createdby=e.employee and c.acctman=e.employee),
mod_own_count=(select count(*) from customers c where c.lastmodifiedby=e.employee and c.acctman=e.employee)
from (
select employee=createdby from customers
union
select employee=lastmodifiedby from customers
union
select employee=acctman from customers
) e
Note: there are other approaches that are more efficient than this but potentially far more complex as well. Specifically, I would bet there is a master Employee table somewhere that would prevent you from having to do the inline view just to get the list of names.
this seems pretty straight forward. Try this:
select a.employee,b.created,c.modified ....
from (select distinct created_by from data) as a
inner join
(select created_by,count(*) as created from data group by created_by) as b
on a.employee = b.created_by)
inner join ....
This highly inefficient query may be a rough start to what you are looking for. Once you validate the data then there are things you can do to tidy it up and make it more efficient.
Also, I don't think you need the DISTINCT on the UNION part because the UNION will return DISTINCT values unless UNION ALL is specified.
SELECT
Employees.EmployeeID,
Created =(SELECT COUNT(*) FROM Cust_REc WHERE Cust_REc.CreatedBy=Employees.EmployeeID),
Mopdified =(SELECT COUNT(*) FROM Cust_REc WHERE Cust_REc.LastUser=Employees.EmployeeID AND Cust_REc.CreateBy<>Employees.EmployeeID),
ModOwn =
CASE WHEN NOT Empoyees.IsManager THEN NULL ELSE
(SELECT COUNT(*) FROM Cust_REc WHERE AcctMan=Employees.EmployeeID)
END,
CreatedOwn=(SELECT COUNT(*) FROM Cust_REc WHERE AcctMan=Employees.EmployeeID AND CReatedBy=Employees.EMployeeID)
FROM
(
SELECT
EmployeeID,
IsManager=CASE WHEN EXISTS(SELECT AcctMan FROM CustRec WHERE AcctMan=EmployeeID)
FROM
(
SELECT DISTINCT
EmployeeID
FROM
(
SELECT EmployeeID=CreatedBy FROM Cust_Rec
UNION
SELECT EmployeeID=LastUser FROM Cust_Rec
UNION
SELECT EmployeeID=AcctMan FROM Cust_Rec
)AS Z
)AS Y
)
AS Employees
I had the same issue with the Modified column. All the other columns worked okay. DCR example would work well with the join on an employees table if you have it.
SELECT CreatedBy AS [Employee],
COUNT(CreatedBy) AS [Created],
--Couldn't get modified to pull the right results
SUM(CASE WHEN LastUser = AcctMan THEN 1 ELSE 0 END) [Mod Own],
SUM(CASE WHEN CreatedBy = AcctMan THEN 1 ELSE 0 END) [Created Own]
FROM Cust_Rec
GROUP BY CreatedBy

Select all unique names based on most recent value in different field

I have an access database with a table called SicknessLog. The fields are ID, StaffName, [Start/Return], DateStamp.
When a member of staff is off work for sickness then a record is added to the table and the value in the [Start/Return] field is 1. When they return to work a new record is added with the same details except the [Start/Return] field is 0.
I am trying to write a query that will return all distinct staff names where the most recent record for that person has a value of 1 (ie, all staff who are still off sick)
Does anyone know if this is possible? Thanks in advance
Here's one way, all staff that has been sick where it does not exist an event after that where that staff is "nonsick":
select distinct x.staffname
from sicknesslog x
where Start/Return = 1
and not exists (
select 1
from sicknesslog y
where x.StaffName = y.StaffName
and y.DateStamp > x.DateStamp
and y.Start/Return = 0
)
You can use group by to achieve this.
select staffname ,max(datestamp) from sicknesslog where start/return = 1 group by staffname
it will return all latest recored for all staff. If ID column is autogenerated PK then you can use it in max function.
select staffname,MAX(datestamp)
from sicknesslog
where [start/return]=1
group by staffname
order by max(datestamp) desc,staffname
This will retrieve latest records who is sick and off to work
This should be close:
select s.StaffName, s.DateStamp, s.[Start/Return]
from SicknessLog s
left join (
select StaffName, max(DateStamp) as MaxDate
from SicknessLog
group by StaffName
) sm on s.StaffName = sm.StaffName and s.DateStamp = sm.MaxDate and s.[Start/Return] = 1

Optimizing a troublesome query

I'm generating a PDF, via php from 2 mysql tables, that contains a table. On larger tables the script is eating up a lot of memory and is starting to become a problem.
My first table contains "inspections." There are many rows per day. This has a many to one relationship with the user table.
Table "inspections"
id
area
inpsection_date
inpsection_agent_1
inpsection_agent_2
inpsection_agent_3
id (int)
area (varchar) - is one of 8 "areas" ie: Concrete, Soils, Earthwork
inspection_date (int) - unix timestamp
inspection_agent_1 (int) - a user id
inspection_agent_2 (int) - a user id
inspection_agent_3 (int) - a user id
Second table is the user's info. All I need is to join the name to the "inspection_agents_x"
id
name
The final table, that is going to be in the PDF, needs to organize the data by:
by day
by user, find every "area" that the user "inspected" on that day
Concrete
Soils
Earthwork
1/18/2011
Jon Doe
X
Jane Doe
X
X
And so on for each day. Right now I'm just doing a simple join on the names and then organizing everything on the code end. I know I'm leaving a lot on the table as far as the queries go, I just can't think of way to do it.
Thanks for any and all help.
Select U.name
, user_inspections.inspection_date
, Min( Case When user_inspections.area = 'Concrete' Then 'X' End ) As Concrete
, Min( Case When user_inspections.area = 'Soils' Then 'X' End ) As Soils
, Min( Case When user_inspections.area = 'Earthwork' Then 'X' End ) As Earthwork
From users As U
Join (
Select area, inspection_date, inspection_agent1 As user_id
From inspections
Union All
Select area, inspection_date, inspection_agent2 As user_id
From inspections
Union All
Select area, inspection_date, inspection_agent3 As user_id
From inspections
) As user_inspections
On user_inspections.user_id = U.id
Group By U.name, user_inspections.inspection_date
This is effectively a static crosstab. It means that you will need to know all areas that should be outputted in the query at design time.
One of the reasons this query is problematic is that your schema is not normalized. Your inspection table should look like:
Create Table inspections
(
id int...
, area varchar...
, inspection_date date ...
, inspection_agent int References Users ( Id )
)
That would avoid the inner Union All query to get the output you want.
I would go like this:
select i.*, u1.name, u2.name, u3.name from inspections i left join users u1 on (i.inspection_agent_id1 = u1.id) left join users u2 on (i.inspection_agent_id2 = u2.id) left join users u3 on (i.inspection_agent_id3 = u3.id) order by i.inspection_date asc;
Then select distinct areas names and remember them or fetch them from area table if you have any:
select distinct area from inspections;
Then it's just foreach:
$day = "";
foreach($inspection in $inspections)
{
if($day == "" || $inspection["inspection_date"] != $day)
{
//start new row with date here
}
//start standard row with user name
}
It isn't clear if you have to display all users each time ( even if some of them do not do inspection that thay), if you have to you should fetch users once and loop over $users and search for user in $inspection row.

SQL Server 2008: Using Multiple dts Ranges to Build a Set of Dates

I'm trying to build a query for a medical database that counts the number of patients that were on at least one medication from a class of medications (the medications listed below in the FAST_MEDS CTE) and had either:
1) A diagnosis of myopathy (the list of diagnoses in the FAST_DX CTE)
2) A CPK lab value above 1000 (the lab value in the FAST_LABS CTE)
and this diagnosis or lab happened AFTER a patient was on a statin.
The query I've included below does that under the assumption that once a patient is on a statin, they're on a statin forever. The first CTE collects the ids of patients that were on a statin along with the first date of their diagnosis, the second those with a diagnosis, and the third those with a high lab value. After this I count those that match the above criteria.
What I would like to do is drop the assumption that once a patient is on a statin, they're on it for life. The table edw_dm.patient_medications has a column called start_dts and end_dts. This table has one row for each prescription written, with start_dts and end_dts denoting the start and end date of the prescription. End_dts could be null, which I'll take to assume that the patient is currently on this medication (it could be a missing record, but I can't do anything about this). If a patient is on two different statins, the start and ends dates can overlap, and there may be multiple records of the same medication for a patient, as in a record showing 3-11-2000 to 4-5-2003 and another for the same patient showing 5-6-2007 to 7-8-2009.
I would like to use these two columns to build a query where I'm only counting the patients that had a lab value or diagnosis done during a time when they were already on a statin, or in the first n (say 3) months after they stopped taking a statin. I'm really not sure how to go about rewriting the first CTE to get this information and how to do the comparison after the CTEs are built. I know this is a vague question, but I'm really stumped. Any ideas?
As always, thank you in advance.
Here's the current query:
WITH FAST_MEDS AS
(
select distinct
statins.mrd_pt_id, min(year(statins.order_dts)) as statin_yr
from
edw_dm.patient_medications as statins
inner join mrd.medications as mrd
on statins.mrd_med_id = mrd.mrd_med_id
WHERE mrd.generic_nm in (
'Lovastatin (9664708500)',
'lovastatin-niacin',
'Lovastatin/Niacin',
'Lovastatin',
'Simvastatin (9678583966)',
'ezetimibe-simvastatin',
'niacin-simvastatin',
'ezetimibe/Simvastatin',
'Niacin/Simvastatin',
'Simvastatin',
'Aspirin Buffered-Pravastatin',
'aspirin-pravastatin',
'Aspirin/Pravastatin',
'Pravastatin',
'amlodipine-atorvastatin',
'Amlodipine/atorvastatin',
'atorvastatin',
'fluvastatin',
'rosuvastatin'
)
and YEAR(statins.order_dts) IS NOT NULL
and statins.mrd_pt_id IS NOT NULL
group by statins.mrd_pt_id
)
select *
into #meds
from FAST_MEDS
;
--return patients who had a diagnosis in the list and the year that
--diagnosis was given
with
FAST_DX AS
(
SELECT pd.mrd_pt_id, YEAR(pd.init_noted_dts) as init_yr
FROM edw_dm.patient_diagnoses as pd
inner join mrd.diagnoses as mrd
on pd.mrd_dx_id = mrd.mrd_dx_id
and mrd.icd9_cd in
('728.89','729.1','710.4','728.3','729.0','728.81','781.0','791.3')
)
select *
into #dx
from FAST_DX;
--return patients who had a high cpk value along with the year the cpk
--value was taken
with
FAST_LABS AS
(
SELECT
pl.mrd_pt_id, YEAR(pl.order_dts) as lab_yr
FROM
edw_dm.patient_labs as pl
inner join mrd.labs as mrd
on pl.mrd_lab_id = mrd.mrd_lab_id
and mrd.lab_nm = 'CK (CPK)'
WHERE
pl.lab_val between 1000 AND 999998
)
select *
into #labs
from FAST_LABS;
-- count the number of patients who had a lab value or a medication
-- value taken sometime AFTER their initial statin diagnosis
select
count(distinct p.mrd_pt_id) as ct
from
mrd.patient_demographics as p
join #meds as m
on p.mrd_pt_id = m.mrd_pt_id
AND
(
EXISTS (
SELECT 'A' FROM #labs l WHERE p.mrd_pt_id = l.mrd_pt_id
and l.lab_yr >= m.statin_yr
)
OR
EXISTS(
SELECT 'A' FROM #dx d WHERE p.mrd_pt_id = d.mrd_pt_id
AND d.init_yr >= m.statin_yr
)
)
You probably don't need to select all of your CTE defined queries into temp tables.
I think that the query you're after has the form:
WITH FAST_MEDS(PatientID, StartDate, EndDate) AS
(
--your query for patients on statins, projecting the patient ID and the start/end date for the medication
),
FAST_DX(PatientID, Date) AS
(
--your query for patients with certain diagnosis, projecting the patient ID and the date
),
FAST_LABS(PatientID, Date) AS
(
--your query for patients with certain labs, projecting the patient ID and the date
)
SELECT PatientID
FROM FAST_MEDS
WHERE PatientID IN (SELECT PatientID FROM FAST_DX WHERE Date BETWEEN StartDate AND EndDate OR EndDate IS NULL AND StartDate < Date)
OR PatientID IN (SELECT PatientID FROM FAST_LABS WHERE Date BETWEEN StartDate AND EndDate OR EndDate IS NULL AND StartDate < Date)