How to perform date calculations from different tables? - sql

Please forgive me if this is a basic question. I'm a beginner in SQL and need some help performing date calculations across 2 tables.
I have two tables, patient and chd, which look like this:
Patient:
ID|Age|date |Alive
--------------------------
1 50 01/09/2013 Y
2 52 11/05/2015 N
3 19 20/07/2016 N
CHD:
ID|Age|indexdate
--------------------
1 50 01/08/2012
2 52 11/11/2013
3 19 10/07/2015
The patient table contains about 500,000 records from 2010-2016 and the CHD table contains about 350,000 records from 2012-2013. What I want to do is see how many CHD patients died between 2012 and 2016 and, if they have died, whether 12 months have passed.
I'm not sure how to do this, but I know a join on the ID is needed, with a where condition that alive is not 'Y'.
The final output should look like this based on the sample above:
ID|Age|indexdate| deathdate
---------------------------
2 52 11/11/2013 11/05/2015
3 19 10/07/2015 20/07/2016
Any questions let me know!
EDIT: just to make it clear, patients can appear multiple times in the patient table until they die.
Thanks

Let me assume that this query gets the date of death from the patient table:
select p.id, min(p.date) as deathdate
from patient p
where p.Alive = 'N'
group by p.id;
Then, you can get what you want with a join:
select count(*)
from chd c join
(select p.id, min(p.date) as deathdate
from patient p
where p.Alive = 'N'
group by p.id
) pd
on c.id = pd.id;
You can then address your questions with a where clause in the outer query. For instance:
where deathdate >= current_date - interval '1 year'
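As a rough sketch of how the join above behaves on the sample rows from the question, here is the same idea run through Python's built-in sqlite3 module. The dates are rewritten as ISO-8601 text so that string comparison works, and the passed_12_months column, built with SQLite's date() modifier, is my own illustrative addition and an assumption about what "12 months passed" should mean (death at least 12 months after indexdate).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE patient (id INTEGER, age INTEGER, date TEXT, alive TEXT);
    CREATE TABLE chd (id INTEGER, age INTEGER, indexdate TEXT);
    -- Sample rows from the question, dates rewritten as ISO-8601 text
    INSERT INTO patient VALUES
        (1, 50, '2013-09-01', 'Y'),
        (2, 52, '2015-05-11', 'N'),
        (3, 19, '2016-07-20', 'N');
    INSERT INTO chd VALUES
        (1, 50, '2012-08-01'),
        (2, 52, '2013-11-11'),
        (3, 19, '2015-07-10');
""")

rows = conn.execute("""
    SELECT c.id, c.age, c.indexdate, pd.deathdate,
           -- 1 when the death is at least 12 months after the index date (assumed meaning)
           pd.deathdate >= date(c.indexdate, '+12 months') AS passed_12_months
    FROM chd c
    JOIN (SELECT p.id, MIN(p.date) AS deathdate
          FROM patient p
          WHERE p.alive = 'N'
          GROUP BY p.id) pd
      ON c.id = pd.id
    WHERE pd.deathdate BETWEEN '2012-01-01' AND '2016-12-31'
    ORDER BY c.id
""").fetchall()

for row in rows:
    print(row)
```

On the sample data this returns patients 2 and 3, both flagged as having died more than 12 months after their index date. In PostgreSQL the flag would be written with interval arithmetic instead of date().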

Related

Loop through rows and match values in SQL

I'd appreciate any help with my problem! I have an org chart of all employees with columns for their supervisors. For each employee, I am trying to find the first supervisor in the org structure who has 3+ years of experience. So if supervisor 1 has only 1 year, I need to move to the next column, supervisor 2, and check whether they have enough experience. At the end, I would like to return a column of supervisors' ids [experienced_supervisor column].
Table: org_chart
id | experience | supervisor_id_1| supervisor_id_2 | experienced_supervisor
A | 2 | X | C | X
C | 5 | V | D | D
V | 1 | M | X | M
X | 3
D | 8
M | 11
I am new to SQL and not even sure if this is the best approach. But here is my thinking: I will use CASE to look through every row (employee) and compare their supervisors' experience.
SELECT CASE
WHEN experience >=3 THEN supervisor_id_1
ELSE
CASE WHEN experience >=3 THEN supervisor_id_2
ELSE 'not found'
END AS experienced_supervisor
FROM org_chart
Questions:
Is this the best way to tackle the problem?
Can I look up the value [experience years] of supervisors by matching supervisor_id_1, supervisor_id_2 to id? Or do I need to create a new column supervisor_id_1_experience and fill the years of experience by doing the join?
I am using Redshift.
You only need one CASE expression, but a lot of joins or subqueries. Perhaps:
SELECT (CASE WHEN (SELECT oc2.experience FROM org_chart oc2 WHERE oc2.id = oc.supervisor_id_1) >= 3
THEN oc.supervisor_id_1
WHEN (SELECT oc2.experience FROM org_chart oc2 WHERE oc2.id = oc.supervisor_id_2) >= 3
THEN oc.supervisor_id_2
. . .
END) AS experienced_supervisor
FROM org_chart oc
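For comparison, here is a small sketch of the join variant of the same idea, run with Python's sqlite3 module against the sample org_chart rows from the question. The aliases s1 and s2 and the 'not found' fallback are illustrative choices, not something Redshift requires, and only two supervisor columns are covered.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE org_chart (
        id TEXT PRIMARY KEY,
        experience INTEGER,
        supervisor_id_1 TEXT,
        supervisor_id_2 TEXT
    );
    -- Sample rows from the question; top-level people have no supervisors
    INSERT INTO org_chart VALUES
        ('A', 2, 'X', 'C'),
        ('C', 5, 'V', 'D'),
        ('V', 1, 'M', 'X'),
        ('X', 3, NULL, NULL),
        ('D', 8, NULL, NULL),
        ('M', 11, NULL, NULL);
""")

rows = conn.execute("""
    SELECT oc.id,
           CASE
               -- WHEN branches are tested in order, so supervisor 1 wins if qualified
               WHEN s1.experience >= 3 THEN oc.supervisor_id_1
               WHEN s2.experience >= 3 THEN oc.supervisor_id_2
               ELSE 'not found'
           END AS experienced_supervisor
    FROM org_chart oc
    LEFT JOIN org_chart s1 ON s1.id = oc.supervisor_id_1
    LEFT JOIN org_chart s2 ON s2.id = oc.supervisor_id_2
    ORDER BY oc.id
""").fetchall()

for row in rows:
    print(row)
```

Each extra supervisor column needs one more LEFT JOIN and one more WHEN branch. The LEFT JOINs keep employees who have no supervisor at all, and those fall through to 'not found'.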
After lots of trial and error, here is the result that worked for my problem. I am using Redshift in this case.
-- Use common table expression to find job level for each supervisor from reporting levels 8 to 2
WITH cte1 AS
(
SELECT B.id as employee
,B.experience as employee_experience
,B.supervisor_id_1 as manager_1
,A.experience as supervisor_1_experience
FROM org_chart A
INNER JOIN org_chart B ON B.supervisor_id_1 = A.id
),
cte2 AS
(
SELECT B.id as employee2
,B.experience as employee_experience
,B.supervisor_id_2 as manager_2
,A.experience as supervisor_2_experience
FROM org_chart A
INNER JOIN org_chart B ON B.supervisor_id_2 = A.id
),
........-- Write as many statements as I have columns with reporting levels
-- Join all tables above
cte3 AS
(
SELECT employee
,cte1.employee_experience
,manager_1
,supervisor_1_experience
,manager_2
,supervisor_2_experience
FROM cte1
JOIN cte2 ON cte2.employee2 = cte1.employee
....... -- Write as many statements as I have columns with reporting levels
)
-- Run through every row and evaluate whether each supervisor has at least 3 years of experience
SELECT *
,CASE
WHEN cte3.supervisor_1_experience >= 3 THEN cte3.manager_1
WHEN cte3.supervisor_1_experience < 3
AND cte3.supervisor_2_experience >=3
THEN cte3.manager_2
........ -- Write as many statements as I have columns with reporting levels
END experienced_supervisor
FROM cte3

Fill missing dates in PostgreSQL with zero

I've a query like this in PostgreSQL:
select count(id_student) students, date_beginning_course from
data.sessions_courses
left join my_schema.students on id_session_course=id_sesion
where course_name='First course'
group by date_beginning_course
What I obtain with this query is the number of students that have attended a session of "First course" on several dates, for example:
Students Date_beginning_course
____________________________________
5 2019-06-26
1 2019-06-28
5 2019-06-30
6 2019-07-01
2 2019-07-02
I'd like to fill this table with the missing date values, and, for each missing value, assign a '0' in Students column, because there are no students for this date. Example:
Students Date_beginning_course
____________________________________
5 2019-06-26
0 2019-06-27 <--new row
1 2019-06-28
0 2019-06-29 <--new row
5 2019-06-30
6 2019-07-01
2 2019-07-02
Could you help me? Thanks! :)
You could generate a list of dates with the handy Postgres set-returning function generate_series() and LEFT JOIN it with the sessions_courses and students table:
SELECT
COUNT(s.id_student) students,
d.dt
FROM
(
SELECT dt::date
FROM generate_series('2019-06-26', '2019-07-02', '1 day'::interval) dt
) d
LEFT JOIN data.sessions_courses c
ON c.date_beginning_course = d.dt
AND c.course_name='First course'
LEFT JOIN my_schema.students s
ON s.id_session_course = c.id_session
GROUP BY d.dt
You can change the date range by modifying the first two parameters of generate_series().
NB: it is generally good practice to qualify the column names in the query with the relevant table names (or table aliases), so that it is explicit which table each column belongs to. I changed your query accordingly and had to make a few assumptions, which you might need to adapt.
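To try the calendar-plus-LEFT-JOIN pattern without a Postgres server, here is a hedged sketch using Python's sqlite3 module. SQLite does not ship generate_series() by default, so a recursive CTE stands in for it here; that is a swapped-in technique, not what the query above uses. The schema prefixes are dropped and the handful of rows is invented purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical miniature versions of the two tables
    CREATE TABLE sessions_courses (
        id_session INTEGER, course_name TEXT, date_beginning_course TEXT);
    CREATE TABLE students (id_student INTEGER, id_session_course INTEGER);
    INSERT INTO sessions_courses VALUES
        (1, 'First course', '2019-06-26'),
        (2, 'First course', '2019-06-28');
    INSERT INTO students VALUES (10, 1), (11, 1), (12, 2);
""")

rows = conn.execute("""
    WITH RECURSIVE d(dt) AS (
        -- Recursive CTE standing in for generate_series(): one row per day
        SELECT '2019-06-26'
        UNION ALL
        SELECT date(dt, '+1 day') FROM d WHERE dt < '2019-06-29'
    )
    SELECT COUNT(s.id_student) AS students, d.dt
    FROM d
    LEFT JOIN sessions_courses c
           ON c.date_beginning_course = d.dt
          AND c.course_name = 'First course'
    LEFT JOIN students s
           ON s.id_session_course = c.id_session
    GROUP BY d.dt
    ORDER BY d.dt
""").fetchall()

for row in rows:
    print(row)
```

The days without a session (27 and 29 June here) still appear, with a count of 0, because COUNT(s.id_student) ignores the NULLs produced by the LEFT JOIN.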

SQL date calculations

I need some help doing SQL date calculations:
In one table I have patients who are older than 18 and died from a certain disease (table a). In another table I have patients with the same disease and the earliest date they were diagnosed with it (table b).
What I need to know is whether 12 months passed between when they were diagnosed and when they died.
Can someone assist me in performing this date calculation?
The date column in table b is indexdate (when they were diagnosed) and the date column in table a is deathdate (when they died).
I appreciate any help.
Table A:
patientid--age--deathdate
1 20 11/05/2016
2 19 10/09/2015
Table B:
PatientID--indexdate
1 01/02/2015
2 08/03/2014
So essentially all I want to check is whether 12 months passed between indexdate and deathdate.
This lists the patients for whom at least 12 months passed between diagnosis and death:
SELECT A.patientID, A.deathdate, B.EarliestDate
FROM tableA A
INNER JOIN
(
SELECT patientID, MIN(indexdate) AS EarliestDate
FROM tableB
GROUP BY patientID
) AS B
ON A.patientID = B.patientID
-- compare whole dates; date_part('month', age(...)) only yields the 0-11 month component
WHERE A.deathdate >= B.EarliestDate + interval '12 months'
You should be able to do that by writing a query that joins the two tables on the patient ID and then uses the DATEADD function (available in SQL Server and Redshift, among others) in the WHERE clause, for example:
WHERE TableA.deathdate > (DATEADD(month, 12, TableB.indexdate))

Only joining rows where the date is less than the max date in another field

Let's say I have two tables. One table containing employee information and the days that employee was given a promotion:
Emp_ID Promo_Date
1 07/01/2012
1 07/01/2013
2 07/19/2012
2 07/19/2013
3 08/21/2012
3 08/21/2013
And another table with every day employees closed a sale:
Emp_ID Sale_Date
1 06/12/2013
1 06/30/2013
1 07/15/2013
2 06/15/2013
2 06/17/2013
2 08/01/2013
3 07/31/2013
3 09/01/2013
I want to join the two tables so that I only include sales dates that are less than the maximum promotion date. So the result would look something like this
Emp_ID Sale_Date Promo_Date
1 06/12/2013 07/01/2012
1 06/30/2013 07/01/2012
1 06/12/2013 07/01/2013
1 06/30/2013 07/01/2013
And so on for the rest of the Emp_IDs. I tried doing this using a left join, something to the effect of
left join SalesTable on PromoTable.EmpID = SalesTable.EmpID and Sale_Date
< max(Promo_Date) over (partition by Emp_ID)
But apparently I can't use aggregates in joins, and I already know that I can't use them in the where statement either. I don't know how else to proceed with this.
The maximum promotion date is:
select emp_id, max(promo_date)
from promotions
group by emp_id;
There are various ways to get the sales before that date, but here is one way:
select s.*
from sales s
where s.sales_date < (select max(promo_date)
from promotions p
where p.emp_id = s.emp_id
);
Gordon's answer is right on! Alternatively, you could also do an inner join to a subquery to achieve your desired output, like this:
SELECT s.emp_id
,s.sales_date
,t.promo_date
FROM sales s
INNER JOIN (
SELECT emp_id
,max(promo_date) AS promo_date
FROM promotions
GROUP BY emp_id
) t ON s.emp_id = t.emp_id
AND s.sales_date < t.promo_date;
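Here is a quick, self-contained way to check the subquery join against the sample data from the question, using Python's sqlite3 module. The table names promotions and sales follow the answers above; the dates are stored as ISO-8601 text, which is an assumption made so that plain string comparison orders them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE promotions (emp_id INTEGER, promo_date TEXT);
    CREATE TABLE sales (emp_id INTEGER, sales_date TEXT);
    -- Sample rows from the question, dates rewritten as ISO-8601 text
    INSERT INTO promotions VALUES
        (1, '2012-07-01'), (1, '2013-07-01'),
        (2, '2012-07-19'), (2, '2013-07-19'),
        (3, '2012-08-21'), (3, '2013-08-21');
    INSERT INTO sales VALUES
        (1, '2013-06-12'), (1, '2013-06-30'), (1, '2013-07-15'),
        (2, '2013-06-15'), (2, '2013-06-17'), (2, '2013-08-01'),
        (3, '2013-07-31'), (3, '2013-09-01');
""")

rows = conn.execute("""
    SELECT s.emp_id, s.sales_date, t.promo_date
    FROM sales s
    INNER JOIN (SELECT emp_id, MAX(promo_date) AS promo_date
                FROM promotions
                GROUP BY emp_id) t
            ON s.emp_id = t.emp_id
           AND s.sales_date < t.promo_date  -- keep only sales before the latest promotion
    ORDER BY s.emp_id, s.sales_date
""").fetchall()

for row in rows:
    print(row)
```

Note that this returns one row per qualifying sale, paired with the latest promotion date only. If every promotion date should be repeated next to each sale, as in the desired output in the question, the result would need one more join back to promotions.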

Identify open cases for each week during a year

I am trying to produce a report which identifies client cases which were open during each week of a year. Currently I have the following SQL which returns all clients with an indicator on whether their case was open during week 1 of our calendar. Two conditions identify whether a client's case is open: their MOV_START_DATE and ESU_START_DATE should be earlier than the end date of the period, and their MOV_END_DATE and ESU_END_DATE should be either null or later than the start date of the period.
The code below works. I thought I could just copy the left join WK1 and rename it WK2 to return information for week 2, but I'm getting an error relating to ambiguously named columns. Additionally, I'm guessing that having 52 left joins (one for each week) on a report isn't particularly advisable, so I'm wondering if there is a better way of achieving this?
SELECT
A.ESU_PER_GRO_ID,
A.ESU_ID,
A.STATUS,
B.MOV_ID,
B.MOV_START_DATE,
B.MOV_END_DATE,
A.ESU_START_DATE,
A.ESU_END_DATE,
LS.CLS_DESC,
nvl2(wk1.PRD_PERIOD_NUM,'Y','N') as "Week1"
FROM
A
LEFT JOIN B ON B.MOV_PER_GRO_ID = A.ESU_PER_GRO_ID
LEFT JOIN LS ON LS.CLS_CODE = A.STATUS
LEFT JOIN O_PERIODS WK1 ON B.MOV_START_DATE < WK1.PRD_END_DATE
AND (B.MOV_END_DATE IS NULL OR B.MOV_END_DATE > WK1.PRD_START_DATE)
AND A.ESU_START_DATE < WK1.PRD_END_DATE
AND (A.ESU_END_DATE IS NULL OR A.ESU_END_DATE > WK1.PRD_START_DATE)
AND PRD_CAL_ID = 'E1190' AND WK1.PRD_PERIOD_NUM = 1 AND WK1.PRD_YEAR = 2012
WHERE
B.MOV_START_DATE Is Not Null
AND A.STATUS <> ('X')
Hopefully I have provided enough information, but if not, I am happy to answer questions. Thanks!
Sample Data (Produced by above query)
P ID ESU_ID STATUS MOV_ID M_START M_END DESC Week1
1 ESU1 New 1M 01/01/2012 Boo Y
2 ESU2 New 2M 01/03/2012 Boo N
Desired output (Week1 - Week 52)
P ID ESU_ID STATUS MOV_ID M_START M_END DESC Week1 Week2
1 ESU1 New 1M 01/01/2012 Boo Y Y
2 ESU2 New 2M 01/03/2012 Boo N N
I suspect that the reason creating a WK2 join like WK1 didn't work was that the column PRD_CAL_ID didn't have a table alias on it. However, as you guessed, 52 joins is probably not going to perform very well. Try the following:
SELECT A.ESU_PER_GRO_ID,
A.ESU_ID,
A.STATUS,
B.MOV_ID,
B.MOV_START_DATE,
B.MOV_END_DATE,
A.ESU_START_DATE,
A.ESU_END_DATE,
LS.CLS_DESC,
'Week' || TRIM(TO_CHAR(pd.PRD_PERIOD_NUM)) WEEK_DESC
FROM A
LEFT JOIN B
ON B.MOV_PER_GRO_ID = A.ESU_PER_GRO_ID
LEFT JOIN LS
ON LS.CLS_CODE = A.STATUS
LEFT JOIN O_PERIODS pd
ON B.MOV_START_DATE < pd.PRD_END_DATE AND
(B.MOV_END_DATE IS NULL OR
B.MOV_END_DATE > pd.PRD_START_DATE) AND
A.ESU_START_DATE < pd.PRD_END_DATE AND
(A.ESU_END_DATE IS NULL OR
A.ESU_END_DATE > pd.PRD_START_DATE)
WHERE B.MOV_START_DATE Is Not Null AND
A.STATUS <> ('X') AND
pd.PRD_CAL_ID = 'E1190' AND
pd.PRD_YEAR = 2012
ORDER BY WEEK_DESC
This produces slightly different results than your original query, having a WEEK_DESC instead of trying to create 52 different columns, one for each week, but I think it will perform better.
Share and enjoy.
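To illustrate the one-row-per-open-week shape that the query above produces, here is a minimal sketch using Python's sqlite3 module. The cases and periods tables, their columns, and the rows in them are hypothetical simplifications of A, B and O_PERIODS, invented only to show the overlap condition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified stand-ins for the real tables
    CREATE TABLE cases (case_id TEXT, start_date TEXT, end_date TEXT);
    CREATE TABLE periods (period_num INTEGER, start_date TEXT, end_date TEXT);
    INSERT INTO cases VALUES
        ('1M', '2012-01-01', NULL),          -- still open
        ('2M', '2012-01-10', '2012-01-12');  -- open only during week 2
    INSERT INTO periods VALUES
        (1, '2012-01-02', '2012-01-08'),
        (2, '2012-01-09', '2012-01-15');
""")

rows = conn.execute("""
    SELECT c.case_id, 'Week' || p.period_num AS week_desc
    FROM cases c
    JOIN periods p
      -- a case overlaps a period if it starts before the period ends
      -- and has not ended before the period starts
      ON c.start_date < p.end_date
     AND (c.end_date IS NULL OR c.end_date > p.start_date)
    ORDER BY c.case_id, p.period_num
""").fetchall()

for row in rows:
    print(row)
```

A single join against the periods table replaces the 52 separate joins. Turning these rows back into Week1 to Week52 columns would be a separate pivoting step, for example in the reporting tool.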