I'm quite new to SQL and working on a query that has been thoroughly defeating me for a while now. I come to this site often - it's a terrific resource thanks to all of your expertise, and generally I find what I need, but this time I think my query is a bit too specific and I've not found something applicable. Could someone give me a hand, please?
I have two tables: one Client table and one Contact (aka appointment) table. What I need to find are all of the clients' most recent appointment days (before a certain date, in this case '11/08/2015') where the Outcome for any appointment on that day is NOT a '2'. Each client may have more than one appointment on a single day, and an Outcome of '2' for any of those appointments means that we have to ignore the whole day and move back to the next most recent day..
For example, Client '3' should have a returned appointment date of '01/07/2015' (and just one row) and not the two rows for '16/07/2015', because one of the appointments on '16/07/2015' had an Outcome of '2'. All other values for Outcome are acceptable (including NULL), just not '2'.
The multiple appointment on the same day bit is the part that I'm finding tricky - I can find the latest appointment day using a Select MAX (or TOP 1) statement, but when I add on a "<> '2'" it still continues to return the same days that may have an Outcome '2' because other appointments on that same day have another Outcome. I've been trying to play around with my tables and GROUP BY and NOT EXIST, but I don't seem to be making any headway.
Contact
ClientID AppDate Outcome
1 30/07/2015 17:00 2
1 01/07/2015 17:00 3
2 03/03/2015 16:00 NULL
2 01/03/2015 16:00 NULL
3 16/07/2015 15:40 6
3 16/07/2015 15:40 2
3 01/07/2015 15:40 3
4 05/08/2015 12:30 6
4 05/08/2015 12:30 2
4 01/08/2015 12:30 3
5 23/07/2015 15:30 2
5 23/07/2015 15:30 NULL
5 01/07/2015 15:30 4
6 20/07/2015 10:10 NULL
6 20/07/2015 10:10 2
6 01/07/2015 10:10 6
7 23/07/2015 15:40 2
7 01/07/2015 15:40 1
7 23/06/2015 15:40 8
8 13/07/2015 11:30 2
8 13/07/2015 11:30 6
8 01/07/2015 11:30 2
8 01/06/2015 11:30 3
9 29/07/2015 17:00 3
9 29/07/2015 17:00 6
10 14/07/2015 11:00 NULL
10 01/07/2015 11:00 5
Client
ClientID Forename Surname
1 I B
2 J B
3 S C
4 S T
5 P C
6 K D
7 P E
8 P H
9 S F
10 A G
Apologies if I'm missing something glaringly obvious! Thanks for reading and for any responses. I attach my truncated query for your general amusement...
SELECT
cli.ClientID ,
cli.Forename ,
cli.Surname ,
con.AppDate ,
con.Outcome
FROM
Client AS cli
INNER JOIN
Contact AS con
ON cli.ClientID = con.ClientID
AND con.AppDate =
(SELECT MAX(con1.AppDate)
FROM Contact AS con1
WHERE con.ClientID = con1.ClientID
AND con1.AppDate < '11/08/2015 00:00:00'
AND con1.Outcome <> '2')
ORDER BY
cli.ClientID
EDIT:
Thank you to Mr Linoff for the Cross Apply query, it worked perfectly.
Sorry that I didn't include the expected output earlier. For reference (for anyone else working with a similar problem in future) I was looking to obtain:
Appointments
Client ID Act Date and Time Outcome
1 01/07/2015 17:00 3
2 03/03/2015 16:00 NULL
3 01/07/2015 15:40 3
4 01/08/2015 12:30 3
5 01/07/2015 15:30 4
6 01/07/2015 10:10 6
7 01/07/2015 15:40 1
8 01/06/2015 11:30 3
9 29/07/2015 17:00 3
9 29/07/2015 17:00 6
10 14/07/2015 11:00 NULL
I think cross apply is the best approach to this:
select c.*, con.*
from client c cross apply
(select top 1 con.*
from (select con.*,
sum(case when Outcome = 2 then 1 else 0 end) over (partition by ClientId, AppDate) as num2s
from contact con
where con.ClientId = c.ClientId and
con.AppDate < '2015-11-08'
) con
where num2s = 0
order by AppDate desc
) con;
In this case, cross apply works a lot like a correlated subquery, but you can return multiple values. The subquery uses window functions to count the number of "2" on a given day and the rest of the logic should be pretty obvious.
This returns one row from the most recent date with appropriate appointments. If you want multiple such rows, use with ties.
What you need to do is to have a condition which picks out all the Dates where Outcome is two and filter them out.
Something like this:
WITH ClientCTE AS
(
SELECT MAX(con1.AppDate) AS AppDate ,ClientID
FROM Contact AS con1
WHERE con.ClientID = con1.ClientID
AND con1.AppDate < '11/08/2015 00:00:00'
AND con1.Outcome <> '2'
AND con1.AppDate NOT IN (SELECT AppDate FROM Contact WHERE Outcome = '2'))
SELECT
* FROM Client C
INNER JOIN
Contact AS con
ON cli.ClientID = con.ClientID
INNER JOIN ClientCTE CTE
ON cli.ClientID = CTE.ClientCTE
AND CTE.AppDate = con.AppDate
Let me know if it works
Please try the below query.
It uses the MAX(case...) OVER( partition BY CAST(AppDate as DATE) windowed function to flag all appointments which have a 2 outcome on same day along with WHERE clause to filter out dates.
This flag is then used in outer WHERE clause to remove unwanted data and joined on contacts and client tables
SELECT
cli.ClientID ,
cli.Forename ,
cli.Surname ,
con.AppDate ,
con.Outcome
from
client cli right join contact con
on con.ClientID=cli.ClientID
inner join
(
select ClientID,MAX(AppDate)as lastDate from
( select *,
MAX(CASE when ISNULL(Outcome,1) =2 then 1 else 0 end) OVER(PARTITION BY CAST(AppDate as DATE),ClientID ORDER BY AppDate) as flag
from Contact
where AppDate< '2015-11-08'
) p
where flag =0 group by ClientID
) s
on s.ClientId=con.ClientID and s.lastDate=con.AppDate
Also here's the link to sql fiddle for demo:
http://sqlfiddle.com/#!6/c519d/2
Related
I have a table in SQL Server about how people going in and out of building.
user_id
datetime
direction
1
27.09.2022 10:30
in
1
27.09.2022 12:30
out
1
27.09.2022 14:30
in
1
27.09.2022 15:35
out
2
27.09.2022 11:30
in
2
27.09.2022 13:20
out
2
27.09.2022 15:00
in
2
27.09.2022 15:40
out
3
27.09.2022 11:45
in
3
27.09.2022 11:46
in
3
27.09.2022 15:40
out
3
27.09.2022 15:47
in
3
27.09.2022 18:00
out
I need to calculate how much time each user spent inside the building by days.
For example, on 27th Sep user #1 spent 3 hours 5 minutes. User #2 spent 2 hours 30 minutes.
There is also a bug that may spoil the results - sometimes I may have two 'in' or two 'out' in a row, like in case of user #3. I understand the nature of such bug, and know I only have to keep last of two same rows (in fact user #3 entered in 11:46, not 11:45). Does anyone have an idea how to solve that?
select user_id
,sum(time_spent) as time_spent_minutes
from (
select *
,datediff(minute, lag(case when direction = 'in' then datetime end) over(partition by user_id order by datetime), datetime) as time_spent
from t
) t
group by user_id
user_id
time_spent_minutes
1
185
2
150
Fiddle
The window functions would be a nice fit here.
Example or Updated dbFiddle
Select user_id
,Duration = convert(time(0),dateadd(second,sum(Secs),0))
From (
Select user_id
,Secs = datediff(second,case when direction ='in'
and lead([direction],1) over (partition by user_id order by datetime)='out'
then [datetime]
end
,lead([datetime],1) over (partition by user_id order by datetime))
From YourTable
) A
Group By user_id
Results
user_id Duration
1 03:05:00 -- << Check your desired results
2 02:30:00
3 06:07:00
Suppose I have patient admission/claim wise data like the sample below. Data type of patient_id and hosp_id columns is VARCHAR
Table name claims
rec_no
patient_id
hosp_id
admn_date
discharge_date
1
1
1
01-01-2020
10-01-2020
2
2
1
31-12-2019
11-01-2020
3
1
1
11-01-2020
15-01-2020
4
3
1
04-01-2020
10-01-2020
5
1
2
16-01-2020
17-01-2020
6
4
2
01-01-2020
10-01-2020
7
5
2
02-01-2020
11-01-2020
8
6
2
03-01-2020
12-01-2020
9
7
2
04-01-2020
13-01-2020
10
2
1
31-12-2019
10-01-2020
I have another table wherein bed strength/max occupancy strength of hospitals are stored.
table name beds
hosp_id
bed_strength
1
3
2
4
Expected Results I want to find out hospital-wise dates where its declared bed-strength has exceeded on any day.
Code I have tried Nothing as I am new to SQL. However, I can solve this in R with the following strategy
pivot_longer the dates
tidyr::complete() missing dates in between
summarise or aggregate results for each date.
Simultaneously, I also want to know that whether it can be done without pivoting (if any) in sql because in the claims table there are 15 million + rows and pivoting really really slows down the process. Please help.
You can use generate_series() to do something very similar in Postgres. For the occupancy by date:
select c.hosp_id, gs.date, count(*) as occupanyc
from claims c cross join lateral
generate_series(admn_date, discharge_date, interval '1 day') gs(date)
group by c.hosp_id, gs.date;
Then use this as a subquery to get the dates that exceed the threshold:
select hd.*, b.strength
from (select c.hosp_id, gs.date, count(*) as occupancy
from claims c cross join lateral
generate_series(c.admn_date, c.discharge_date, interval '1 day') gs(date)
group by c.hosp_id, gs.date
) hd join
beds b
using (hosp_id)
where h.occupancy > b.strength
Need your assistance on finding the missing dates from records, sample below
Currently, i've data for 1, 2, 6 and 10 Jan 2020
select p.effective_date,x.xref_security_id,x.xref_type
from securitydbo.price p
inner join securitydbo.xreference x on x.security_alias = p.security_alias
where p.src_intfc_inst = 253
and p.effective_date between ('01-JAN-2020') and ('10-JAN-2020')
and x.xref_security_id = 'ABC999999999'
Expected Results
Missing_Date Xref_Security_ID Xref_Type Price
1/3/2020 ABC99999999 ISIN 0
1/7/2020 ABC99999999 ISIN 0
1/8/2020 ABC99999999 ISIN 0
1/9/2020 ABC99999999 ISIN 0
I don't have your tables so I created one which looks like result you currently have:
SQL> select * From test order by missing_date;
MISSING_DA XREF_S
---------- ------
01/03/2020 ABC999
01/07/2020 ABC999
01/08/2020 ABC999
01/09/2020 ABC999
In order to get dates that are missing, create a calendar (see the CTE I used, which is just one of row generator techniques) whose
starting date is lower date from your period
add level to it
connect by clause "loops" as many times as there are days in desired period
XREF_SECURITY_ID is NULL for missing dates as there's no match for them in your tables.
SQL> with
2 -- create a calendar for desired period (see CONNECT BY)
3 calendar as
4 (select date '2020-01-01' + level - 1 datum
5 from dual
6 connect by level <= date '2020-01-10' - date '2020-01-01' + 1
7 )
8 -- outer join calendar with your table(s)
9 select c.datum, t.xref_security_id
10 from calendar c left join test t on t.missing_date = c.datum
11 order by c.datum;
DATUM XREF_S
---------- ------
01/01/2020
01/02/2020
01/03/2020 ABC999
01/04/2020
01/05/2020
01/06/2020
01/07/2020 ABC999
01/08/2020 ABC999
01/09/2020 ABC999
01/10/2020
10 rows selected.
SQL>
I can take a guess that the date_format might be the problem out here. Without actually knowing what is the data in your tables the only way to do is to guess.
select p.effective_date,x.xref_security_id,x.xref_type
from securitydbo.price p
inner join securitydbo.xreference x on x.security_alias = p.security_alias
where p.src_intfc_inst = 253
and p.effective_date between to_date('01-JAN-2020','DD-MON-YYY')
and to_date('10-JAN-2020','DD-MON-YYYY')
and x.xref_security_id = 'ABC999999999'
I'm facing a challenging task here, spent a day on it and I was only able to solve it through a procedure but it is taking too long to run for all projects.
I would like to solve it in a single query if possible (no functions or procedures).
There is already some questions here doing it in programming languages OR sql functions/procedures (Wich I also solved min). So I'm asking if it is possible to solve it with just SQL
The background info is:
A project table
A phase table
A holiday table
A dayexception table which cancel a holiday or a weekend day (make that date as a working day) and it is associated with a project
A project may have 0-N phases
A phase have a start date, a duration and a draworder (needed by the system)
Working days is all days that is not weekend days and not a holiday (exception is if that date is in dayexception table)
Consider this following scenario:
project | phase(s) | Dayexception | Holiday
id | id pid start duration draworder | pid date | date
1 | 1 1 2014-01-20 10 0 | 1 2014-01-25 | 2014-01-25
| 2 1 2014-02-17 14 2 | |
The ENDDATE for the project id 1 and phase id 1 is actually 2014-01-31 see the generated data below:
The date on the below data (and now on) is formatted as dd/mm/yyyy (Brazil format) and the value N is null
proj pha start day weekday dayexcp holiday workday
1 1 20/01/2014 20/01/2014 2 N N 1
1 1 20/01/2014 21/01/2014 3 N N 1
1 1 20/01/2014 22/01/2014 4 N N 1
1 1 20/01/2014 23/01/2014 5 N N 1
1 1 20/01/2014 24/01/2014 6 N N 1
1 1 20/01/2014 25/01/2014 7 25/01/2014 25/01/2014 1
1 1 20/01/2014 26/01/2014 1 N N 0
1 1 20/01/2014 27/01/2014 2 N 27/01/2014 0
1 1 20/01/2014 28/01/2014 3 N N 1
1 1 20/01/2014 29/01/2014 4 N N 1
To generate the above data I created a view daysOfYear with all days from 2014 and 2015 (it can be bigger or smaller, created it with two years for the year turn cases) with a CTE query if you guys want to see it let me know and I will add it here. And the following select statement:
select ph.project_id proj,
ph.id phase_id pha,
ph.start,
dy.curday day,
dy.weekday, /*weekday here is a calling to the weekday function of db2*/
doe.exceptiondate dayexcp,
h.date holiday,
case when exceptiondate is not null or (weekday not in (1,7) and h.date is null)
then 1 else 0 end as workday
from phase ph
inner join daysofyear dy
on (year(ph.start) = dy.year)
left join dayexception doe
on (ph.project_id = doe.project_id
and dy.curday = truncate(doe.exceptiondate))
left join holiday h
on (dy.curday = truncate(h.date))
where ph.project_id = 1
and ph.id = 1
and dy.year in (year(ph.start),year(ph.start)+1)
and dy.curday>=ph.start
and dy.curday<=ph.start + ((duration - 1) days)
order by ph.project_id, start, dy.curday, draworder
To solve this scenario I created the following query:
select project_id,
min(start),
max(day) + sum(case when workday=0 then 1 else 0 end) days as enddate
from project_phase_days /*(view to the above select)*/
This will return correctly:
proj start enddate
1 20/01/2014 31/01/2014
The problem I couldn't solve is if the days I'm adding (non workdays sum(case when workday=0 then 1 else 0 end) days ) to the last enddate (max(day)) is weekend days or holidays or exceptions.
See the following scenario (The duration for the below phase is 7):
proj pha start day weekday dayexcp holiday workday
81 578 14/04/2014 14/04/2014 2 N N 1
81 578 14/04/2014 15/04/2014 3 N N 1
81 578 14/04/2014 16/04/2014 4 N N 1
81 578 14/04/2014 17/04/2014 5 N N 1
81 578 14/04/2014 18/04/2014 6 N 18/04/2014 0
81 578 14/04/2014 19/04/2014 7 N 0
81 578 14/04/2014 20/04/2014 1 N 20/04/2014 0
/*the below data I added to show the problem*/
81 578 14/04/2014 21/04/2014 2 N 21/04/2014 0
81 578 14/04/2014 22/04/2014 3 N 1
81 578 14/04/2014 23/04/2014 4 N 1
81 578 14/04/2014 24/04/2014 5 N 1
With the above data my query will return
proj start enddate
81 14/04/2014 23/04/2014
But the correct result would be the enddate as 24/04/2014 that's because my query doesn't take into account if the days after the last day is weekend days or holidays (or exceptions for that matter) as you can see in the dataset above the day 21/04/2014 which is outside my duration is also a Holiday.
I also tried to create a CTE on phase table to add a day for each iteration until the duration is over but I couldn't add the exceptions nor the holidays because the DB2 won't let me add a left join on the CTE recursion. Like this:
with CTE (projectid, start, enddate, duration, level) as (
select projectid, start, start as enddate, duration, 1
from phase
where project_id=1
and phase_id=1
UNION ALL
select projectid, start, enddate + (level days), duration,
case when isWorkDay(enddate + (level days)) then level+1 else level end as level
from CTE left join dayexception on ...
left join holiday on ...
where level < duration
) select * from CTE
PS: the above query doesn't work because of the DB2 limitations and isWorkDay is just as example (it would be a case on the dayexception and holiday table values).
If you have any doubts, please just ask in the comments.
Any help would be greatly appreciated. Thanks.
How to count business days forward and backwards.
Background last Century I worked at this company that used this technique. So this is a pseudo code answer. It worked great for their purposes.
What you need is a table that contains a date column and and id column that increments by one. Fill the table with only business dates... That's the tricky part because of the observing date on another date. Like 2017-01-02 was a holiday where I work but its not really a recognized holiday AFAIK.
How to get 200 business days in the future.
Select the min(id) where date >= to current date.
Select the date where id=id+200.
How to get 200 business days in the past.
Select the min(id) from table with a date >= to current date.
Select the date with id=id-200.
Business days between.
select count(*) from myBusinessDays where "date" between startdate and enddate
Good Luck as this is pseudo code.
So, using the idea of #danny117 answer I was able to create a query to solve my problem. Not exactly his idea but it gave me directions to solve it, so I will mark it as the correct answer and this answer is to share the actual code to solve it.
First let me share the view I created to the periods. As I said I created a view daysofyear with the data of 2014 and 2015 (in my final solution I added a considerable bigger interval without impacting in the end result). Ps: the date format here is in Brazil format dd/mm/yyyy
create or replace view daysofyear as
with CTE (curday, year, weekday) as (
select a1.firstday, year(a1.firstday), dayofweek(a1.firstday)
from (select to_date('01/01/1990', 'dd/mm/yyyy') firstday
from sysibm.sysdummy1) as a1
union all
select a.curday + 1 day as sumday,
year(a.curday + 1 day),
dayofweek(a.curday + 1 day)
from CTE a
where a.curday < to_date('31/12/2050', 'dd/mm/yyyy')
)
select * from cte;
With that View I then created another view with the query on my question adding an amount of days based on my historical data (bigger phase + a considerable margin) here it is:
create or replace view project_phase_days as
select ph.project_id proj,
ph.id phase_id pha,
ph.start,
dy.curday day,
dy.weekday, /*weekday here is a calling to the weekday function of db2*/
doe.exceptiondate dayexcp,
h.date holiday,
ph.duration,
case when exceptiondate is not null or (weekday not in (1,7) and h.date is null)
then 1 else 0 end as workday
from phase ph
inner join daysofyear dy
on (year(ph.start) = dy.year)
left join dayexception doe
on (ph.project_id = doe.project_id
and dy.curday = truncate(doe.exceptiondate))
left join holiday h
on (dy.curday = truncate(h.date))
where dy.year in (year(ph.start),year(ph.start)+1)
and dy.curday>=ph.start
and dy.curday<=ph.start + ((duration - 1) days) + 200 days
/*max duration in database is 110*/
After that I then created this query:
select p.id,
a.start,
a.curday as enddate
from project p left join
(
select p1.project_id,
p1.duration,
p1.start,
p1.curday,
row_number() over (partition by p1.project_id
order by p1.project_id, p1.start, p1.curday) rorder
from project_phase_days p1
where p1.validday=1
) as a
on (p.id = a.project_id
and a.rorder = a.duration)
order by p.id, a.start
What it does is select all workdays from my view (joined with my other days view) rownumber based on the project_id ordered by project_id, start date and current day (curday) I then join with the project table to get the trick part that solved the problem which is a.rorder = a.duration
If you guys need more explanation I will be glad to provide.
I am working with SQL Server 2005.
I want to fetch monthlyTotalAppoinment and monthlyEmployeewiseTotal from appointment table in single result.
Appointment Table
appoinmentId
appoinmentDate
employeeId
I can successfully fetch monthlyTotalAppoinment and also got employeewisetotaappoinment from following query, but I want monthly employeewiseappoinment.
SELECT *
FROM (SELECT Datename(M, Dateadd(M, NUMBER - 1, 0)) AS month
FROM MASTER..SPT_VALUES
WHERE TYPE = 'p'
AND NUMBER BETWEEN 1 AND 12) months
LEFT JOIN (SELECT Datename(MM, APPOINMENTDATE) month,
Count(APPOINMENTID) AS TotalAppointment
FROM APPOINTMENT
GROUP BY Datename(MM, APPOINMENTDATE)) appoinment
ON months.MONTH = appoinment.MONTH
I am getting following output.
but I want following output
appoinementId employeeId appoinemntDate
------------- ----------- ---------------
1 4 8/25/2012 12:00:00 AM
2 4 8/25/2012 12:00:00 AM
3 4 8/25/2012 12:00:00 AM
4 4 8/25/2012 12:00:00 AM
5 4 8/25/2012 12:00:00 AM
6 4 9/25/2012 12:00:00 AM
7 2 9/25/2012 12:00:00 AM
8 2 9/25/2012 12:00:00 AM
9 2 9/25/2012 12:00:00 AM
10 4 9/25/2012 12:00:00 AM
11 4 10/25/2012 12:00:00 AM
12 2 10/25/2012 12:00:00 AM
13 4 10/25/2012 12:00:00 AM
for above data cuming output(For EmployeeId 4)
Month MonthData Totalappoinemnt TotalEmployeewiseAppointmemt
------------- ----------- -------------- ------------------------------
January.. NULL.. NULL.. NULL..
Augest Augest 5 9
September September 5 9
October October 3 9
But i want following
Month MonthData Totalappoinemnt TotalEmployeewiseAppointmemt
------------- ----------- -------------- ------------------------------
January.. NULL.. NULL.. NULL..
Augest Augest 5 5
September September 5 2
October October 3 2
I'm missing some minor points in your question, but the big issues are dealt with in this query:
SELECT t1.*,
t2.EMP_COUNT
FROM (SELECT Datename(MONTH, APPOINEMNTDATE) Month_Name,
Count(*) app_count
FROM APPOINTMENTTABLE
GROUP BY Datename(MONTH, APPOINEMNTDATE))T1
LEFT JOIN (SELECT Count(*) emp_count,
Datename(MONTH, APPOINEMNTDATE) Month_Name
FROM APPOINTMENTTABLE
WHERE EMPLOYEEID = 4
GROUP BY Datename(MONTH, APPOINEMNTDATE))T2
ON t1.MONTH_NAME = t2.MONTH_NAME
A working example can be found here.
What is missing?
Couldn't figure out why you had 2 columns for months. If there is a reason for this I'll revise the code.
I only listed months with details available. I saw that January was also in the example. If you want all 12 months to show even if no data is available, let me know and it will be added.
Didn't use the exact same column names. I'm sure you can change them if you need to :-)