Query results per month instead of a whole period - sql

What I'm getting now is a single count per patient over a span of 6 months:
PATIENT - Number of events in 6 months
1
2
3
4
5
What I want to get is:
PATIENT - Number of events in 1st month - ... 2nd month - ... 3rd month .. till 6th month
1
2
3
4
5
The results are of course the same, only divided by month instead of all in one column.
CREATE PROCEDURE PMP.TOP5BonosVencidos
(
    @pComienzo_Semestre datetime = null
)
AS
BEGIN
    select TOP 5 PACIENTE_DOCUMENTO, PACIENTE_NOMBRE, PACIENTE_APELLIDO, COUNT(*)
    from PMP.BONO_FARMACIA, PMP.PACIENTE, PMP.COMPRA_BONO
    where CAST(FECHA_VENCIMIENTO AS DATE) >= CAST(@pComienzo_Semestre AS DATE) AND
          CAST(FECHA_VENCIMIENTO AS DATE) < DATEADD(month,6,CAST(@pComienzo_Semestre AS DATE)) AND
          PMP.PACIENTE.PACIENTE_ID = COMPRA_BONO_PACIENTE_ID AND
          COMPRA_BONO_CANTIDAD_FARMACIA > 0 AND
          COMPRA_BONO_ID = COMPRA_BONO
    group by PACIENTE_DOCUMENTO, PACIENTE_NOMBRE, PACIENTE_APELLIDO
    order by COUNT(*) DESC
END
GO

You can just add month to the grouping:
select TOP 5 PACIENTE_DOCUMENTO, PACIENTE_NOMBRE, PACIENTE_APELLIDO, DATEPART(Month, FECHA_VENCIMIENTO) as [Month], COUNT(*)
from PMP.BONO_FARMACIA, PMP.PACIENTE, PMP.COMPRA_BONO
where CAST(FECHA_VENCIMIENTO AS DATE) >= CAST(@pComienzo_Semestre AS DATE) AND
CAST(FECHA_VENCIMIENTO AS DATE) < DATEADD(month,6,CAST(@pComienzo_Semestre AS DATE)) AND
PMP.PACIENTE.PACIENTE_ID = COMPRA_BONO_PACIENTE_ID AND
COMPRA_BONO_CANTIDAD_FARMACIA > 0 AND
COMPRA_BONO_ID = COMPRA_BONO
group by PACIENTE_DOCUMENTO, PACIENTE_NOMBRE, PACIENTE_APELLIDO, DATEPART(Month, FECHA_VENCIMIENTO)
I removed ORDER BY as I'm not sure what the ordering should be in this case - please add the relevant ordering yourself.
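If one column per month is wanted rather than one row per month, conditional aggregation is a common way to get it. The following is only a sketch built on the query from the question: the column aliases Mes_1 to Mes_6 and Total are made up for illustration, the sample start date is hypothetical, and it assumes the semester starts on the first day of a month (DATEDIFF(month, ...) counts calendar-month boundaries, so a mid-month start would shift the buckets).

```sql
-- Hypothetical start date; in the procedure this would be the parameter.
DECLARE @pComienzo_Semestre datetime = '2014-01-01';

SELECT TOP 5
    PACIENTE_DOCUMENTO, PACIENTE_NOMBRE, PACIENTE_APELLIDO,
    -- Each SUM counts only the rows that fall in that month of the semester.
    SUM(CASE WHEN DATEDIFF(month, @pComienzo_Semestre, FECHA_VENCIMIENTO) = 0 THEN 1 ELSE 0 END) AS Mes_1,
    SUM(CASE WHEN DATEDIFF(month, @pComienzo_Semestre, FECHA_VENCIMIENTO) = 1 THEN 1 ELSE 0 END) AS Mes_2,
    SUM(CASE WHEN DATEDIFF(month, @pComienzo_Semestre, FECHA_VENCIMIENTO) = 2 THEN 1 ELSE 0 END) AS Mes_3,
    SUM(CASE WHEN DATEDIFF(month, @pComienzo_Semestre, FECHA_VENCIMIENTO) = 3 THEN 1 ELSE 0 END) AS Mes_4,
    SUM(CASE WHEN DATEDIFF(month, @pComienzo_Semestre, FECHA_VENCIMIENTO) = 4 THEN 1 ELSE 0 END) AS Mes_5,
    SUM(CASE WHEN DATEDIFF(month, @pComienzo_Semestre, FECHA_VENCIMIENTO) = 5 THEN 1 ELSE 0 END) AS Mes_6,
    COUNT(*) AS Total
FROM PMP.BONO_FARMACIA, PMP.PACIENTE, PMP.COMPRA_BONO
WHERE CAST(FECHA_VENCIMIENTO AS DATE) >= CAST(@pComienzo_Semestre AS DATE)
  AND CAST(FECHA_VENCIMIENTO AS DATE) < DATEADD(month, 6, CAST(@pComienzo_Semestre AS DATE))
  AND PMP.PACIENTE.PACIENTE_ID = COMPRA_BONO_PACIENTE_ID
  AND COMPRA_BONO_CANTIDAD_FARMACIA > 0
  AND COMPRA_BONO_ID = COMPRA_BONO
GROUP BY PACIENTE_DOCUMENTO, PACIENTE_NOMBRE, PACIENTE_APELLIDO
ORDER BY COUNT(*) DESC;  -- same top-5 ordering as the original procedure
```

Because the grouping stays at patient level, the TOP 5 still picks the five patients with the most events over the whole semester.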

Related

Date filtering in SQL

Table below consists of 2 columns: a unique identifier and date. I am trying to build a new column of episodes, where a new episode would be triggered when >= 3 months between dates. This process should occur for each unique EMID. In the table attached, EMID ending in 98 would only have 1 episode, there are no intervals >2 months between each row in the date column. However, EMID ending in 03 would have 2 episodes, as there is almost a 3 year gap between rows 12 and 13. I have tried the following code, which doesn't work.
Table:
SELECT TOP (1000) [EMID],[Date]
CASE
WHEN DATEDIFF(month, Date, LEAD Date) <3
THEN "1"
ELSE IF DATEDIFF(month, Date, LEAD Date) BETWEEN 3 AND 5
THEN "2"
ELSE "3"
END episode
FROM [res_treatment_escalation].[dbo].[cspine42920a]
EDIT: Using Microsoft SQL Server Management Studio.
EDIT 2: I have made some progress but the output is not exactly what I am looking for. Here is the query I used:
SELECT TOP (1000) [EMID],[visit_date_01],
CASE
WHEN DATEDIFF(DAY, visit_date_01, LAG(visit_date_01,1,getdate()) OVER (partition by EMID order by EMID)) <= 90 THEN '1'
WHEN DATEDIFF(DAY, visit_date_01, LAG(visit_date_01,1,getdate()) OVER (PARTITION BY EMID ORDER BY EMID)) BETWEEN 90 AND 179 THEN '2'
WHEN DATEDIFF(DAY, visit_date_01, LAG(visit_date_01,1,getdate()) OVER (PARTITION BY EMID order by EMID)) > 180 THEN '3'
END AS EPISODE
FROM [res_treatment_escalation].[dbo].['c-spine_full_dataset_4#29#20_wi$']
Here is the actual vs expected output.
The partition by EMID does not seem to be working correctly. Every time there is a new EMID, a new episode is triggered. I am using day instead of month as the unit in DATEDIFF, but this does not seem to recognize new episodes within the same EMID.
Hmmm: Use LAG() to get the previous date. Use a date comparison to assign a flag and then a cumulative sum:
select c.*,
sum(case when prev_date > dateadd(month, -3, date) then 0 else 1 end) over
(partition by emid order by date) as episode_number
from (select c.*, lag(date) over (partition by emid order by date) as prev_date
from res_treatment_escalation.dbo.cspine42920a c
) c;
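To see the flag-and-cumulative-sum idea on concrete rows, here is a small self-contained sketch. The sample EMIDs and dates are invented for illustration (they are not the data from the question), and the column is called visit_date here only to keep the example unambiguous.

```sql
WITH visits AS (
    -- Hypothetical sample rows standing in for the real table.
    SELECT * FROM (VALUES
        (98, CAST('2019-01-10' AS date)),
        (98, CAST('2019-02-20' AS date)),
        (98, CAST('2019-04-05' AS date)),
        (3,  CAST('2016-05-01' AS date)),
        (3,  CAST('2016-06-15' AS date)),
        (3,  CAST('2019-04-01' AS date))
    ) AS v(emid, visit_date)
),
flagged AS (
    -- Previous visit of the same EMID, NULL for the first visit.
    SELECT emid, visit_date,
           LAG(visit_date) OVER (PARTITION BY emid ORDER BY visit_date) AS prev_date
    FROM visits
)
SELECT emid, visit_date,
       -- 1 marks the start of an episode (first visit, or a gap of 3+ months);
       -- the running sum turns those marks into episode numbers.
       SUM(CASE WHEN prev_date > DATEADD(month, -3, visit_date) THEN 0 ELSE 1 END)
           OVER (PARTITION BY emid ORDER BY visit_date) AS episode_number
FROM flagged
ORDER BY emid, visit_date;
```

With these rows, every visit of EMID 98 gets episode 1, while EMID 3 gets episode 1 for the two 2016 visits and episode 2 for the 2019 visit. Note that the window is ordered by the date, not by EMID, which is what the attempt in the question was missing.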

SQL - Find the two closest date after a specific date

Dear Stack Overflow community,
I am looking for the patient IDs where the two consecutive dates after the very first one are each at most 7 days apart.
So the difference between the 2nd and 1st date is <= 7 days
and the difference between the 3rd and 2nd date is <= 7 days.
Example:
ID Date
1 9/8/2014
1 9/9/2014
1 9/10/2014
2 5/31/2014
2 7/20/2014
2 9/8/2014
For patient 1, the two dates following the first one are each less than 7 days apart.
For patient 2, however, the following dates are more than 7 days apart (50 days).
I am trying to write an SQL query that outputs only the patient id "1".
Thanks for your help :)
You want to use lead(), but this is complicated because you want this only for the first three rows. I think I would go for:
select t.*
from (select t.*,
lead(date, 1) over (partition by id order by date) as next_date,
lead(date, 2) over (partition by id order by date) as next_date_2,
row_number() over (partition by id order by date) as seqnum
from t
) t
where seqnum = 1 and
next_date <= date + interval '7' day and
next_date_2 <= next_date + interval '7' day;
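For SQL Server, which does not accept the interval syntax used above, the same idea can be sketched with DATEDIFF. This version is self-contained and uses the sample rows from the question; the column name visit_date is an assumption chosen for the example.

```sql
WITH t AS (
    -- Sample rows from the question.
    SELECT * FROM (VALUES
        (1, CAST('2014-09-08' AS date)),
        (1, CAST('2014-09-09' AS date)),
        (1, CAST('2014-09-10' AS date)),
        (2, CAST('2014-05-31' AS date)),
        (2, CAST('2014-07-20' AS date)),
        (2, CAST('2014-09-08' AS date))
    ) AS v(id, visit_date)
),
ranked AS (
    SELECT id, visit_date,
           LEAD(visit_date, 1) OVER (PARTITION BY id ORDER BY visit_date) AS next_date,
           LEAD(visit_date, 2) OVER (PARTITION BY id ORDER BY visit_date) AS next_date_2,
           ROW_NUMBER()        OVER (PARTITION BY id ORDER BY visit_date) AS seqnum
    FROM t
)
SELECT id
FROM ranked
WHERE seqnum = 1                                   -- only the first date per patient
  AND DATEDIFF(day, visit_date, next_date)  <= 7   -- 2nd date minus 1st date
  AND DATEDIFF(day, next_date, next_date_2) <= 7;  -- 3rd date minus 2nd date
```

On this data the query returns only id 1. Patients with fewer than three dates drop out automatically, because a NULL from LEAD makes the comparison fail.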
You can try using the window function lag():
select * from
(
select id,date,lag(date) over(partition by id order by date) as prevdate
from tablename
)A where datediff(day,prevdate,date)<=7

SQL Comparison Query Error

I have a table with transaction history for 3 years, I need to compare the sum ( transaction) for 12 months with sum( transaction) for 4 weeks and display the customer list with the result set.
Table Transaction_History
Customer_List Transaction Date
1 200 01/01/2014
2 200 01/01/2014
1 100 10/24/2014
1 100 11/01/2014
2 200 11/01/2014
The output should have only Customer_List with 1 because sum of 12 months transactions equals sum of 1 month transaction.
I am confused about how to find the sum for 12 months and then compare with same table sum for 4 weeks.
The query below will work, except that your sample data doesn't match your expected output:
total for customer 1 for the last 12 months in your data set = 400
total for customer 1 for the last 4 weeks in your data set = 200
Or do you want to exclude the last 4 weeks, so that they are not part of the last 12 months?
In that case you would change the HAVING clause to:
having
sum(case when Dt >= '01/01/2014' and dt <='12/31/2014' then (trans) end) - sum(case when Dt >= '10/01/2014' and dt <= '11/02/2014' then (trans) end) =
sum(case when Dt >= '10/01/2014' and dt <= '11/02/2014' then (trans) end)
Of course, doing this would mean your results would be customers 1 and 2.
create table #trans_hist
(Customer_List int, Trans int, Dt Date)
insert into #trans_hist (Customer_List, Trans , Dt ) values
(1, 200, '01/01/2014'),
(2, 200, '01/01/2014'),
(1, 100, '10/24/2014'),
(1, 100, '11/01/2014'),
(2, 200, '11/01/2014')
select
Customer_List
from #trans_hist
group by
Customer_List
having
sum(case when Dt >= '01/01/2014' and dt <='12/31/2014' then (trans) end) =
sum(case when Dt >= '10/01/2014' and dt <= '11/02/2014' then (trans) end)
drop table #trans_hist
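The hard-coded date literals can also be replaced by ranges computed from a reference date. The sketch below shows both totals side by side so they can be checked before adding a HAVING clause. The variable @as_of and its value are assumptions made for this example, as is reading "4 weeks" as 28 days.

```sql
DECLARE @as_of date = '2014-11-02';  -- hypothetical reference date, not from the question

SELECT Customer_List,
       SUM(Trans) AS total_12_months,
       -- Only rows from the last 28 days contribute to the second total.
       SUM(CASE WHEN Dt > DATEADD(day, -28, @as_of) THEN Trans ELSE 0 END) AS total_4_weeks
FROM (VALUES
        (1, 200, CAST('2014-01-01' AS date)),
        (2, 200, CAST('2014-01-01' AS date)),
        (1, 100, CAST('2014-10-24' AS date)),
        (1, 100, CAST('2014-11-01' AS date)),
        (2, 200, CAST('2014-11-01' AS date))
     ) AS t(Customer_List, Trans, Dt)
WHERE Dt > DATEADD(month, -12, @as_of)   -- restrict to the last 12 months
  AND Dt <= @as_of
GROUP BY Customer_List
ORDER BY Customer_List;
```

With the sample rows this gives 400 and 200 for both customers, which agrees with the totals quoted above for customer 1 and shows why an equality test between the two sums returns no rows on this data.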
I suggest a self join.
select yourfields
from yourtable twelvemonths join yourtable fourweeks on something
where fourweeks.something is within a four week period
and twelvemonths.something is within a 12 month period
You should be able to work out the details.
If your transactions are always positive and you want customers whose 12-month totals equal the 4-week total, then you want customers who have transactions in the past four weeks but not in the preceding 12 months - 4 weeks.
You can get this more directly using aggregation and a having clause. The logic is to check for any transactions in the past year that occurred before the previous 4 weeks:
select Customer_List
from Transaction_History
where date >= dateadd(month, -12, getdate())
group by Customer_List
having min(date) >= dateadd(day, -4 * 7, getdate());
Look here for methods to aggregate by month, year, etc.
http://weblogs.sqlteam.com/jeffs/archive/2007/09/10/group-by-month-sql.aspx

I want to get the data as zero for the date where data's are not present

Query:
select logindate, count(*) as people
from authortable
group by logindate
order by logindate desc nulls last
Output:
logindate people
6-oct-2014 5
5-oct-2014 7
4-oct-2014 4
3-oct-2014 8
2-oct-2014 0
1-oct-2014 0
30-sept-2014 5
29-sept-2014 7
28-sept-2014 4
27-sept-2014 8
I am getting the data I need, but if there is no login on a particular day, I want the query to return 0 for that day, as for 1 and 2 Oct in the output shown above.
For now I am not getting the rows for 1 and 2 Oct, as no data is present for those days.
Here is an example which generates the missing date rows:
with daterange as
(select min(logindate) startdate
, max(logindate) enddate
from authortable)
, dates as
(select startdate + (level-1) logindate
from daterange
connect by startdate + (level-1) <= enddate)
, logincount as
(select logindate
, count(*) people
from authortable
group by logindate)
select d.logindate
, nvl(l.people, 0) people
from logincount l
right outer join dates d
on (d.logindate = l.logindate)
order by d.logindate desc nulls last
EDIT: Added missing group_by (as noted by nop77svk)
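The answer above relies on Oracle's connect by. Since most of the other queries on this page are written for SQL Server, here is a rough equivalent sketch that builds the calendar with a recursive CTE instead. The sample rows are invented, and the rewrite is an assumption about how one might port the idea, not part of the original answer.

```sql
WITH authortable AS (
    -- Hypothetical sample rows: two logins on 30 Sep, one on 3 Oct.
    SELECT * FROM (VALUES
        (CAST('2014-09-30' AS date)),
        (CAST('2014-09-30' AS date)),
        (CAST('2014-10-03' AS date))
    ) AS v(logindate)
),
dates AS (
    -- Anchor: first login date, carrying the last login date along.
    SELECT MIN(logindate) AS logindate, MAX(logindate) AS enddate
    FROM authortable
    UNION ALL
    -- Recursive step: add one day until the last login date is reached.
    SELECT DATEADD(day, 1, logindate), enddate
    FROM dates
    WHERE logindate < enddate
)
SELECT d.logindate,
       COUNT(a.logindate) AS people   -- counts 0 when no login matches the day
FROM dates d
LEFT JOIN authortable a ON a.logindate = d.logindate
GROUP BY d.logindate
ORDER BY d.logindate DESC
OPTION (MAXRECURSION 0);  -- lift the default limit of 100 recursion levels
```

On the sample rows this returns 1 for 3 Oct, 0 for 2 Oct and 1 Oct, and 2 for 30 Sep. COUNT over the joined column is used instead of COUNT(*) so that days without logins yield 0 rather than 1.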

Growth Of Distinct Users Per Week

I need to get a report that shows distinct users per week to show user growth per week, but I need it to show cumulative distinct users.
So if I have 5 weeks of data, I want to show:
Distinct users from week 0 through week 1
Distinct users from week 0 through week 2
Distinct users from week 0 through week 3
Distinct users from week 0 through week 4
Distinct users from week 0 through week 5
I have a whole year's worth of data. The only way I know how to do this is to literally query the time ranges adjusting a week out at a time and this is very tedious. I just can't figure out how I could query everything from week 0 through week 1 all the way to week 0 through week 52.
EDIT - What I have so far:
select count(distinct user_id) as count
from tracking
where datepart(wk,login_dt_tm) >= 0 and datepart(wk,login_dt_tm) <= 1
Then I take that number, record it, and update it to -- datepart(wk,login_dt_tm) <= 2. And so on until I have all the weeks. That way I can chart a nice growth chart by week.
This is tedious and there has to be another way.
UPDATE-
I used the solution provided by @siyual but updated it to use a table variable so I could get all the results in one output.
Declare @Week Int = 0
Declare @Totals Table
(
WeekNum int,
UserCount int
)
While @Week < 52
Begin
insert into @Totals (WeekNum,UserCount)
select @Week,count(distinct user_id) as count
from tracking
where datepart(wk,login_dt_tm) >= @Week and datepart(wk,login_dt_tm) <= (@Week + 1)
Set @Week += 1
End
Select * from @Totals
Why not something like:
select count(distinct user_id) as count, datepart(wk, login_dt_tm) as week
from tracking
group by datepart(wk,login_dt_tm)
order by week
You could try something like this:
Declare @Week Int = 1
While @Week <= 52
Begin
select count(distinct user_id) as count
from tracking
where datepart(wk,login_dt_tm) >= 0 and datepart(wk,login_dt_tm) <= @Week
Set @Week += 1
End
Just for the record, I would do this in one statement, using a recursive CTE to generate the numbers from 1 to 52 (you could also use a numbers table):
with numbers as (
select 1 as n
union all
select n + 1
from numbers
where n < 52
)
select n.n as week_num, count(distinct t.user_id) as count
from tracking t join
numbers n
on datepart(wk, t.login_dt_tm) >= 0 and datepart(wk, t.login_dt_tm) <= n.n
group by n.n
order by n.n;
Seems easier to put it all in one query.
SELECT
week_num,
distinct_count
FROM (
select distinct
datepart(wk,login_dt_tm) week_num
from #tracking
) t_week
CROSS APPLY (
select
count(distinct user_id) distinct_count
from #tracking
where datepart(wk,login_dt_tm) between 0 and t_week.week_num
) t_count
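Another way to look at the problem, offered here only as an alternative sketch and not taken from the answers above: a user adds to the cumulative distinct count exactly once, in the week of their first login. So one can find each user's first week, count new users per week, and take a running total. The sample rows are invented, and the running-total syntax assumes SQL Server 2012 or later.

```sql
WITH tracking AS (
    -- Hypothetical sample rows standing in for the real tracking table.
    SELECT * FROM (VALUES
        (1, CAST('2014-01-02' AS datetime)),
        (2, CAST('2014-01-03' AS datetime)),
        (1, CAST('2014-01-08' AS datetime)),
        (3, CAST('2014-01-09' AS datetime)),
        (2, CAST('2014-01-15' AS datetime))
    ) AS v(user_id, login_dt_tm)
),
first_seen AS (
    -- The week in which each user logged in for the first time.
    SELECT user_id, MIN(DATEPART(wk, login_dt_tm)) AS first_week
    FROM tracking
    GROUP BY user_id
),
new_per_week AS (
    -- How many users appear for the first time in each week.
    SELECT first_week AS week_num, COUNT(*) AS new_users
    FROM first_seen
    GROUP BY first_week
)
SELECT week_num,
       -- Running total of new users = cumulative distinct users so far.
       SUM(new_users) OVER (ORDER BY week_num ROWS UNBOUNDED PRECEDING) AS cumulative_users
FROM new_per_week
ORDER BY week_num;
```

Two caveats: week numbers from DATEPART(wk, ...) depend on the session's DATEFIRST setting, and a week in which no new user appears produces no row here, so a numbers table or calendar join would be needed if every week must be listed. Like the other queries in this thread, it also assumes all data falls within a single calendar year.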