Missing a single day - sql

My database has two tables, a car table and a wheel table.
I'm trying to find the number of wheels that meet a certain condition over a range of days, but some days are not included in the output.
Here is the query:
USE CarDB
SELECT MONTH(c.DateTime1) 'Month',
DAY(c.DateTime1) 'Day',
COUNT(w.ID) 'Wheels'
FROM tblCar c
INNER JOIN tblWheel w
ON c.ID = w.CarID
WHERE c.DateTime1 BETWEEN '05/01/2013' AND '06/04/2013'
AND w.Measurement < 18
GROUP BY MONTH(c.DateTime1), DAY(c.DateTime1)
ORDER BY [Month], [Day]
GO
The output results seem to be correct, but days with 0 wheels do not show up. For example:
Sample Current Output:
Month Day Wheels
2 1 7
2 2 4
2 3 2 -- 2/4 is missing
2 5 9
Sample Desired Output:
Month Day Wheels
2 1 7
2 2 4
2 3 2
2 4 0
2 5 9
I also tried a left join but it didn't seem to work.

You were on the right track with a LEFT JOIN.
Try running your query with that outer join but without the WHERE clause. Notice anything?
The join is applied first, and then the WHERE clause removes every row that doesn't match the criteria, including the rows where the wheel columns are NULL because no wheel matched. All of this happens before the GROUP BY, so those cars, and with them the days that have no qualifying wheels, are excluded.
Here's one method for you:
SELECT Year(cars.datetime1) As the_year
, Month(cars.datetime1) As the_month
, Day(cars.datetime1) As the_day
, Count(wheels.id) As wheels
FROM (
SELECT id
, datetime1
FROM tblCar
WHERE datetime1 BETWEEN '2013-01-05' AND '2013-04-06'
) As cars
LEFT
JOIN tblWheel As wheels
ON wheels.carid = cars.id
AND wheels.measurement < 18 -- filter wheels in the join, not in WHERE
GROUP BY Year(cars.datetime1), Month(cars.datetime1), Day(cars.datetime1)
What's different this time round is that we're limiting the results of the car table before we join to the wheels table.
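To make the WHERE-versus-ON difference concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than SQL Server. The table and column names follow the question, but the three cars and their wheels are made-up data, and the dates are stored as ISO strings purely for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblCar   (ID INTEGER PRIMARY KEY, DateTime1 TEXT);
    CREATE TABLE tblWheel (ID INTEGER PRIMARY KEY, CarID INTEGER, Measurement REAL);
    -- Hypothetical data: car 1 has two qualifying wheels, car 2 has one
    -- wheel that fails the filter, car 3 has no wheels at all.
    INSERT INTO tblCar VALUES (1, '2013-02-03'), (2, '2013-02-04'), (3, '2013-02-05');
    INSERT INTO tblWheel VALUES (1, 1, 15), (2, 1, 17), (3, 2, 20);
""")

# Filter in WHERE: rows whose wheel is NULL or >= 18 are dropped after the
# join, so days without a qualifying wheel disappear from the result.
where_rows = conn.execute("""
    SELECT c.DateTime1, COUNT(w.ID)
    FROM tblCar c
    LEFT JOIN tblWheel w ON w.CarID = c.ID
    WHERE w.Measurement < 18
    GROUP BY c.DateTime1
    ORDER BY c.DateTime1
""").fetchall()

# Filter in ON: non-qualifying wheels simply fail to match, the car row
# survives with NULL wheel columns, and COUNT(w.ID) reports 0.
on_rows = conn.execute("""
    SELECT c.DateTime1, COUNT(w.ID)
    FROM tblCar c
    LEFT JOIN tblWheel w ON w.CarID = c.ID AND w.Measurement < 18
    GROUP BY c.DateTime1
    ORDER BY c.DateTime1
""").fetchall()

print(where_rows)  # [('2013-02-03', 2)]
print(on_rows)     # [('2013-02-03', 2), ('2013-02-04', 0), ('2013-02-05', 0)]
```

Note that a day with no cars at all would still be missing in both variants; covering those days needs a calendar of dates to join from.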

You probably want to use a LEFT OUTER JOIN:
USE CarDB
SELECT MONTH (c.DateTime1) 'Month', DAY (c.DateTime1) 'Day', COUNT (w.ID) 'Wheels'
FROM tblCar c LEFT OUTER JOIN tblWheel w ON c.ID = w.CarID
WHERE c.DateTime1 BETWEEN '05/01/2013' AND '06/04/2013'
AND (w.Measurement IS NULL OR w.Measurement < 18)
GROUP BY MONTH (c.DateTime1), DAY (c.DateTime1)
ORDER BY [Month], [Day]
GO
And then you need to adapt the WHERE condition, since you want to keep the rows where w.Measurement is NULL because of the OUTER JOIN.

Remove the join and the GROUP BY, and change your select to use a correlated subquery (this returns one row per car rather than one row per day):
SELECT MONTH(c.DateTime1) 'Month', DAY(c.DateTime1) 'Day', (SELECT COUNT(*) FROM tblWheel w WHERE w.CarID = c.ID AND w.Measurement < 18) 'Wheels'
FROM tblCar c

Related

SQL: How to get date range counts as rows instead of columns?

The generalized use case I have is to get record counts for a number of date ranges across one or more tables.
My specific use case is this:
For a patient encounter table (enc) and a pregnancy table (preg), get the counts of patients seen 9 months before the expected due date, 12 months before, 15 months before, etc.
I can get the data I need by doing an outer join on the encounter table with a where clause that boxes the time constraints. However, this seems to be inefficient, a lot of typing, and the data are not in the form I'd like (I would like each time window to be a row instead of a column).
Below is the query I currently have. How can I rewrite it to get the data row wise instead of column wise?
select
preg.org,
count(distinct nine.patient_id) `Pre-Delivery Visits (09 Months)`,
count(distinct twelve.patient_id) `Pre-Delivery Visits (12 Months)`,
count(distinct all.patient_id) `Pre-Delivery Visits (All)`,
count(distinct preg.patient_id) `All Pregnancies`
from
pregnancy preg
left outer join enc nine on preg.patient_id = nine.patient_id and nine.encounter_date < preg.est_delivery_date and nine.encounter_date > date_add(preg.est_delivery_date, (-30*9))
left outer join enc twelve on preg.patient_id = twelve.patient_id and twelve.encounter_date < preg.est_delivery_date and twelve.encounter_date > date_add(preg.est_delivery_date, (-30*12))
left outer join enc all on preg.patient_id = all.patient_id and all.encounter_date < preg.est_delivery_date
group by 1
;
Data are returned in this format:
org (09 Months) (12 months) (All) (All Pregnancies)
org x 1 10 15 20
org y 2 22 23 24
org z 200 202 230 250
I'd like to get the data like this
org time_box count
org x 09 mon 1
org y 09 mon 2
org z 09 mon 202
org x 12 mon 10
...
etc.
I'm not sure if this does what you want. This calculates non-overlapping groups, so 12 months is really 9-12 months:
select (case when e.encounter_date > date_add(p.est_delivery_date, (-30*9))
then 'nine'
when e.encounter_date > date_add(p.est_delivery_date, (-30*12))
then 'twelve'
when e.encounter_date is not null
then 'all pre-delivery'
else 'all pregnancy'
end) as grp,
count(distinct p.patient_id)
from pregnancy p left join
enc e
on e.patient_id = p.patient_id and
e.encounter_date < p.est_delivery_date
group by grp;
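If the overlapping (cumulative) windows from the question are wanted as rows, one possible approach is to cross join the pregnancies to a small lookup table of time boxes. The sketch below runs on SQLite through Python's sqlite3 module, so the date arithmetic differs from the asker's dialect. The `time_box` table, its `label` and `days` columns, and the sample rows are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pregnancy (patient_id INTEGER, org TEXT, est_delivery_date TEXT);
    CREATE TABLE enc       (patient_id INTEGER, encounter_date TEXT);
    -- Hypothetical lookup table: one row per time box, width in days.
    CREATE TABLE time_box  (label TEXT, days INTEGER);

    INSERT INTO pregnancy VALUES (1, 'org x', '2020-12-01'), (2, 'org x', '2020-12-01');
    INSERT INTO enc VALUES (1, '2020-06-01'),   -- 183 days before delivery
                           (2, '2019-12-15');   -- 352 days before delivery
    INSERT INTO time_box VALUES ('09 mon', 270), ('12 mon', 360);
""")

# Every pregnancy is paired with every time box; an encounter only matches
# when it falls inside that box, so each box becomes its own output row.
rows = conn.execute("""
    SELECT p.org, b.label, COUNT(DISTINCT e.patient_id) AS cnt
    FROM pregnancy p
    CROSS JOIN time_box b
    LEFT JOIN enc e
           ON e.patient_id = p.patient_id
          AND e.encounter_date < p.est_delivery_date
          AND e.encounter_date > date(p.est_delivery_date, '-' || b.days || ' days')
    GROUP BY p.org, b.label, b.days
    ORDER BY b.days, p.org
""").fetchall()

print(rows)  # [('org x', '09 mon', 1), ('org x', '12 mon', 2)]
```

Adding another window then means inserting one more row into the lookup table instead of writing another join.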

Fill in blank dates for rolling average - CTE in Snowflake

I have two tables – activity and purchase
Activity table:
user_id date videos_watched
1 2020-01-02 3
1 2020-01-04 5
1 2020-01-07 5
Purchase table:
user_id purchase_date
1 2020-01-01
2 2020-02-02
What I would like to do is get a 30-day rolling average, counted from the purchase date, of how many videos have been watched.
The base query is like this:
SELECT
DATEDIFF(DAY, p.purchase_date, a.date) AS day_since_purchase,
AVG(A.VIDEOS_WATCHED)
FROM PURCHASE P
LEFT OUTER JOIN ACTIVITY A ON P.USER_ID = A.USER_ID AND
A.DATE >= P.PURCHASE_DATE AND A.DATE <= DATEADD(DAY, 30, P.PURCHASE_DATE)
GROUP BY 1;
However, the Activity table only has records for each day a video has been logged. I would like to fill in the blanks for days a video has not been viewed.
I have started to look into using a CTE like this:
WITH cte AS (
SELECT date('2020-01-01') as fdate
UNION ALL
SELECT CAST(DATEADD(day,1,fdate) as date)
FROM cte
WHERE fdate < date('2020-04-01')
) select * from cte
cross join purchases p
left outer join activity a
on p.user_id = a.user_id
and a.fdate = p.purchase_date
and a.date >= p.purchase_date and a.date <= dateadd(day, 30, p.purchase_date)
The end goal is to have something like this:
days_since_purchase videos_watched
1 3
2 0 --CTE coalesce inserted value
3 0
4 5
Been trying for the last couple of hours to get it right, but still can't really get the hang of it.
If you want to fill in the gaps in the result set, then I think you should be generating integers rather than dates:
WITH cte AS (
SELECT 1 as day_since_purchase
UNION ALL
SELECT 1 + day_since_purchase
FROM cte
WHERE day_since_purchase < 4
)
SELECT cte.day_since_purchase, COALESCE(avg_videos_viewed, 0)
FROM cte LEFT JOIN
(SELECT DATEDIFF(DAY, p.purchase_date, a.date) AS day_since_purchase,
AVG(A.VIDEOS_WATCHED) as avg_videos_viewed
FROM purchases p JOIN
activity a
ON p.user_id = a.user_id AND
a.fdate = p.purchase_date AND
a.date >= p.purchase_date AND
a.date <= dateadd(day, 30, p.purchase_date)
GROUP BY 1
) pa
ON pa.day_since_purchase = cte.day_since_purchase;
You can use a recursive query to generate the 30 days following each purchase, then bring the activity table:
with cte as (
select
purchase_date,
user_id,
0 days_since_purchase,
purchase_date dt
from purchases
union all
select
purchase_date,
user_id,
days_since_purchase + 1,
dateadd(day, days_since_purchase + 1, purchase_date)
from cte
where days_since_purchase < 30
)
select
c.days_since_purchase,
avg(coalesce(a.videos_watched, 0)) avg_videos_watched
from cte c
left join activity a
on a.user_id = c.user_id
and a.fdate = c.purchase_date
and a.date = c.dt
group by c.days_since_purchase
Your question is unclear on whether the activity table has a column that stores the purchase date each row relates to. Your query references a column fdate, but your sample data does not show it. I used that column in the query (without such a column, you might end up counting the same activity under several purchases).
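For anyone who wants to try the gap-filling idea locally, here is a rough sketch on SQLite via Python's sqlite3 module. It is not Snowflake syntax: SQLite's `date(..., '+N days')` stands in for DATEADD. The sample rows are the ones from the question, restricted to user 1, the window is cut to 7 days to keep the output short, and there is no fdate column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE purchases (user_id INTEGER, purchase_date TEXT);
    CREATE TABLE activity  (user_id INTEGER, date TEXT, videos_watched INTEGER);
    INSERT INTO purchases VALUES (1, '2020-01-01');
    INSERT INTO activity VALUES (1, '2020-01-02', 3),
                                (1, '2020-01-04', 5),
                                (1, '2020-01-07', 5);
""")

# The recursive CTE emits one row per purchase and day offset (0..6), so
# days without activity still exist and COALESCE can turn them into 0.
rows = conn.execute("""
    WITH RECURSIVE days(user_id, purchase_date, days_since_purchase) AS (
        SELECT user_id, purchase_date, 0 FROM purchases
        UNION ALL
        SELECT user_id, purchase_date, days_since_purchase + 1
        FROM days
        WHERE days_since_purchase < 6
    )
    SELECT d.days_since_purchase,
           AVG(COALESCE(a.videos_watched, 0)) AS avg_videos_watched
    FROM days d
    LEFT JOIN activity a
           ON a.user_id = d.user_id
          AND a.date = date(d.purchase_date, '+' || d.days_since_purchase || ' days')
    GROUP BY d.days_since_purchase
    ORDER BY d.days_since_purchase
""").fetchall()

print(rows)
# [(0, 0.0), (1, 3.0), (2, 0.0), (3, 5.0), (4, 0.0), (5, 0.0), (6, 5.0)]
```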

Finding subscriber counts where End Date and Start Date are in same month

I want to get counts of subscribers who terminate and re-enroll in the same month when the Termination reason is 'xxx' for past 12 months in a row.
In the above example, subscriber '1245' has been terminated for 'xxx' reason but re-enrolled again in the same month. I want counts of how many times this happens month by month for last n number of months.
I tried below code to get PersonIDs but having trouble getting counts in a month.
SELECT DISTINCT PersonID FROM Membership A
INNER JOIN (SELECT StartYrMo FROM Membership) B
ON A.EndYrMo = B.StartYrMo
WHERE A.TermReason = 'xxx'
ORDER BY PersonID
EDIT: There is more to it. I want all the PersonIDs who do so for at least 6 months in a row. Meaning: person 'A' is terminated for reason 'xxx' in 201901, and is terminated again for the same 'xxx' reason in each month from 201902 through 201908, which meets the minimum of 6 months. I want the IDs of everyone who does this.
This should do:
SELECT DISTINCT A.PersonID FROM Membership A
INNER JOIN Membership B ON A.PersonID = B.PersonID
INNER JOIN Membership C ON A.PersonID = C.PersonID
INNER JOIN Membership D ON A.PersonID = D.PersonID
WHERE A.TermReason = 'xxx'
AND B.TermReason = 'xxx'
AND C.TermReason = 'xxx'
AND D.TermReason = 'xxx'
AND B.StartMonth = A.EndMonth
-- following assumes minimum date in your dataset is '201101'.
-- '+ 89' returns '201201' instead of '201113' when EndMonth is '201112'.
AND C.StartMonth = CASE WHEN A.EndMonth IN ('201112', '201212', '201312', '201412', '201512', '201612', '201712', '201812', '201912')
THEN A.EndMonth + 89 ELSE A.EndMonth + 1 END
AND D.StartMonth = CASE WHEN A.EndMonth IN ('201112', '201212', '201312', '201412', '201512', '201612', '201712', '201812', '201912', '201111', '201211', '201311', '201411', '201511', '201611', '201711', '201811', '201911')
THEN A.EndMonth + 90 ELSE A.EndMonth + 2 END
-- repeat for additional months as needed.
ORDER BY PersonID
To reduce redundancy, this code only looks at 3 consecutive months. If it returns no results, you do not need to write the additional lines for the other months (if this is a one-time request): if no one qualifies for 3 consecutive months, no one can qualify for 6 consecutive months. I hope this helps :)
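The `+ 89` and `+ 90` adjustments in the query above are just year rollover on a YYYYMM number. As a small illustration of that arithmetic, here is a sketch of a helper that avoids hard-coding the December and November values; the function name `add_months` is made up for this example.

```python
def add_months(yyyymm: int, n: int) -> int:
    """Return the YYYYMM value that lies n months after yyyymm."""
    year, month = divmod(yyyymm, 100)
    # Use a zero-based month index so that December rolls over into January.
    index = year * 12 + (month - 1) + n
    return (index // 12) * 100 + (index % 12) + 1


print(add_months(201112, 1))  # 201201, the same as 201112 + 89
print(add_months(201111, 2))  # 201201, the same as 201111 + 90
print(add_months(201905, 1))  # 201906
```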
To get resubscribed persons:
select m2.PersonID, m2.StartMonth as ResubscriptionMonth
from Membership m1 inner join Membership m2
on m1.PersonID = m2.PersonID
where m1.TermReason = 'xxx' and m1.EndMonth = m2.StartMonth
To get numbers by month:
select Resubscriptions.ResubscriptionMonth, Count(*)
from (
select m2.PersonID, m2.StartMonth as ResubscriptionMonth
from Membership m1 inner join Membership m2
on m1.PersonID = m2.PersonID
where m1.TermReason = 'xxx' and m1.EndMonth = m2.StartMonth) as Resubscriptions
group by Resubscriptions.ResubscriptionMonth
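A quick way to check the per-month counting is to run it on a few made-up rows. The sketch below uses SQLite through Python's sqlite3 module; the Membership rows are invented, and it counts distinct persons per month, which is a slight variation on the Count(*) above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Membership (PersonID INTEGER, StartMonth TEXT,
                             EndMonth TEXT, TermReason TEXT);
    -- Hypothetical rows: 1245 re-enrolls in 201901 and again in 201902,
    -- 2000 re-enrolls in 201901, 3000 was terminated for another reason.
    INSERT INTO Membership VALUES
        (1245, '201812', '201901', 'xxx'),
        (1245, '201901', '201902', 'xxx'),
        (1245, '201902', NULL,     NULL),
        (2000, '201811', '201901', 'xxx'),
        (2000, '201901', NULL,     NULL),
        (3000, '201810', '201901', 'yyy'),
        (3000, '201901', NULL,     NULL);
""")

# Self-join: m1 is the terminated membership, m2 the membership that the
# same person starts in the month m1 ended.
rows = conn.execute("""
    SELECT m2.StartMonth AS ResubscriptionMonth,
           COUNT(DISTINCT m2.PersonID) AS Persons
    FROM Membership m1
    JOIN Membership m2
      ON m1.PersonID = m2.PersonID
     AND m1.EndMonth = m2.StartMonth
    WHERE m1.TermReason = 'xxx'
    GROUP BY m2.StartMonth
    ORDER BY m2.StartMonth
""").fetchall()

print(rows)  # [('201901', 2), ('201902', 1)]
```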

Trouble running a complex query in sql?

I am pretty new to SQL Server and just started playing with it. I am trying to create a table that shows attendance percentage by department.
So first i run this query:
SELECT CrewDesc, COUNT(*)
FROM database.emp
INNER JOIN database.crew on sim1 = sim2
GROUP BY CrewDesc
This gives a table like this:
Accounting 10
Marketing 5
Economics 20
Engineering 5
Machinery 5
Tech Support 10
Then i run another query:
SELECT DeptDescription, COUNT(*)
FROM database.Attendee
GROUP BY DeptDescription
This gives me the result of all the people that have attended meeting something like
Accounting 8
Marketing 5
Economics 15
Engineering 10
Tech Support 8
Then I get the current week of the year with SELECT Datepart(ww, GetDate()) as CurrentWeek. To keep this example easy, let's assume this will be week "2".
Now, the way I was going to build this was to create a table for each step, but that seems like a waste. Is there a way to combine the two results in one query? In the end I would like a table like this:
Dept          Total#  Attd  Week  (Total*Week)  Attd/(Total*week)%
Accounting    10      8     2     20            8/20
Marketing     5       5     2     10            5/10
Economics     20      15    2     40            15/40
Engineering   5       10    2     10            10/10
Machinery     5       NULL  2     10            0/10
Tech Support  10      8     2     20            8/20
Ok, note that my recommendation below is based on your exact existing queries - there are certainly other ways to construct this that may be more performant, but functionally this should work for your requirement. Also, it illustrates the key features of different join types that happen to be relevant for your request, as well as inline views (aka nested queries), which are a super-powerful technique in the SQL language as a whole.
select t1.CrewDesc, t1.Total, t2.Attd, t3.Week,
(t1.Total*t3.Week) as Total_x_Week,
case when isnull(t1.Total*t3.Week, 0) = 0 then 0 else 1.0 * isnull(t2.Attd, 0) / (t1.Total*t3.Week) end as PercentageAttd /* 1.0 * avoids integer division */
from (
SELECT CrewDesc, COUNT(*) AS Total
FROM database.emp INNER JOIN database.crew on sim1 = sim2
GROUP BY CrewDesc
) t1
left outer join /* left outer to keep all rows from t1 */ (
SELECT DeptDescription, COUNT(*) AS Attd
FROM database.Attendee GROUP BY DeptDescription
) t2
on t1.CrewDesc = t2.DeptDescription
cross join /* useful when adding a scalar value to all rows */ (
SELECT Datepart(ww, GetDate()) as Week
) t3
order by t1.CrewDesc
Good luck!
Try something like this
SELECT COALESCE(a.crewdesc,b.deptdescription),
a.total,
b.attd,
Datepart(ww, Getdate()) AS week,
a.total * Datepart(ww, Getdate()),
1.0 * b.attd/(a.total*Datepart(ww, Getdate()))
FROM (query 1) a
FULL OUTER JOIN (query 2) b
ON a.crewdesc = b.deptdescription
WITH Total AS ( SELECT CrewDesc, COUNT(*) AS [Count]
FROM database.emp
INNER JOIN database.crew on sim1 = sim2
GROUP BY CrewDesc
),
Attd AS ( SELECT DeptDescription, COUNT(*) AS [Count]
FROM database.Attendee
GROUP BY DeptDescription
)
SELECT COALESCE(CrewDesc,DeptDescription) AS [Dept],
Total.[Count] AS [Total#],Attd.[Count] AS [Attd],
Total.[Count] * Datepart(ww, GetDate()) AS [(Total*Week)],
CAST(ISNULL(Attd.[Count], 0) AS VARCHAR(10))+'/'+ CAST((Total.[Count] * Datepart(ww, GetDate()))AS VARCHAR(10)) AS [Attd/(Total*week)%]
FROM Total LEFT JOIN Attd ON Total.CrewDesc = Attd.DeptDescription
I'm assuming your queries are correct. You give no real information about your model, so I have no way to know. They look wrong, since the same data is called CrewDesc in one table and DeptDescription in another. The join sim1 = sim2 also seems very strange to me. In any case, given the queries you posted, this will work.
With TAttend as
(
SELECT CrewDesc, COUNT(*) as TotalNum
FROM database.emp
INNER JOIN database.crew on sim1 = sim2
GROUP BY CrewDesc
), Attend as
(
SELECT DeptDescription, COUNT(*) as Attd
FROM database.Attendee
GROUP BY DeptDescription
)
SELECT CrewDesc as Dept, TotalNum, ISNULL(Attd, 0) as Attd, Datepart(ww, GetDate()) as Week,
CASE WHEN ISNULL(Attd, 0) = 0 THEN 0
ELSE 1.0 * Attd / (TotalNum * Datepart(ww, GetDate())) END AS [Percent]
FROM TAttend
LEFT JOIN Attend on CrewDesc = DeptDescription
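To see the LEFT JOIN and the division behaviour side by side, here is a small sketch on SQLite via Python's sqlite3 module. The `headcount` and `attended` tables are hypothetical stand-ins for the results of the two GROUP BY queries, and the week number is passed in as a parameter instead of coming from Datepart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins for the two aggregated query results.
    CREATE TABLE headcount (dept TEXT, total INTEGER);
    CREATE TABLE attended  (dept TEXT, attd INTEGER);
    INSERT INTO headcount VALUES ('Accounting', 10), ('Machinery', 5);
    INSERT INTO attended  VALUES ('Accounting', 8);
""")

# LEFT JOIN keeps Machinery even though nobody attended. Multiplying by
# 1.0 forces real division; the plain integer division truncates to 0.
rows = conn.execute("""
    SELECT h.dept,
           h.total,
           a.attd,
           h.total * :week                                  AS total_x_week,
           COALESCE(a.attd, 0) / (h.total * :week)          AS int_ratio,
           1.0 * COALESCE(a.attd, 0) / (h.total * :week)    AS ratio
    FROM headcount h
    LEFT JOIN attended a ON a.dept = h.dept
    ORDER BY h.dept
""", {"week": 2}).fetchall()

for row in rows:
    print(row)
# ('Accounting', 10, 8, 20, 0, 0.4)
# ('Machinery', 5, None, 10, 0, 0.0)
```

SQL Server truncates integer division in the same way, which is why the queries need a decimal operand somewhere in the expression.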

SQL Query: Calculating the deltas in a time series

For a development aid project I am helping a small town in Nicaragua improve its water network administration.
There are about 150 households, and every month a person checks each meter and charges the household according to the water consumed (this month's reading minus last month's reading). Today everything is done on paper, and I would like to digitize the administration to avoid calculation errors.
I have an MS Access Table in mind - e.g.:
*HousholdID* *Date* *Meter*
0 1/1/2013 100
1 1/1/2013 130
0 1/2/2013 120
1 1/2/2013 140
...
From this data I would like to create a query that calculates the consumed water (the meter-difference of one household between two months)
*HouseholdID* *Date* *Consumption*
0 1/2/2013 20
1 1/2/2013 10
...
Please, how would I approach this problem?
This query returns every date with previous date, even if there are missing months:
SELECT TabPrev.*, Tab.Meter as PrevMeter, TabPrev.Meter-Tab.Meter as Diff
FROM (
SELECT
Tab.HousholdID,
Tab.Data,
Max(Tab_1.Data) AS PrevData,
Tab.Meter
FROM
Tab INNER JOIN Tab AS Tab_1 ON Tab.HousholdID = Tab_1.HousholdID
AND Tab.Data > Tab_1.Data
GROUP BY Tab.HousholdID, Tab.Data, Tab.Meter) As TabPrev
INNER JOIN Tab
ON TabPrev.HousholdID = Tab.HousholdID
AND TabPrev.PrevData=Tab.Data
Here's the result:
HousholdID Data PrevData Meter PrevMeter Diff
----------------------------------------------------------
0 01/02/2013 01/01/2013 120 100 20
1 01/02/2013 01/01/2013 140 130 10
The query above returns every delta, for every household, for every month (or for every interval). If you are only interested in the latest delta, you could use this query:
SELECT
MaxTab.*,
TabCurr.Meter as CurrMeter,
TabPrev.Meter as PrevMeter,
TabCurr.Meter-TabPrev.Meter as Diff
FROM ((
SELECT
Tab.HousholdID,
Max(Tab.Data) AS CurrData,
Max(Tab_1.Data) AS PrevData
FROM
Tab INNER JOIN Tab AS Tab_1
ON Tab.HousholdID = Tab_1.HousholdID
AND Tab.Data > Tab_1.Data
GROUP BY Tab.HousholdID) As MaxTab
INNER JOIN Tab TabPrev
ON TabPrev.HousholdID = MaxTab.HousholdID
AND TabPrev.Data=MaxTab.PrevData)
INNER JOIN Tab TabCurr
ON TabCurr.HousholdID = MaxTab.HousholdID
AND TabCurr.Data=MaxTab.CurrData
Depending on what you are after, you could also restrict the result to the current month:
WHERE
DateSerial(Year(CurrData), Month(CurrData), 1)=
DateSerial(Year(DATE()), Month(DATE()), 1)
This way, if you missed a check for a particular household, it won't show.
Or you might be interested in showing the last month present in the table (which can be different from the current month):
WHERE
DateSerial(Year(CurrData), Month(CurrData), 1)=
(SELECT MAX(DateSerial(Year(Data), Month(Data), 1))
FROM Tab)
(Here I am taking into consideration the fact that checks might happen on different days.)
I think the best approach is to use a correlated subquery to get the previous date and join back to the original table. This ensures that you get the previous record, even if there is more or less than a 1 month lag.
So the right query looks like:
select t.*, tprev.date, tprev.meter
from (select t.*,
(select top 1 t2.date from t t2 where t2.HousholdID = t.HousholdID and t2.date < t.date order by t2.date desc
) as prevDate
from t
) as t inner join
t as tprev
on tprev.HousholdID = t.HousholdID and tprev.date = t.prevDate
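Here is a small runnable sketch of the same idea on SQLite via Python's sqlite3 module. SQLite has no TOP, so MAX() picks the previous date instead. The `readings` table, its column names, and the sample rows (including a skipped month for household 0) are assumptions made for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE readings (household_id INTEGER, reading_date TEXT, meter INTEGER);
    INSERT INTO readings VALUES
        (0, '2013-01-01', 100), (1, '2013-01-01', 130),
        (0, '2013-02-01', 120), (1, '2013-02-01', 140),
        (0, '2013-04-01', 150);   -- household 0 skipped March
""")

# For each reading, the correlated subquery finds the latest earlier date
# of the same household; joining on that date fetches the previous meter.
rows = conn.execute("""
    SELECT cur.household_id,
           cur.reading_date,
           cur.meter - prev.meter AS consumption
    FROM readings cur
    JOIN readings prev
      ON prev.household_id = cur.household_id
     AND prev.reading_date = (SELECT MAX(r.reading_date)
                              FROM readings r
                              WHERE r.household_id = cur.household_id
                                AND r.reading_date < cur.reading_date)
    ORDER BY cur.reading_date, cur.household_id
""").fetchall()

print(rows)
# [(0, '2013-02-01', 20), (1, '2013-02-01', 10), (0, '2013-04-01', 30)]
```

The April row shows that a missed month does no harm: the delta is taken against whatever reading came before.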
In an environment such as the one you describe, it is very important not to make assumptions about the frequency of reading the meter. Although they may be read on average once per month, there will always be exceptions.
Testing with the following data:
HousholdID Date Meter
0 01/12/2012 100
1 01/12/2012 130
0 01/01/2013 120
1 01/01/2013 140
0 01/02/2013 120
1 01/02/2013 140
The following query:
SELECT a.housholdid,
a.date,
b.date,
a.meter,
b.meter,
a.meter - b.meter AS Consumption
FROM (SELECT *
FROM water
WHERE Month([date]) = Month(Date())
AND Year([date])=year(Date())) a
LEFT JOIN (SELECT *
FROM water
WHERE DateSerial(Year([date]),Month([date]),Day([date]))
=DateSerial(Year(Date()),Month(Date())-1,Day([date])) ) b
ON a.housholdid = b.housholdid
The above query selects the records for this month, Month([date]) = Month(Date()), and compares them to the records for last month (the DateSerial expression with Month(Date()) - 1).
Please do not use Date as a field name; it is a reserved word in Access.
Returns the following result.
housholdid a.date b.date a.meter b.meter Consumption
0 01/02/2013 01/01/2013 120 100 20
1 01/02/2013 01/01/2013 140 130 10
Try
select t.householdID
, max(s.theDate) as billingMonth
, max(s.meter)-max(t.meter) as waterUsed
from myTbl t join (
select householdID, max(theDate) as theDate, max(meter) as meter
from myTbl
group by householdID ) s
on t.householdID = s.householdID and t.theDate <> s.theDate
group by t.householdID
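This works in SQL; I'm not sure about Access. Note that it relies on the meter reading never decreasing, because it compares maximum values.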
You can use the LAG() function in certain SQL dialects. I found this to be much faster and easier to read than joins.
Source: http://blog.jooq.org/2015/05/12/use-this-neat-window-function-trick-to-calculate-time-differences-in-a-time-series/
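For illustration, a minimal LAG() sketch on SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or newer). The `readings` table and its rows are made up for the example, and whether this is available to you depends on your database supporting window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE readings (household_id INTEGER, reading_date TEXT, meter INTEGER);
    INSERT INTO readings VALUES
        (0, '2013-01-01', 100), (1, '2013-01-01', 130),
        (0, '2013-02-01', 120), (1, '2013-02-01', 140);
""")

# LAG(meter) returns the previous row's meter within the same household,
# ordered by date; the first reading has no predecessor, so it yields NULL.
rows = conn.execute("""
    SELECT household_id,
           reading_date,
           meter - LAG(meter) OVER (PARTITION BY household_id
                                    ORDER BY reading_date) AS consumption
    FROM readings
    ORDER BY household_id, reading_date
""").fetchall()

print(rows)
# [(0, '2013-01-01', None), (0, '2013-02-01', 20),
#  (1, '2013-01-01', None), (1, '2013-02-01', 10)]
```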