SQL Loop Count of people in program during specified duration - sql

I'm not sure if there should be a loop for this or what the easiest approach would be.
My data consists of a list of people participating in our program. They have various start and end dates, but the following query captures the number of people who participated on a specific date:
DECLARE @PopulationDate DATETIME = '2018-06-01 05:00:00';
select count(People)
FROM Program_Log
WHERE
START_TIME <= @PopulationDate
AND (END_TIME >= @PopulationDate OR END_TIME IS NULL)
Is there a way I can loop in different date values to get the number of program participants each day for an entire year?
Multiple years?

One simple way is to use a CTE to generate the dates and then a left join to bring in the data. For instance, the following gets the counts as of the first of the month for this year:
with dates as (
select cast('2018-01-01' as date) as dte
union all
select dateadd(month, 1, dte)
from dates
where dte < getdate()
)
select d.dte, count(pl.people)
from dates d left join
program_log pl
on pl.start_time <= d.dte and (pl.end_time >= d.dte or pl.end_time is null)
group by d.dte
order by d.dte;
Note that this will work best for a handful of dates. If you want more than 100, you need to add option (maxrecursion 0) to the end of the query.
Also, count(people) is highly suspicious. Perhaps you mean sum(people) or something similar.
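The same pattern extends from monthly to daily counts. Here is a minimal sketch that runs the query through Python's built-in sqlite3 module instead of SQL Server, so it can be tried without a server. The sample rows and the date range are made up for illustration, and SQLite's date(dte, '+1 day') stands in for dateadd(day, 1, dte):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE program_log (people TEXT, start_time TEXT, end_time TEXT);
    -- Illustrative rows only; a NULL end_time means "still enrolled".
    INSERT INTO program_log VALUES
        ('ann', '2018-01-01', '2018-01-03'),
        ('bob', '2018-01-02', NULL),
        ('cat', '2018-01-04', '2018-01-04');
""")

# Generate one row per day, then left join so days with no participants
# still appear with a count of 0.
daily_counts = conn.execute("""
    WITH RECURSIVE dates(dte) AS (
        SELECT :first_day
        UNION ALL
        SELECT date(dte, '+1 day') FROM dates WHERE dte < :last_day
    )
    SELECT d.dte, COUNT(pl.people)
    FROM dates d
    LEFT JOIN program_log pl
           ON pl.start_time <= d.dte
          AND (pl.end_time >= d.dte OR pl.end_time IS NULL)
    GROUP BY d.dte
    ORDER BY d.dte
""", {"first_day": "2018-01-01", "last_day": "2018-01-05"}).fetchall()

for day, participants in daily_counts:
    print(day, participants)
```

Widening first_day and last_day covers a full year or several years. SQLite has no default recursion cap comparable to SQL Server's 100, so the maxrecursion hint only matters when porting this back to T-SQL.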

Related

PL-SQL query to calculate customers per period from start and stop dates

I have a PL-SQL table with a structure as shown in the example below:
I have customers (customer_number) with insurance cover start and stop dates (cover_start_date and cover_stop_date). I also have dates of accidents for those customers (accident_date). These customers may have more than one row in the table if they have had more than one accident. They may also have no accidents. And they may also have a blank entry for the cover stop date if their cover is ongoing. Sorry I did not design the data format, but I am stuck with it.
I am looking to calculate the number of accidents (num_accidents) and number of customers (num_customers) in a given time period (period_start), and from that the number of accidents-per-customer (which will be easy once I've got those two pieces of information).
Any ideas on how to design a PL-SQL function to do this in a simple way? Ideally with the time periods not being fixed to monthly (for example, weekly or fortnightly too)? Ideally I will end up with a table like this shown below:
Many thanks for any pointers...
You seem to need a list of dates. You can generate one in the query and then use correlated subqueries to calculate the columns you want:
select d.*,
(select count(distinct customer_number)
from t
where t.cover_start_date <= d.dte and
(t.cover_stop_date > d.dte + interval '1' month or t.cover_stop_date is null)
) as num_customers,
(select count(*)
from t
where t.accident_date >= d.dte and
t.accident_date < d.dte + interval '1' month
) as accidents,
(select count(distinct customer_number)
from t
where t.accident_date >= d.dte and
t.accident_date < d.dte + interval '1' month
) as num_customers_with_accident
from (select date '2020-01-01' as dte from dual union all
select date '2020-02-01' as dte from dual union all
. . .
) d;
If you want to do arithmetic on the columns, you can use this as a subquery or CTE.

Best way to sum() values between two dates (where the second date is stored as a value in the next row) - postgres

I am a little bit stuck with an issue I'm having with a psql query. I know how I'd approach this using a loop etc, but have hit a wall in SQL as I'm no expert.
Imagine that there are 3 terms per school year. Each child may receive an allowance for lunches each month. I would like to SUM() the allowance for each month, from the start of the term up until the next term.
Where I am stuck is that the date of the next term is a value in the next row of data in the terms table.
I have something like this:
SELECT
terms."startDate",
COALESCE((
SELECT
SUM("lunch")
FROM
"allowance"
WHERE
TO_CHAR(terms."startDate", 'YYYY-MM') >= TO_CHAR("date", 'YYYY-MM')
AND
TO_CHAR(terms."startDate", 'YYYY-MM') < TO_CHAR(??? HELP ???, 'YYYY-MM')
),0) AS "lunchMoney"
FROM "schoolTerms" AS terms
...
Where I have put TO_CHAR(??? HELP ???, 'YYYY-MM') I would like to reference the start date of the child's next term. I have looked into using LEAD() but couldn't figure it out.
Any help would be greatly appreciated.
I think you want a join and aggregation:
select t.startdate, t.enddate, sum(a.lunch) as lunch_money
from schoolterms t
inner join allowance a on a.date >= t.startdate and a.date < t.enddate
group by t.startdate, t.enddate
This puts each allowance in the term it belongs to, and then aggregates by term. You might want a left join if there may be terms without any allowance.
Your current query gives no clue about what a "child" is. Presumably, that should be a column in allowance, that you might want to put in the select and group by clauses.
If you want to compute the end_date as the "next" start_date, then use lead():
select t.startdate, t.enddate, sum(a.lunch) as lunch_money
from (
select startdate,
lead(startdate) over(order by startdate) enddate
from schoolterms
) t
inner join allowance a
on a.date >= t.startdate
and (a.date < t.enddate or t.enddate is null)
group by t.startdate, t.enddate
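To see the lead() approach end to end, here is a small sketch run through Python's built-in sqlite3 module rather than Postgres (window functions need SQLite 3.25 or later). The table contents are invented for illustration, and it uses the left join variant with COALESCE so that a term without any allowance shows 0:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE schoolterms (startdate TEXT);
    CREATE TABLE allowance (date TEXT, lunch INTEGER);
    -- Illustrative data: three terms, no allowance paid in the second one.
    INSERT INTO schoolterms VALUES ('2020-01-01'), ('2020-04-01'), ('2020-09-01');
    INSERT INTO allowance VALUES
        ('2020-01-15', 10),
        ('2020-02-15', 10),
        ('2020-10-15', 5);
""")

# lead() turns the next row's startdate into this row's enddate;
# the last term has no successor, so its enddate is NULL (open-ended).
lunch_by_term = conn.execute("""
    SELECT t.startdate, t.enddate, COALESCE(SUM(a.lunch), 0) AS lunch_money
    FROM (
        SELECT startdate,
               LEAD(startdate) OVER (ORDER BY startdate) AS enddate
        FROM schoolterms
    ) t
    LEFT JOIN allowance a
           ON a.date >= t.startdate
          AND (a.date < t.enddate OR t.enddate IS NULL)
    GROUP BY t.startdate, t.enddate
    ORDER BY t.startdate
""").fetchall()

for startdate, enddate, lunch_money in lunch_by_term:
    print(startdate, enddate, lunch_money)
```

The half-open comparison (>= start, < end) keeps an allowance dated exactly on a term boundary from being counted in two terms.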

SQL query count by week with criteria

I have to write a trend report for the amount of standing scaffolds in the database by week.
I can get a count of scaffolds erected by week in the example below and also dismantles using the same query but this isn't what I need.
SELECT COUNT(scaffID) Erected, WeekStart
FROM
(
SELECT ScaffID,
dateadd(week, datediff(day,0,Erected) / 7, 0) AS WeekStart
FROM Scaffolds
) o
GROUP BY WeekStart
I can get my standing scaffolds from this by putting in a date, but I want the standing scaffolds on every Friday, say.
Declare @staticDate As DateTime
Set @staticDate = '2/1/2015'
Select COUNT(scaffID) As StandingScaffolds
from RequestInfo
Where ( ErectDate<= @staticDate )
And ( DismantleDate>= @staticDate
or DismantleDate Is NULL
)
This is driving me crazy so any help would be extremely appreciated.
Phil
This should give you something to start with:
DECLARE
@startDate date = '2015-01-01',
@endDate date = '2015-01-28';
with myWeeks (myWeek) AS (
select DATEPART(WEEK,@startDate) myWeek
UNION ALL
select myWeek + 1 from myWeeks
where
myWeek < DATEPART(WEEK,@endDate)
)
select
w.myWeek,
COUNT(s.Erected) standingScaffolds
from myWeeks w
left join Scaffolds s on
w.myWeek between DATEPART(WEEK,s.Erected) and DATEPART(WEEK,s.Dismantled)
group by w.myWeek
You might want to generate a date table for this (=table with one row for each day). You can then join that with this table to get the calculations quite easily.
select d.date, count(s.scaffID)
from date d, scaffolds s
where s.erected <= d.date and
(s.dismantled >= d.date or s.dismantled is NULL) and
d.date >= @startDate and
d.date <= @endDate
group by d.date
Hopefully this is ok, can't test right now.
The date table is quite useful in other cases too, for example you can have local holidays there.
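For the "every Friday" requirement, the date table can simply be filtered by weekday. Below is a rough sketch using Python's built-in sqlite3 module in place of SQL Server. The calendar table name, its day column, and the scaffold rows are assumptions made up for the example, and strftime('%w', ...) is SQLite's weekday function (0 = Sunday, 5 = Friday):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE scaffolds (scaffID INTEGER, erected TEXT, dismantled TEXT);
    -- Illustrative rows; a NULL dismantled date means "still standing".
    INSERT INTO scaffolds VALUES
        (1, '2015-01-01', '2015-01-10'),
        (2, '2015-01-05', NULL),
        (3, '2015-01-12', '2015-01-20');

    -- Hypothetical date table: one row per calendar day in January 2015.
    CREATE TABLE calendar (day TEXT PRIMARY KEY);
    WITH RECURSIVE d(day) AS (
        SELECT '2015-01-01'
        UNION ALL
        SELECT date(day, '+1 day') FROM d WHERE day < '2015-01-31'
    )
    INSERT INTO calendar SELECT day FROM d;
""")

# Keep only Fridays, and left join so a Friday with no standing
# scaffolds would still be reported with a count of 0.
standing_on_fridays = conn.execute("""
    SELECT c.day, COUNT(s.scaffID) AS standing
    FROM calendar c
    LEFT JOIN scaffolds s
           ON s.erected <= c.day
          AND (s.dismantled >= c.day OR s.dismantled IS NULL)
    WHERE strftime('%w', c.day) = '5'
    GROUP BY c.day
    ORDER BY c.day
""").fetchall()

for day, standing in standing_on_fridays:
    print(day, standing)
```

Joining on full dates rather than week numbers also avoids problems when the range crosses a year boundary.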

How to count records for each day in a range (including days without records)

I'm trying to refine this question a little since I didn't really ask correctly last time. I am essentially doing this query:
Select count(orders)
From Orders_Table
Where Order_Open_Date<=To_Date('##/##/####','MM/DD/YYYY')
and Order_Close_Date>=To_Date('##/##/####','MM/DD/YYYY')
Where ##/##/#### is the same day. In essence this query is designed to find the number of 'open' orders on any given day. The only problem is I'm wanting to do this for each day of a year or more. I think if I knew how to define the ##/##/#### as a variable and then grouped the count by that variable then I could get this to work but I'm not sure how to do that-or there may be another way as well. I am currently using Oracle SQL on SQL developer. Thanks for any input.
You could use a "row generator" technique like this (edited for Hogan's comments):
Select RG.Day,
count(orders)
From Orders_Table,
(SELECT trunc(SYSDATE) - ROWNUM as Day
FROM (SELECT 1 dummy FROM dual)
CONNECT BY LEVEL <= 365
) RG
Where RG.Day <=To_Date('##/##/####','MM/DD/YYYY')
and RG.Day >=To_Date('##/##/####','MM/DD/YYYY')
and Order_Open_Date(+) <= RG.Day
and Order_Close_Date(+) >= RG.Day - 1
Group by RG.Day
Order by RG.Day
This should list each day of the previous year with the corresponding number of orders
Let's say you had a table datelist with a column adate
aDate
1/1/2012
1/2/2012
1/3/2012
Now you join that to your table
Select *
From Orders_Table
join datelist on Order_Open_Date<=adate and Order_Close_Date>=adate
This gives you a list of all the orders you care about, now you group by and count
Select aDate, count(*)
From Orders_Table
join datelist on Order_Open_Date<=adate and Order_Close_Date>=adate
group by adate
If you want to pass in parameters, then just generate the dates with a recursive CTE
with datelist as
(
select @startdate as adate
UNION ALL
select adate + 1
from datelist
where (adate + 1) <= @lastdate
)
Select aDate, count(*)
From Orders_Table
join datelist on Order_Open_Date<=adate and Order_Close_Date>=adate
group by adate
NOTE: I don't have an Oracle DB to test on so I might have some syntax wrong for this platform, but you get the idea.
NOTE2: If you want all dates listed with 0 for those that have nothing use this as your select statement:
Select aDate, count(Order_Open_Date)
From datelist
left join Orders_Table on Order_Open_Date<=adate and Order_Close_Date>=adate
group by adate
If you want only one day you can query using TRUNC like this
select count(orders)
From orders_table
where trunc(order_open_date) = to_date('14/05/2012','dd/mm/yyyy')

SQL for counting events by date

I feel like I've seen this question asked before, but neither the SO search nor google is helping me... maybe I just don't know how to phrase the question. I need to count the number of events (in this case, logins) per day over a given time span so that I can make a graph of website usage. The query I have so far is this:
select
count(userid) as numlogins,
count(distinct userid) as numusers,
convert(varchar, entryts, 101) as date
from
usagelog
group by
convert(varchar, entryts, 101)
This does most of what I need (I get a row per date as the output containing the total number of logins and the number of unique users on that date). The problem is that if no one logs in on a given date, there will not be a row in the dataset for that date. I want it to add in rows indicating zero logins for those dates. There are two approaches I can think of for solving this, and neither strikes me as very elegant.
Add a column to the result set that lists the number of days between the start of the period and the date of the current row. When I'm building my chart output, I'll keep track of this value and if the next row is not equal to the current row plus one, insert zeros into the chart for each of the missing days.
Create a "date" table that has all the dates in the period of interest and outer join against it. Sadly, the system I'm working on already has a table for this purpose that contains a row for every date far into the future... I don't like that, and I'd prefer to avoid using it, especially since that table is intended for another module of the system and would thus introduce a dependency on what I'm developing currently.
Any better solutions or hints at better search terms for google? Thanks.
Frankly, I'd do this programmatically when building the final output. You're essentially trying to read something from the database which is not there (data for days that have no data). SQL isn't really meant for that sort of thing.
If you really want to do that, though, a "date" table seems your best option. To make it a bit nicer, you could generate it on the fly, using e.g. your DB's date functions and a derived table.
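As a rough illustration of the "do it programmatically" route, here is a small Python sketch that pads the query result with zero rows before charting. The function name fill_missing_days and the (day, numlogins, numusers) row shape are assumptions for the example, not anything from the original query's driver:

```python
from datetime import date, timedelta

def fill_missing_days(rows, first_day, last_day):
    """Return one (day, numlogins, numusers) tuple per day in the range,
    using zeros for days that are absent from the query result."""
    by_day = {day: (numlogins, numusers) for day, numlogins, numusers in rows}
    filled = []
    day = first_day
    while day <= last_day:
        numlogins, numusers = by_day.get(day, (0, 0))
        filled.append((day, numlogins, numusers))
        day += timedelta(days=1)
    return filled

# Illustrative query result: nobody logged in on March 2, 3 or 5.
rows = [(date(2009, 3, 1), 5, 3), (date(2009, 3, 4), 2, 2)]
filled = fill_missing_days(rows, date(2009, 3, 1), date(2009, 3, 5))

for day, numlogins, numusers in filled:
    print(day.isoformat(), numlogins, numusers)
```

This keeps the SQL unchanged and needs no date table, at the cost of doing the gap filling in every consumer of the data.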
I had to do exactly the same thing recently. This is how I did it in T-SQL (YMMV on speed, but I've found it performant enough over a couple of million rows of event data):
DECLARE @DaysTable TABLE ( [Year] INT, [Day] INT )
DECLARE @StartDate DATETIME
SET @StartDate = whatever
WHILE (@StartDate <= GETDATE())
BEGIN
INSERT INTO @DaysTable ( [Year], [Day] )
SELECT DATEPART(YEAR, @StartDate), DATEPART(DAYOFYEAR, @StartDate)
SELECT @StartDate = DATEADD(DAY, 1, @StartDate)
END
-- This gives me a table of all days since whenever
-- (you could set @StartDate to the minimum date of your usage log)
SELECT days.Year, days.Day, events.NumEvents
FROM @DaysTable AS days
LEFT JOIN (
SELECT
COUNT(*) AS NumEvents,
DATEPART(YEAR, LogDate) AS [Year],
DATEPART(DAYOFYEAR, LogDate) AS [Day]
FROM LogData
GROUP BY
DATEPART(YEAR, LogDate),
DATEPART(DAYOFYEAR, LogDate)
) AS events ON days.Year = events.Year AND days.Day = events.Day
Create a memory table (a table variable) where you insert your date ranges, then outer join the logins table against it. Group by your start date, then you can perform your aggregations and calculations.
The strategy I normally use is to UNION with the opposite of the query, generally a query that retrieves data for rows that don't exist.
If I wanted to get the average mark for a course, but some courses weren't taken by any students, I'd need to UNION with those not taken by anyone to display a row for every class:
SELECT AVG(mark), course FROM `marks` GROUP BY course
UNION
SELECT NULL, course FROM courses WHERE course NOT IN
(SELECT course FROM marks)
Your query will be more complex but the same principle should apply. You may indeed need a table of dates for your second query
Option 1
You can create a temp table, insert the dates within the range, and do a left outer join with the usagelog
Option 2
You can programmatically insert the missing dates while evaluating the result set to produce the final output
WITH q(n) AS
(
SELECT 0
UNION ALL
SELECT n + 1
FROM q
WHERE n < 99
),
qq(n) AS
(
SELECT 0
UNION ALL
SELECT n + 1
FROM q
WHERE n < 99
),
dates AS
(
SELECT q.n * 100 + qq.n AS ndate
FROM q, qq
)
SELECT COUNT(userid) as numlogins,
COUNT(DISTINCT userid) as numusers,
CAST('2000-01-01' AS DATETIME) + ndate as date
FROM dates
LEFT JOIN
usagelog
ON entryts >= CAST('2000-01-01' AS DATETIME) + ndate
AND entryts < CAST('2000-01-01' AS DATETIME) + ndate + 1
GROUP BY
ndate
This will select up to 10,000 dates constructed on the fly, which should be enough for about 27 years.
SQL Server limits a CTE to 100 recursion levels by default (this can be changed with the MAXRECURSION query hint), which is why each inner query returns at most 100 rows.
If you need more than 10,000, just add a third CTE qqq(n) and cross-join with it in dates.