Total Count of Active Employees by Date - sql

I have in the past written queries that give me counts by date (hires, terminations, etc...) as follows:
SELECT per.date_start AS "Date",
COUNT(peo.EMPLOYEE_NUMBER) AS "Hires"
FROM hr.per_all_people_f peo,
hr.per_periods_of_service per
WHERE per.date_start BETWEEN peo.effective_start_date AND peo.EFFECTIVE_END_DATE
AND per.date_start BETWEEN :PerStart AND :PerEnd
AND per.person_id = peo.person_id
GROUP BY per.date_start
I am now looking to create a count of active employees by date, but I am not sure how to break the query out by date, because I use a range to determine who is active:
SELECT COUNT(peo.EMPLOYEE_NUMBER) AS "CT"
FROM hr.per_all_people_f peo
WHERE peo.current_employee_flag = 'Y'
and TRUNC(sysdate) BETWEEN peo.effective_start_date AND peo.EFFECTIVE_END_DATE

Here is a simple way to get started. This works for all the effective and end dates in your data:
select thedate,
SUM(num) over (order by thedate) as numActives
from ((select effective_start_date as thedate, 1 as num from hr.per_periods_of_service) union all
(select effective_end_date as thedate, -1 as num from hr.per_periods_of_service)
) dates
It works by adding one person for each start and subtracting one for each end (via num) and then taking a cumulative sum. The result might contain duplicate dates, so you might also aggregate to eliminate them:
select thedate, max(numActives)
from (select thedate,
SUM(num) over (order by thedate) as numActives
from ((select effective_start_date as thedate, 1 as num from hr.per_periods_of_service) union all
(select effective_end_date as thedate, -1 as num from hr.per_periods_of_service)
) dates
) t
group by thedate;
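The running-sum idea can be tried on a toy table. The following is a minimal sketch using Python's sqlite3 module (assuming a bundled SQLite of 3.25 or later for window functions); the table `service`, its columns and the sample rows are hypothetical, and subtracting on the day after the end date is an assumption chosen so that the end date itself still counts as active.

```python
import sqlite3

# Hypothetical sample data: one row per period of service.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE service (person_id INTEGER, start_date TEXT, end_date TEXT);
    INSERT INTO service VALUES
        (1, '2012-04-01', '2012-04-10'),
        (2, '2012-04-05', '2012-04-20'),
        (3, '2012-04-05', NULL);
""")

rows = conn.execute("""
    SELECT thedate, MAX(num_actives) AS num_actives
    FROM (
        SELECT thedate,
               SUM(num) OVER (ORDER BY thedate) AS num_actives
        FROM (
            -- +1 on the day a person starts
            SELECT start_date AS thedate, 1 AS num FROM service
            UNION ALL
            -- -1 on the day after the end date (assumption: the end date is
            -- still an active day); open-ended rows are skipped
            SELECT date(end_date, '+1 day'), -1 FROM service
            WHERE end_date IS NOT NULL
        )
    )
    GROUP BY thedate
    ORDER BY thedate
""").fetchall()

for thedate, num_actives in rows:
    print(thedate, num_actives)
```

Only dates on which something changes appear in the output; the count holds steady between them.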
If you really want all dates, then it is best to start with a calendar table, and use a simple variation on your original query:
select c.thedate, count(pos.person_id) as NumActives
from calendar c left outer join
hr.per_periods_of_service pos
on c.thedate between pos.effective_start_date and pos.effective_end_date
group by c.thedate;
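One detail worth checking in a calendar join is which expression gets counted. The sketch below, again in Python's sqlite3 with hypothetical `calendar` and `service` tables, puts COUNT(*) next to a count of a column from the joined table: on a date with no matching row the outer join still yields one row, so COUNT(*) reports 1 while the column count reports 0.

```python
import sqlite3

# Hypothetical four-day calendar and two periods of service.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE calendar (thedate TEXT PRIMARY KEY);
    INSERT INTO calendar VALUES
        ('2012-04-01'), ('2012-04-02'), ('2012-04-03'), ('2012-04-04');
    CREATE TABLE service (person_id INTEGER, start_date TEXT, end_date TEXT);
    INSERT INTO service VALUES
        (1, '2012-04-01', '2012-04-02'),
        (2, '2012-04-02', '2012-04-03');
""")

rows = conn.execute("""
    SELECT c.thedate,
           COUNT(*)           AS count_star,    -- counts the NULL-extended row too
           COUNT(s.person_id) AS count_person   -- ignores NULLs from the outer join
    FROM calendar c
    LEFT JOIN service s
           ON c.thedate BETWEEN s.start_date AND s.end_date
    GROUP BY c.thedate
    ORDER BY c.thedate
""").fetchall()

for row in rows:
    print(row)
```

On 2012-04-04 nobody is active, and only the second count shows that.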

If you want to count all employees who were active during the entire input date range:
SELECT COUNT(peo.EMPLOYEE_NUMBER) AS "CT"
FROM hr.per_all_people_f peo
WHERE peo.EFFECTIVE_START_DATE <= :StartDate
AND (peo.EFFECTIVE_END_DATE IS NULL OR peo.EFFECTIVE_END_DATE >= :EndDate)

Here is my example, based on Gordon Linoff's answer with a small modification: in the subtracting subquery, every record came back with -1 in NUM, even when its end date was NULL.
use AdventureWorksDW2012 -- selects the database to work with in MS SSMS;
-- this statement may not work on other platforms
select
t.thedate
,max(t.numActives) AS "Total Active Employees"
from (
select
dates.thedate
,SUM(dates.num) over (order by dates.thedate) as numActives
from
(
(
select
StartDate as thedate
,1 as num
from DimEmployee
)
union all
(
select
EndDate as thedate
,-1 as num
from DimEmployee
where EndDate IS NOT NULL
)
) AS dates
) AS t
group by thedate
ORDER BY thedate
This worked for me; I hope it helps somebody.

I was able to get the results I was looking for with the following:
--Active Team Members by Date
SELECT "a_date",
COUNT(peo.EMPLOYEE_NUMBER) AS "CT"
FROM hr.per_all_people_f peo,
(SELECT DATE '2012-04-01'-1 + LEVEL AS "a_date"
FROM dual
CONNECT BY LEVEL <= DATE '2012-04-30'+2 - DATE '2012-04-01'-1
)
WHERE peo.current_employee_flag = 'Y'
AND "a_date" BETWEEN peo.effective_start_date AND peo.EFFECTIVE_END_DATE
GROUP BY "a_date"
ORDER BY "a_date"

Related

SQL Union as Subquery to create Date Ranges from Start Date

I have three tables, each of which has a date column (the date column is an INT field and needs to stay that way). I need a UNION across all three tables so that I get the list of unique dates in ascending order like this:
20040602
20051215
20060628
20100224
20100228
20100422
20100512
20100615
Then I need to add a column to the result of the query where I subtract one from each date and place it one row above as the end date. Basically I need to generate the end date from the start date somehow and this is what I got so far (not working):
With Query1 As (
Select date_one As StartDate
From table_one
Union
Select date_two As StartDate
From table_two
Union
Select date_three e As StartDate
From table_three
Order By Date Asc
)
Select Query1.StartDate - 1 As EndDate
From Query1
Thanks a lot for your help!
Building on your existing union cte, we can use lead() in the outer query to get the start_date of the next record, and subtract 1 day from it.
with q as (
select date_one start_date from table_one
union select date_two from table_two
union select date_three from table_three
)
select
start_date,
dateadd(day, -1, lead(start_date) over(order by start_date)) end_date
from q
order by start_date
If the datatype of the original columns is numeric, then you need to do some casting before applying date functions:
with q as (
select cast(cast(date_one as varchar(8)) as date) start_date from table_one
union select cast(cast(date_two as varchar(8)) as date) from table_two
union select cast(cast(date_three as varchar(8)) as date) from table_three
)
select
start_date,
dateadd(day, -1, lead(start_date) over(order by start_date)) end_date
from q
order by start_date
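The reason for converting before subtracting is that plain integer arithmetic on YYYYMMDD values breaks at month boundaries (20100301 - 1 gives 20100300, which is not a date). Below is a minimal sketch of the lead() approach in Python's sqlite3, assuming SQLite 3.25 or later; the sample values are taken from the question, while the `iso` CTE and the string-slicing conversion are assumptions made for SQLite and are not part of the answer's SQL Server syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_one   (date_one   INTEGER);
    CREATE TABLE table_two   (date_two   INTEGER);
    CREATE TABLE table_three (date_three INTEGER);
    INSERT INTO table_one   VALUES (20040602), (20100224);
    INSERT INTO table_two   VALUES (20051215), (20100224);
    INSERT INTO table_three VALUES (20100228);
""")

rows = conn.execute("""
    WITH q AS (
        -- UNION (not UNION ALL) removes the duplicate 20100224
        SELECT date_one AS start_date FROM table_one
        UNION SELECT date_two FROM table_two
        UNION SELECT date_three FROM table_three
    ),
    iso AS (
        -- turn the INT 20040602 into the text '2004-06-02' for date arithmetic
        SELECT start_date,
               substr(start_date, 1, 4) || '-' ||
               substr(start_date, 5, 2) || '-' ||
               substr(start_date, 7, 2) AS d
        FROM q
    )
    SELECT start_date,
           -- next start date minus one day, converted back to an INT
           CAST(strftime('%Y%m%d',
                         LEAD(d) OVER (ORDER BY start_date),
                         '-1 day') AS INTEGER) AS end_date
    FROM iso
    ORDER BY start_date
""").fetchall()

for row in rows:
    print(row)
```

The last row has no following start date, so its end date comes back as NULL (None in Python).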

Finding the commencement date of a new project

Interested in a challenging SQL problem? Read on:
For the data set below, I'm trying to find a logic which identifies the commencement date of a new project for each employee.
Data Set
The logic to identify commencement date of new project is that:
An employee will not have any date record prior to the present one in a 14 day time frame.
Project windows only last 14 days after the commencement. The first record falling outside such a window will be counted as the start of the next project.
What is needed
Both Redshift/ Postgres solutions accepted.
Please note Redshift doesn't support recursive CTEs or RANGE keyword in window frame.
Thanks for reading.
For Postgresql, including the CTE (DataSet) for the dataset, here you go:
WITH RECURSIVE TimeLine(Employee, ProjectID, ProjectStartDate, Date, DateRank) AS (
SELECT Employee, 1, Date, Date, DateRank
FROM DataSetWithRank
WHERE DateRank = 1
UNION ALL
SELECT T.Employee,
T.ProjectID + CASE When D.Date >= T.ProjectStartDate+14 THEN 1 Else 0 END,
CASE When D.Date >= T.ProjectStartDate+14 THEN D.Date Else T.ProjectStartDate END,
D.Date, D.DateRank
FROM TimeLine T
JOIN DataSetWithRank D ON D.Employee = T.Employee AND D.DateRank = T.DateRank + 1
), DataSet(Employee,Date) AS (
SELECT UNNEST(ARRAY['Employee1','Employee1','Employee1','Employee1','Employee1','Employee1','Employee1','Employee1','Employee1','Employee1','Employee1','Employee1','Employee1','Employee1','Employee1']),
UNNEST(ARRAY['2018-01-01','2018-01-03','2018-01-05','2018-01-08','2018-01-11','2018-01-13','2018-01-14','2018-01-16','2018-01-18','2018-01-21','2018-01-22','2018-01-24','2018-01-25','2018-01-27','2018-01-29']::date[])
UNION
SELECT UNNEST(ARRAY['Employee2','Employee2','Employee2','Employee2','Employee2','Employee2','Employee2','Employee2','Employee2','Employee2','Employee2','Employee2','Employee2','Employee2','Employee2']),
UNNEST(ARRAY['2018-01-03','2018-01-05','2018-01-07','2018-01-10','2018-01-13','2018-01-15','2018-01-16','2018-01-18','2018-01-20','2018-01-23','2018-01-24','2018-01-26','2018-01-27','2018-01-29','2018-01-31']::date[])
), DataSetWithRank AS (
SELECT *, DENSE_RANK() OVER (PARTITION BY Employee ORDER BY Date) AS DateRank
FROM DataSet
)
SELECT Employee,
'Project ' || ProjectID AS "Project #",
Date,
DENSE_RANK() OVER (PARTITION BY Employee, ProjectID ORDER BY Date) AS Rank,
CASE WHEN Date = ProjectStartDate THEN 'Y' ELSE NULL END AS Is_New
FROM TimeLine
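For checking results outside the database, the same 14-day rule can be written as a plain loop. This is a sketch in Python, not a Redshift solution; the function name `project_starts` and its `window_days` parameter are hypothetical, and the dates are the Employee1 values from the CTE above.

```python
from datetime import date, timedelta

def project_starts(dates, window_days=14):
    """Return the dates that open a new project window.

    A date opens a new project when it falls on or after the current
    project's start plus window_days, the same test the recursive CTE uses.
    """
    starts = []
    current_start = None
    for d in sorted(set(dates)):
        if current_start is None or d >= current_start + timedelta(days=window_days):
            current_start = d
            starts.append(d)
    return starts

# Employee1's dates from the sample data set (all in January 2018).
employee1 = [date(2018, 1, day)
             for day in (1, 3, 5, 8, 11, 13, 14, 16, 18, 21, 22, 24, 25, 27, 29)]
starts = project_starts(employee1)
print(starts)
```

Each date has to be compared with the start of the current window rather than with the previous row, which is why the SQL version carries ProjectStartDate forward through the recursion.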

How to fill missing dates between empty records?

I am trying to fill in the dates between existing records, but without success. I tried multiple selects and a join, but it seems I am missing the point. I would like to generate records for the missing dates so that I can build a chart from this block of code. At first I would like the dates filled in "manually"; later I will reorganise the code and swap that method for an argument.
Can someone help me with that expression?
SELECT
LOG_LAST AS "data",
SUM(run_cnt) AS "Number of runs"
FROM
dual l
LEFT OUTER JOIN "LOG_STAT" stat ON
stat."LOG_LAST" = l."CLASS"
WHERE
new_class = '$arg[klasa]'
--SELECT to_date(TRUNC (SYSDATE - ROWNUM), 'DD-MM-YYYY'),
--0
--FROM dual CONNECT BY ROWNUM < 366
GROUP BY
LOG_LAST
ORDER BY
LOG_LAST
//Edit:
LOG_LAST is just a date column (for example: 25.04.2018 15:44:21), run_cnt is a column holding a simple number, LOG_STAT is the table that contains LOG_LAST and run_cnt, and new_class is a column holding the name of the record. I would like to list records even when they do not exist. For example, I have records dated 24-09-2018, 23-09-2018, 20-09-2018 and 18-09-2018, and I would like the missing dates in a given period to be generated as well, even without names and run_cnt values.
Try filling the gaps with NVL (Oracle's counterpart to ISNULL):
SELECT
NVL(LOG_LAST, DATE '2018-01-01') AS data,
SUM(NVL(run_cnt,0)) AS "Number of runs"
FROM
dual l
LEFT OUTER JOIN "LOG_STAT" stat ON
stat."LOG_LAST" = l."CLASS"
WHERE
new_class = '$arg[klasa]'
--SELECT to_date(TRUNC (SYSDATE - ROWNUM), 'DD-MM-YYYY'),
--0
--FROM dual CONNECT BY ROWNUM < 366
GROUP BY
LOG_LAST
ORDER BY
LOG_LAST
What you want is more or less:
select d.day, sum(ls.run_cnt)
from all_dates d
left join log_stat ls on trunc(ls.log_last) = d.day
and ls.new_class = :klasa -- filter in the ON clause so days without matching rows are kept
group by d.day
order by d.day;
The all_dates table in the query above is supposed to contain all dates from the minimum klasa log_last date to the maximum klasa log_last date. You can generate these dates with a recursive query:
with ls as
(
select trunc(log_last) as day, sum(run_cnt) as total
from log_stat
where new_class = :klasa
group by trunc(log_last)
)
, all_dates(day) as
(
select min(day) from ls
union all
select day + 1 from all_dates where day < (select max(day) from ls)
)
select d.day, ls.total
from all_dates d
left join ls on ls.day = d.day
order by d.day;
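To see the gap filling in action, here is a minimal sketch of the same recursive pattern in Python's sqlite3. The sample rows are hypothetical, `date(...)` stands in for Oracle's `trunc(...)`, and the `COALESCE(..., 0)` is an addition so that generated days show 0 instead of NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE log_stat (log_last TEXT, run_cnt INTEGER, new_class TEXT);
    INSERT INTO log_stat VALUES
        ('2018-09-18 15:44:21', 2, 'A'),
        ('2018-09-20 09:00:00', 1, 'A'),
        ('2018-09-20 17:30:00', 4, 'A'),
        ('2018-09-21 08:00:00', 7, 'B');
""")

rows = conn.execute("""
    WITH RECURSIVE ls AS (
        -- one row per day that actually has data for the requested class
        SELECT date(log_last) AS day, SUM(run_cnt) AS total
        FROM log_stat
        WHERE new_class = :klasa
        GROUP BY date(log_last)
    ),
    all_dates(day) AS (
        -- walk from the first day to the last day, one day at a time
        SELECT MIN(day) FROM ls
        UNION ALL
        SELECT date(day, '+1 day') FROM all_dates
        WHERE day < (SELECT MAX(day) FROM ls)
    )
    SELECT d.day, COALESCE(ls.total, 0) AS total
    FROM all_dates d
    LEFT JOIN ls ON ls.day = d.day
    ORDER BY d.day
""", {"klasa": "A"}).fetchall()

for row in rows:
    print(row)
```

The row for 2018-09-19 does not exist in the table; it is produced by the recursive member and gets a total of 0.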
This is called data densification; see "Data Densification for Reporting" in the Oracle documentation. An example of data densification:
with ls as
(
select trunc(created) as day,object_type new_class, sum(1) as total
from user_objects
group by trunc(created),object_type
)
, all_dates(day) as
(
select min(day) from ls
union all
select day + 1 from all_dates where day < (select max(day) from ls)
)
select d.day, nvl(ls.total,0),new_class
from all_dates d
left join ls partition by (ls.new_class) on ls.day = d.day
order by d.day;

How can I sum values per day and then plot them on calendar from start date to last date

I have a table, part of which is given below. It contains multiple values (durations) per day. I need two things: 1) the sum of durations per day, and 2) those sums plotted on a calendar whose start date is First_Date from the table and whose end date is Last_Update from the table. Dates with no duration should show 0. I think it will be something like the code below, but I need help.
;WITH AllDates AS(
SELECT @Fromdate As TheDate
UNION ALL
SELECT TheDate + 1
FROM AllDates
WHERE TheDate + 1 <= @ToDate
)SELECT UserId,
TheDate,
COALESCE(
SUM(
-- When the game starts and ends in the same date
CASE WHEN DATEDIFF(DAY, GameStartTime, GameEndTime) = 0
Here is what I am looking for
Another way to generate the date range you are after would be something like this:
;WITH DateLimits AS
(
SELECT MIN(First_Date) FirstDate
,MAX(Last_Update) LastDate
FROM TableName
),
DateRange AS
(
SELECT TOP (SELECT DATEDIFF(DAY,FirstDate,LastDate ) + 1 FROM DateLimits)
DATEADD(DAY
,ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 -- offset 0 keeps FirstDate itself in the range
, (SELECT FirstDate FROM DateLimits)
) AS Dates
FROM master..spt_values a cross join master..spt_values b
)
SELECT * FROM DateRange --<-- you have the desired date range here
-- other query whatever you need.

SQL Server: Attempting to output a count with a date

I am trying to write a statement and am a bit puzzled about the best way to put it together. I am doing a UNION on a number of tables, and from the result I want to output a count per UserID within that day.
So I will have numerous tables union such as:
Order ID, USERID, DATE, Task Completed.
UNION
Order ID, USERID, DATE, Task Completed
etc
Above is the layout of each table; four tables with the same column names will be unioned together.
The output I want from the statement is a count of each USERID that occurred within the last 24 hours.
So output should be:
USERID--- COUNT OUTPUT-- DATE
I tried a WHERE clause, but the output is not exactly what I am after. Can anyone point me in the right direction, and is there an alternative to the union? Maybe a join would be a better alternative. Any help would be appreciated.
I will eventually then put this into a SSRS report, so it gets updated daily.
You can try this:
select USERID, count(*) as [COUNT], cast(DATE as date) as [DATE]
from
(select USERID, DATE From SomeTable1
union all
select USERID, DATE From SomeTable2
....
) t
where DATE <= GETDATE() AND DATE >= DATEADD(hh, -24, GETDATE())
group by USERID, cast(DATE as date)
First, you should use union all rather than union. Second, you need to aggregate and use count distinct to get what you want:
So, the query you want is something like:
select count(distinct userid)
from ((select date, userid
from table1
where date >= '2015-05-26'
) union all
(select date, userid
from table2
where date >= '2015-05-26'
) union all
(select date, userid
from table3
where date >= '2015-05-26'
)
) du
Note that this hardcodes the date. In SQL Server, you would do something like:
date >= cast(getdate() - 1 as date)
And in MySQL
date >= date_sub(curdate(), interval 1 day)
EDIT:
I read the question as wanting a single day. It is easy enough to extend to all days:
select cast(date as date) as dte, count(distinct userid)
from ((select date, userid
from table1
) union all
(select date, userid
from table2
) union all
(select date, userid
from table3
)
) du
group by cast(date as date)
order by dte;
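The difference between counting rows and counting distinct users is easy to see on a small sample. The sketch below uses Python's sqlite3 with two hypothetical tables named after the ones in the query above; `date(...)` plays the role of `cast(date as date)`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (userid INTEGER, date TEXT);
    CREATE TABLE table2 (userid INTEGER, date TEXT);
    INSERT INTO table1 VALUES
        (1, '2015-05-26 09:00:00'),
        (1, '2015-05-26 10:00:00'),
        (2, '2015-05-27 08:00:00');
    INSERT INTO table2 VALUES
        (1, '2015-05-26 09:00:00'),
        (3, '2015-05-26 11:00:00');
""")

rows = conn.execute("""
    SELECT date(du.date)             AS dte,
           COUNT(*)                  AS task_rows,       -- every unioned row
           COUNT(DISTINCT du.userid) AS distinct_users   -- each user once per day
    FROM (
        SELECT date, userid FROM table1
        UNION ALL
        SELECT date, userid FROM table2
    ) du
    GROUP BY date(du.date)
    ORDER BY dte
""").fetchall()

for row in rows:
    print(row)
```

User 1 has an identical row in both tables. UNION ALL keeps both copies, so 2015-05-26 shows 4 rows; a plain UNION would merge them and report 3, while the distinct user count is 2 either way.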
For even more readability, you could use a CTE:
;WITH cte_CTEName AS(
SELECT UserID, Date, [Task Completed] FROM Table1
UNION ALL
SELECT UserID, Date, [Task Completed] FROM Table2
etc
)
SELECT COUNT(UserID) AS [Count] FROM cte_CTEName
WHERE Date <= GETDATE() AND Date >= DATEADD(hh, -24, GETDATE())
I think this is what you are trying to achieve...
Select
UserID,
Date,
Count(1)
from
(Select *
from table1
Union All
Select *
from table2
Union All
Select *
from table3
Union All
Select *
from table4
) a
Group by
Userid,
Date