SQL - Count number of items active within a date range

I have a dataset of Resources, Projects, StartDate and EndDate.
Each Resource can be utilised by multiple projects.
I want to get a count of the number of projects that are using a resource in each quarter.
So if project1 starts in Q1 of a particular year and ends in Q3 that year, and project2 starts in Q2 and ends in Q3, I want to get a count of 2 projects for Q2, since during Q2 both project1 and project2 were active.
Here is my dataset:
create table Projects
(Resource_Name varchar(20)
,Project_Name varchar(20)
,StartDate varchar(20)
,EndDate varchar(20)
)
insert into Projects values('Resource 1','Project A','15/01/2013','1/11/2014')
insert into Projects values('Resource 1','Project B','1/03/2013','1/09/2016')
insert into Projects values('Resource 1','Project C','1/04/2013','1/09/2015')
insert into Projects values('Resource 1','Project D','1/06/2013','1/03/2016')
insert into Projects values('Resource 1','Project E','15/01/2013','1/09/2015')
insert into Projects values('Resource 1','Project F','3/06/2013','1/11/2015')
And here is the result I'm looking for:
| Resource Name | Year | Quarter | Active Projects |
|---------------|------|---------|-----------------|
| Resource 1 | 2013 | 1 | 2 |
| Resource 1 | 2013 | 2 | 6 |

Using tally table:
Using the dates from Projects, generate a list of all quarters with their start and end dates; in this example that is CteQuarter(sd, ed). After that, JOIN the Projects table to CteQuarter on overlapping dates: a project overlaps a quarter when its StartDate is on or before the quarter's end and its EndDate is on or after the quarter's start. Finally, GROUP BY the year and quarter parts of the quarter's start date.
WITH CteYear(yr) AS(
SELECT number
FROM master..spt_values
WHERE
type = 'P'
AND number >= (SELECT MIN(YEAR(CONVERT(DATE, StartDate, 103))) FROM Projects)
AND number <= (SELECT MAX(YEAR(CONVERT(DATE, EndDate, 103))) FROM Projects)
),
CteQuarter(sd, ed) AS(
SELECT
DATEADD(QUARTER, q.n - 1, DATEADD(YEAR, cy.yr - 1900, 0)),
DATEADD(DAY, -1, DATEADD(QUARTER, q.n, DATEADD(YEAR, cy.yr - 1900, 0)))
FROM CteYear AS cy
CROSS JOIN(
SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
) AS q(n)
)
SELECT
p.Resource_Name,
[Year] = DATEPART(YEAR, q.sd),
[Quarter] = DATEPART(QUARTER, q.sd),
[Active Projects] = COUNT(*)
FROM Projects p
INNER JOIN CteQuarter q
ON CONVERT(DATE, StartDate, 103) <= q.ed
AND CONVERT(DATE, EndDate, 103) >= q.sd
GROUP BY
p.Resource_Name,
DATEPART(YEAR, q.sd),
DATEPART(QUARTER, q.sd)
ORDER BY
p.Resource_Name,
DATEPART(YEAR, q.sd),
DATEPART(QUARTER, q.sd)
Notes:
Here is a great way to check for overlapping dates.
Some common date routines
RESULT:
| Resource_Name | Year | Quarter | Active Projects |
|---------------|------|---------|-----------------|
| Resource 1 | 2013 | 1 | 3 |
| Resource 1 | 2013 | 2 | 6 |
| Resource 1 | 2013 | 3 | 6 |
| Resource 1 | 2013 | 4 | 6 |
| Resource 1 | 2014 | 1 | 6 |
| Resource 1 | 2014 | 2 | 6 |
| Resource 1 | 2014 | 3 | 6 |
| Resource 1 | 2014 | 4 | 6 |
| Resource 1 | 2015 | 1 | 5 |
| Resource 1 | 2015 | 2 | 5 |
| Resource 1 | 2015 | 3 | 5 |
| Resource 1 | 2015 | 4 | 3 |
| Resource 1 | 2016 | 1 | 2 |
| Resource 1 | 2016 | 2 | 1 |
| Resource 1 | 2016 | 3 | 1 |
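As a cross-check of the result table, the same overlap rule can be sketched outside the database. The Python below is only an illustration, not part of the original answer; the helper names `quarter_bounds` and `active_per_quarter` are made up for this sketch. It applies the test "start <= quarter end and end >= quarter start" to the sample rows from the question.

```python
from collections import Counter
from datetime import date, datetime, timedelta

# Sample rows from the question: (resource, project, start, end), dates as dd/mm/yyyy.
ROWS = [
    ("Resource 1", "Project A", "15/01/2013", "1/11/2014"),
    ("Resource 1", "Project B", "1/03/2013", "1/09/2016"),
    ("Resource 1", "Project C", "1/04/2013", "1/09/2015"),
    ("Resource 1", "Project D", "1/06/2013", "1/03/2016"),
    ("Resource 1", "Project E", "15/01/2013", "1/09/2015"),
    ("Resource 1", "Project F", "3/06/2013", "1/11/2015"),
]


def parse(text):
    """Parse a dd/mm/yyyy string into a date."""
    return datetime.strptime(text, "%d/%m/%Y").date()


def quarter_bounds(year, quarter):
    """Return (first_day, last_day) of the given quarter."""
    first = date(year, 3 * quarter - 2, 1)
    if quarter == 4:
        next_first = date(year + 1, 1, 1)
    else:
        next_first = date(year, 3 * quarter + 1, 1)
    # Last day of the quarter = first day of the next quarter minus one day.
    return first, next_first - timedelta(days=1)


def active_per_quarter(rows):
    """Count projects overlapping each quarter, keyed by (resource, year, quarter)."""
    parsed = [(res, parse(s), parse(e)) for res, _, s, e in rows]
    first_year = min(s.year for _, s, _ in parsed)
    last_year = max(e.year for _, _, e in parsed)
    counts = Counter()
    for year in range(first_year, last_year + 1):
        for quarter in (1, 2, 3, 4):
            q_start, q_end = quarter_bounds(year, quarter)
            for resource, start, end in parsed:
                # Overlap test: the project starts before the quarter ends
                # and ends after the quarter starts.
                if start <= q_end and end >= q_start:
                    counts[(resource, year, quarter)] += 1
    return counts


counts = active_per_quarter(ROWS)
for key in sorted(counts):
    print(*key, counts[key])
```

Like the INNER JOIN in the query, this sketch produces no row for a quarter in which nothing is active (2016 Q4 here).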

You can do this by determining the first and last quarters when a project is active, and then using cumulative sum. In SQL Server 2012+, this looks like
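-- yyyyq encodes year and quarter as year * 10 + quarter (e.g. 20131 for 2013 Q1).
-- Only quarters in which some project starts or ends appear in the output.
select resource_name, yyyyq,
(sum(sum(s)) over (partition by resource_name order by yyyyq) -
sum(sum(e)) over (partition by resource_name order by yyyyq) +
sum(e)
) as activeProjects
from ((select resource_name, datepart(year, convert(date, startdate, 103)) * 10 + datepart(quarter, convert(date, startdate, 103)) as yyyyq, 1 as s, 0 as e
from projects
) union all
(select resource_name, datepart(year, convert(date, enddate, 103)) * 10 + datepart(quarter, convert(date, enddate, 103)), 0 as s, 1 as e
from projects
)
) yq
group by resource_name, yyyyq;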
In earlier versions, you can do something similar with cross apply.
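The running-total idea can be checked by hand with a small sketch. The Python below is an illustration under assumptions, not the answer's code: `yyyyq` and `running_active` are hypothetical helper names, and the year/quarter key is encoded as year * 10 + quarter. Active projects in a quarter are the cumulative starts minus the cumulative ends, plus the projects that end in that same quarter (they were still active during it).

```python
from collections import defaultdict
from datetime import datetime


def yyyyq(text):
    """Encode a dd/mm/yyyy string as year * 10 + quarter, e.g. 20131."""
    d = datetime.strptime(text, "%d/%m/%Y")
    return d.year * 10 + (d.month - 1) // 3 + 1


def running_active(projects):
    """projects: (start, end) dd/mm/yyyy pairs for a single resource."""
    starts = defaultdict(int)
    ends = defaultdict(int)
    for start, end in projects:
        starts[yyyyq(start)] += 1
        ends[yyyyq(end)] += 1
    result = {}
    cum_starts = cum_ends = 0
    # Only quarters with at least one start or end event are visited.
    for key in sorted(set(starts) | set(ends)):
        cum_starts += starts[key]
        cum_ends += ends[key]
        result[key] = cum_starts - cum_ends + ends[key]
    return result


sample = [
    ("15/01/2013", "1/11/2014"),
    ("1/03/2013", "1/09/2016"),
    ("1/04/2013", "1/09/2015"),
    ("1/06/2013", "1/03/2016"),
    ("15/01/2013", "1/09/2015"),
    ("3/06/2013", "1/11/2015"),
]
active = running_active(sample)
print(active)
```

Quarters without any start or end (2013 Q3, for example) do not appear in the output of this approach; a list of all quarters would have to be joined in to fill them.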

Related

SQL Query to apply a command to multiple rows

I am new to SQL and trying to write a statement similar to a 'for loop' in other languages and am stuck. I want to filter out rows of the table where for all of attribute 1, attribute2=attribute3 without using functions.
For example:
| Year | Month | Day|
| 1 | 1 | 1 |
| 1 | 2 | 2 |
| 1 | 4 | 4 |
| 2 | 3 | 4 |
| 2 | 3 | 3 |
| 2 | 4 | 4 |
| 3 | 4 | 4 |
| 3 | 4 | 4 |
| 3 | 4 | 4 |
I would only want the row
| Year | Month | Day|
|:---- |:------:| -----:|
| 3 | 4 | 4 |
because it is the only one where month and day are equal for all of the values of year they share.
So far I have
select year, month, day from dates
where month=day
but I am unsure how to apply the constraint to all rows of a year.
-- month/day need to appear in aggregate functions (since they are not in the GROUP BY clause),
-- but the HAVING clause ensures there is only one (month, day) value per year here, so MIN/AVG/SUM/... would all work too
SELECT year, MAX(month), MAX(day)
FROM my_table
GROUP BY year
HAVING COUNT(DISTINCT (month, day)) = 1;
| year | max | max |
|------|-----|-----|
| 3 | 4 | 4 |
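The GROUP BY / HAVING logic above amounts to "keep a year only if it has exactly one distinct (month, day) pair". A minimal Python sketch of that rule follows; the function name `years_with_single_pair` is invented for illustration. Like the query, it does not separately test that month equals day.

```python
from collections import defaultdict

# Sample rows from the question: (year, month, day).
ROWS = [
    (1, 1, 1), (1, 2, 2), (1, 4, 4),
    (2, 3, 4), (2, 3, 3), (2, 4, 4),
    (3, 4, 4), (3, 4, 4), (3, 4, 4),
]


def years_with_single_pair(rows):
    """Return (year, month, day) for years having one distinct (month, day)."""
    pairs = defaultdict(set)
    for year, month, day in rows:
        pairs[year].add((month, day))
    return [
        (year, *next(iter(found)))
        for year, found in sorted(pairs.items())
        if len(found) == 1
    ]


single = years_with_single_pair(ROWS)
print(single)
```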
So one way would be
select distinct [year], [month], [day]
from [Table] t
where [month]=[day]
and not exists (
select * from [Table] x
where t.[year]=x.[year] and (t.[month] <> x.[month] or t.[day] <> x.[day])
)
And another way would be
select distinct [year], [month], [day] from (
select *,
Lead([month],1) over(partition by [year] order by [month])m2,
Lead([day],1) over(partition by [year] order by [day])d2
from [table]
)x
where [month]=m2 and [day]=d2

Creating and populating row in a table for missing dates after each first valid entry

I have a dataset that represents number of views on individual files per day.
I would like to import this data into some visualization tool and show how many views a file received each day, beginning with the first valid date with an entry, in the form of something like a bar graph.
For example, I have a table like this:
+-----------+-----------+----------------+----------------+--------------+------------------+-----------------+------------------+
| Metadata1 | Metadata2 | Unique_Item_ID | Item_ID | Unique Views | Total View Count | Start_Date | End_Date |
+-----------+-----------+----------------+----------------+--------------+------------------+-----------------+------------------+
| Folder1 | Subf1 | {000dda83} | Document.docx | 6 | 11 | 11/27/2019 0:00 | 11/27/2019 23:59 |
| Folder2 | Sub2f | {004120b6} | Reporting.mp4 | 3 | 10 | 11/8/2019 0:00 | 11/8/2019 23:59 |
| Folder2 | Sub2f | {004120b6} | Reporting.mp4 | 8 | 13 | 11/20/2019 0:00 | 11/20/2019 23:59 |
| Folder2 | Sub2f | {004120b6} | Reporting.mp4 | 12 | 27 | 11/29/2019 0:00 | 11/29/2019 23:59 |
| Folder3 | Sub3f | {004f9957} | Case Study.pdf | 1 | 1 | 10/8/2019 0:00 | 10/8/2019 23:59 |
+-----------+-----------+----------------+----------------+--------------+------------------+-----------------+------------------+
From a query like:
SELECT
TOP 5 [Metadata1],
[Metadata2],
[Unique_Item_ID],
[Item_ID],
[Unique Views],
[Total View Count],
[Start_Date],
[End_Date]
FROM
DailyViewStats
How can I create a view that will create and populate rows with 0 view count for each Unique_Item_ID that does not exist, but only after the first occurrence of a valid existing row for each distinct Unique_Item_ID?
I know that I can use a partition function to identify the first valid row for each Unique_Item_ID, but I'm not sure how to leverage this. I tried using a cross join on all distinct Start_Dates in the table, to match up with all the unique items and their metadata, but I was unable to determine a WHERE statement that effectively removed any entry before the first valid one per Unique_Item_ID.
Using
ROW_NUMBER() OVER (PARTITION BY Unique_Item_ID ORDER BY Start_Date ASC) as RowNum
I believe I can use this to identify the minimum dates I need when RowNum = 1. But how do I use this?
If today were 11/29, for Document.docx, I want to see something like this:
+-----------+-----------+----------------+---------------+--------------+------------------+-----------------+------------------+
| Metadata1 | Metadata2 | Unique_Item_ID | Item_ID | Unique Views | Total View Count | Start_Date | End_Date |
+-----------+-----------+----------------+---------------+--------------+------------------+-----------------+------------------+
| Folder1 | Subf1 | {000dda83} | Document.docx | 6 | 11 | 11/27/2019 0:00 | 11/27/2019 23:59 |
| Folder1 | Subf1 | {000dda83} | Document.docx | 0 | 0 | 11/28/2019 0:00 | 11/28/2019 23:59 |
| Folder1 | Subf1 | {000dda83} | Document.docx | 0 | 0 | 11/29/2019 0:00 | 11/29/2019 23:59 |
+-----------+-----------+----------------+---------------+--------------+------------------+-----------------+------------------+
For each file existing in the table.
One direct way to do this is to employ a calendar table. In the example below the calendar is provided by a recursive CTE covering the range between a start date and an end date. Once that is set up, you overlay each of the unique IDs with each day. This is done below in the form of a cross join CTE that includes only the keys and keeps only the days on or after each ID's first entry. From the derived cross join table, simply LEFT JOIN the bulk of your data and the values will appear aligned with each day of the calendar. I took the liberty of simplifying your model below.
DECLARE @T TABLE( Unique_Item_ID NVARCHAR(50), Total_View_Count INT, DateViewed DATETIME)
INSERT @T VALUES
('000dda83',11, '11/27/2019'),
('004120b6',10, '11/8/2019'),
('004120b6',13, '11/20/2019')
DECLARE @StartDate DATETIME = '10/01/2019'
DECLARE @EndDate DATETIME = '01/01/2020'
;WITH OrderedDays as
(
SELECT CalendarDate = @StartDate
UNION ALL
SELECT CalendarDate = DATEADD(DAY, 1, CalendarDate)
FROM OrderedDays WHERE DATEADD (DAY, 1, CalendarDate) <= @EndDate
),
Calendar AS
(
SELECT
DayIndex = ROW_NUMBER() OVER(PARTITION BY 1 ORDER BY CalendarDate),
CalendarDate,
CalenderDayOfMonth = DATEPART(DAY, CalendarDate),
CalenderMonthOfYear = DATEPART(MONTH, CalendarDate),
CalendarYear = DATEPART(YEAR, CalendarDate),
CalenderWeekOfYear = DATEPART(WEEK, CalendarDate),
CalenderQuarterOfYear = DATEPART(QUARTER, CalendarDate),
CalenderDayOfYear = DATEPART(DAYOFYEAR, CalendarDate),
CalenderDayOfWeek = DATEPART(WEEKDAY, CalendarDate),
CalenderWeekday = DATENAME(WEEKDAY, CalendarDate)
FROM
OrderedDays
)
,CrossJoinData AS
(
-- One row per item and calendar day, starting at the item's first entry
SELECT Unique_Item_ID, CalendarDate
FROM
Calendar C
CROSS JOIN @T T
GROUP BY
Unique_Item_ID, CalendarDate
HAVING
MIN(T.DateViewed) <= C.CalendarDate
)
SELECT
CJ.Unique_Item_ID,
CJ.CalendarDate,
-- Days without a matching row show 0 instead of NULL
Total_View_Count = ISNULL(T.Total_View_Count, 0)
FROM
CrossJoinData CJ
LEFT OUTER JOIN @T T ON T.Unique_Item_ID = CJ.Unique_Item_ID AND T.DateViewed = CJ.CalendarDate
ORDER BY
CJ.Unique_Item_ID,
CJ.CalendarDate
OPTION (MAXRECURSION 0)
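The calendar-plus-left-join idea can also be sketched procedurally, which may help when checking the output of the query. The Python below is an illustration only; `fill_missing_days` is a hypothetical helper, and it uses the simplified model from the answer (item ID, view count, date viewed) with 11/29/2019 as "today", as in the question.

```python
from datetime import date, timedelta

# Simplified sample data: (item id, date viewed) -> total view count.
VIEWS = {
    ("000dda83", date(2019, 11, 27)): 11,
    ("004120b6", date(2019, 11, 8)): 10,
    ("004120b6", date(2019, 11, 20)): 13,
}


def fill_missing_days(views, last_day):
    """Emit one row per item per day from the item's first entry to last_day."""
    first_seen = {}
    for item, day in views:
        if item not in first_seen or day < first_seen[item]:
            first_seen[item] = day
    rows = []
    for item in sorted(first_seen):
        day = first_seen[item]
        while day <= last_day:
            # Days with no recorded views get a count of 0.
            rows.append((item, day, views.get((item, day), 0)))
            day += timedelta(days=1)
    return rows


rows = fill_missing_days(VIEWS, date(2019, 11, 29))
for row in rows[:3]:
    print(row)
```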

All years and months between 2 dates SQL

I have a little question. I have a table called project that looks like this:
---------------------------------------
ProjectId | StartDate | EndDate |
---------------------------------------
1 | 01/01/2015 | 31/12/2017|
Is it possible to get all months and years between those dates, like this:
--------------------
| Month | Year |
--------------------
1 | 2015 |
2 | 2015 |
3 | 2015 |
4 | 2015 |
5 | 2015 |
6 | 2015 |
7 | 2015 |
8 | 2015 |
9 | 2015 |
10 | 2015 |
11 | 2015 |
12 | 2015 |
1 | 2016 |
2 | 2016 |
3 | 2016 |
4 | 2016 |
. | . |
. | . |
. | . |
12 | 2017 |
Here's a method using PostgreSQL functions generate_series and extract:
SELECT extract(month FROM date) AS month, extract(year FROM date) AS year
FROM (
SELECT generate_series('2015-01-01'::date, '2017-12-31'::date, '1 month'::interval) AS date
) AS date_range
https://www.postgresql.org/docs/current/static/functions-srf.html
https://www.postgresql.org/docs/current/static/functions-datetime.html
You'd need to modify this to use the dates from your table:
SELECT extract(month FROM range) AS month, extract(year FROM range) AS year
FROM (
SELECT generate_series(StartDate, EndDate, '1 month'::interval) AS range
FROM project
WHERE ProjectId = 1
) AS date_range
If your database is SQL Server, you can run the following code to get the result.
DECLARE @DateStart DATETIME = '2015-01-01'
DECLARE @DateEnd DATETIME = '2017-12-31';
WITH Dates AS
(
SELECT DATEADD(DAY, -(DAY(@DateStart) - 1), @DateStart) AS [Date]
UNION ALL
SELECT DATEADD(MONTH, 1, [Date])
FROM Dates
WHERE [Date] < DATEADD(DAY, -(DAY(@DateEnd) - 1), @DateEnd)
)
SELECT
MONTH([Date]) AS [Month],
YEAR([Date]) AS [Year]
FROM Dates;
Hope it will help. If you need more help, you can look at the following link
https://blog.sqlauthority.com/2014/12/22/sql-server-list-the-name-of-the-months-between-date-ranges-part-2/
In SQL Server, a query for your date format:
;with datecte (date)
AS
(
SELECT Convert(date,'01/01/2015',105)
UNION ALL
SELECT DATEADD(month,1,date)
from datecte
where DATEADD(month,1,date)<= (Select Convert(date,'31/12/2017',105))
)
select month(date),YEAR(date) from datecte
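All of the answers above do the same thing: start at the first month and step one month at a time until the end month is passed. A short Python sketch of that stepping, with a made-up function name `months_between`, shows the expected shape of the result for the sample project.

```python
from datetime import date


def months_between(start, end):
    """Return (month, year) pairs for every month from start to end inclusive."""
    result = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        result.append((month, year))
        month += 1
        if month > 12:
            # Roll over into January of the next year.
            month, year = 1, year + 1
    return result


months = months_between(date(2015, 1, 1), date(2017, 12, 31))
print(len(months), months[0], months[-1])
```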

Get list of counts by date

I have two tables. One containing a list of applications. And another one containing counts associated to them every week. Now I want to get as a result the app name and the count for this week and the previous. Let me explain this.
app:
+----+-------------+
| id | name |
+----+-------------+
| 1 | Office 2007 |
+----+-------------+
| 2 | Office 2010 |
+----+-------------+
| 3 | Office 2013 |
+----+-------------+
count:
+----+--------+-------+------------+
| id | app_id | count | date |
+----+--------+-------+------------+
| 1 | 1 | 200 | 2016-01-11 |
+----+--------+-------+------------+
| 2 | 2 | 500 | 2016-01-11 |
+----+--------+-------+------------+
| 3 | 3 | 750 | 2016-01-11 |
+----+--------+-------+------------+
| 4 | 1 | 180 | 2016-01-18 |
+----+--------+-------+------------+
| 5 | 2 | 378 | 2016-01-18 |
+----+--------+-------+------------+
| 6 | 3 | 1000 | 2016-01-18 |
+----+--------+-------+------------+
And this is the result I need. I need all the applications with the count of this week and the previous:
+-------------+-----------------+-----------------+
| app | count_this_week | count_prev_week |
+-------------+-----------------+-----------------+
| Office 2007 | 180 | 200 |
+-------------+-----------------+-----------------+
| Office 2010 | 378 | 500 |
+-------------+-----------------+-----------------+
| Office 2013 | 1000 | 750 |
+-------------+-----------------+-----------------+
A script runs every week which fills the count table. And now I need to get a report also on a weekly basis.
Honestly I'm a bit lost as I don't know how to declare the conditions for the columns.
You can try to group first by DATEPART(WEEK,C.date),name and then split the counts into 2 columns using another GROUP BY. Something like this
EDIT
If there is exactly one record per week per app, you can do it with just one GROUP BY, like this:
SELECT
appname,
SUM(CASE WHEN weekno = 0 THEN sumcount ELSE 0 END) as thisweek,
SUM(CASE WHEN weekno = 1 THEN sumcount ELSE 0 END) as lastweek
FROM
(
SELECT
DATEPART(WEEK,CURRENT_TIMESTAMP) - DATEPART(WEEK,C.date) as weekno,
name as appname,
count as sumcount
FROM App A
INNER JOIN CountTable C ON A.[id] = C.[app_id]
WHERE DATEPART(WEEK,C.date) BETWEEN DATEPART(WEEK,CURRENT_TIMESTAMP) - 1 AND DATEPART(WEEK,CURRENT_TIMESTAMP)
)T
GROUP BY appname
Query
SELECT
appname,
SUM(CASE WHEN weekno = 0 THEN sumcount ELSE 0 END) as thisweek,
SUM(CASE WHEN weekno = 1 THEN sumcount ELSE 0 END) as lastweek
FROM
(
SELECT
DATEPART(WEEK,CURRENT_TIMESTAMP) - DATEPART(WEEK,C.date) as weekno,
name as appname,
SUM(count) as sumcount
FROM App A INNER JOIN CountTable C ON A.[id] = C.[app_id]
WHERE DATEPART(WEEK,C.date) BETWEEN DATEPART(WEEK,CURRENT_TIMESTAMP) - 1 AND DATEPART(WEEK,CURRENT_TIMESTAMP)
GROUP BY DATEPART(WEEK,C.date),name
) AS T
GROUP BY appname
Output
| appname | thisweek | lastweek |
|-------------|----------|----------|
| Office 2007 | 180 | 200 |
| Office 2010 | 378 | 500 |
| Office 2013 | 1000 | 750 |
You can use this generic query with a variable for the current week day:
DECLARE @week date = '2016-01-18';
WITH data AS (
SELECT a.name, c.[count]
, w = CASE WHEN c.[date] = @week THEN 0 ELSE 1 END
FROM @Counts c
INNER JOIN @Apps a ON c.app_id = a.id
WHERE [date] = @week OR [date] = DATEADD(day, -7, @week)
)
SELECT App = name, count_this_week = [0], count_prev_week = [1]
FROM data d
PIVOT (
MAX([count])
FOR w IN ([0], [1])
) p
Output:
| App | count_this_week | count_prev_week |
|-------------|-----------------|-----------------|
| Office 2007 | 180 | 200 |
| Office 2010 | 378 | 500 |
| Office 2013 | 1000 | 750 |
Your data:
DECLARE @Apps TABLE ([id] int, [name] varchar(11));
DECLARE @Counts TABLE([id] int, [app_id] int, [count] int, [date] date);
INSERT INTO @Apps([id], [name])
VALUES
(1, 'Office 2007'),
(2, 'Office 2010'),
(3, 'Office 2013')
;
INSERT INTO @Counts([id], [app_id], [count], [date])
VALUES
(1, 1, 200, '2016-01-11'),
(2, 2, 500, '2016-01-11'),
(3, 3, 750, '2016-01-11'),
(4, 1, 180, '2016-01-18'),
(5, 2, 378, '2016-01-18'),
(6, 3, 1000, '2016-01-18')
;
SELECT *
FROM count
JOIN app ON app.id=count.app_id
WHERE date BETWEEN '2016-01-11' AND '2016-01-18'
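The report asked for in this question is a small pivot: for each app, look up the count on the current week's date and on the date seven days earlier. The following Python sketch shows that lookup on the sample data. It is an illustration, not one of the posted answers; `weekly_report` is a hypothetical name, and it assumes exactly one count row per app per date.

```python
from datetime import date, timedelta

APPS = {1: "Office 2007", 2: "Office 2010", 3: "Office 2013"}

# (app_id, count, date) rows from the question.
COUNTS = [
    (1, 200, date(2016, 1, 11)),
    (2, 500, date(2016, 1, 11)),
    (3, 750, date(2016, 1, 11)),
    (1, 180, date(2016, 1, 18)),
    (2, 378, date(2016, 1, 18)),
    (3, 1000, date(2016, 1, 18)),
]


def weekly_report(apps, counts, this_week):
    """Return (app name, count this week, count previous week) per app."""
    prev_week = this_week - timedelta(days=7)
    # Assumes one row per (app, date); a later duplicate would overwrite.
    by_key = {(app_id, day): value for app_id, value, day in counts}
    return [
        (name, by_key.get((app_id, this_week)), by_key.get((app_id, prev_week)))
        for app_id, name in sorted(apps.items())
    ]


report = weekly_report(APPS, COUNTS, date(2016, 1, 18))
for line in report:
    print(line)
```

An app with no row for one of the two dates gets `None` in that column, which corresponds to NULL in the PIVOT query.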

SQL Days before end of the month

I have a table of transactions that looks like this:
+----+--------------+----------------+------+
| ID | OrderDate | DeliveryDate | EUR |
+----+--------------+----------------+------+
| 1 | 2015-02-21 | 2015-02-25 | 100 |
| 2 | 2015-03-01 | 2015-03-14 | 110 |
| 3 | 2015-03-01 | 2015-03-17 | 90 |
| 4 | 2015-03-10 | 2015-03-20 | 250 |
| 5 | 2015-03-31 | 2015-03-31 | 350 |
+----+--------------+----------------+------+
And I need to get the sum of revenue and the number of orders (COUNT of IDs) grouped by the number of days between the order date and the end of the month in which the order gets delivered.
SELECT datediff(day, OrderDate, CAST(DATEADD(month, DATEDIFF(month,0,getdate()+1,0)-1) as Date) as DBEOM, SUM(EUR) as Rev, COUNT(ID) as NumberOfOrders
FROM transactions
WHERE MONTH(DeliveryDate) = 3 AND YEAR(DeliveryDate) = 2015
GROUP BY datediff(day, OrderDate, CAST(DATEADD(month, DATEDIFF(month,0,getdate()+1,0)-1) as Date) as DBEOM
ORDER BY 1
The result in this case would be like:
+-----+-----+----------------+
|DBEOM| Rev | NumberOfOrders |
+-----+-----+----------------+
| 0 | 350 | 1 |
| 21 | 250 | 1 |
| 30 | 200 | 2 |
+-----+-----+----------------+
This is done in SQL Server 2008, so I can't simply use EOMONTH. I have tried the query above, but I am getting
ERROR -
[Microsoft][ODBC SQL Server Driver][SQL Server]The datediff function
requires 3 argument(s).
Many thanks in advance for advice!
The easiest way I've found to get the last day of the month with more primitive functions is to get the first day of the next month and then subtract a day.
I'm not a T-SQL person, so treat this syntax as a sketch, but you need something like the following. Note that DATEFROMPARTS was introduced in SQL Server 2012, so it is not available on SQL Server 2008.
DATEADD(day, -1, DATEFROMPARTS(DATEPART(year, DATEADD(month, 1, getdate())), DATEPART(month, DATEADD(month, 1, getdate())), 1))
Try:
SELECT datediff(day,
OrderDate,
dateadd(DAY,
-1,
dateadd(MONTH,
1,
dateadd(DAY,
1-day(DeliveryDate),
DeliveryDate
)
)
)
) as DBEOM, SUM(EUR) as Rev, COUNT(ID) as NumberOfOrders
FROM t
WHERE MONTH(DeliveryDate) = 3 AND YEAR(DeliveryDate) = 2015
GROUP BY datediff(day,
OrderDate,
dateadd(DAY,
-1,
dateadd(MONTH,
1,
dateadd(DAY,
1-day(DeliveryDate),
DeliveryDate
)
)
)
)
ORDER BY 1
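The nested DATEADD calls compute "first day of the delivery month, plus one month, minus one day", which is the last day of the delivery month. The Python sketch below reproduces that calculation and the grouping on the sample rows so the expected result can be verified; `end_of_month` and `dbeom_summary` are names invented for this illustration.

```python
from collections import defaultdict
from datetime import date, timedelta

# (id, order date, delivery date, EUR) rows from the question.
ORDERS = [
    (1, date(2015, 2, 21), date(2015, 2, 25), 100),
    (2, date(2015, 3, 1), date(2015, 3, 14), 110),
    (3, date(2015, 3, 1), date(2015, 3, 17), 90),
    (4, date(2015, 3, 10), date(2015, 3, 20), 250),
    (5, date(2015, 3, 31), date(2015, 3, 31), 350),
]


def end_of_month(d):
    """Last day of d's month: first day of the next month minus one day."""
    next_first = date(d.year + d.month // 12, d.month % 12 + 1, 1)
    return next_first - timedelta(days=1)


def dbeom_summary(orders, year, month):
    """Group orders delivered in (year, month) by days before end of month."""
    totals = defaultdict(lambda: [0, 0])  # days -> [revenue, order count]
    for _id, order_date, delivery_date, eur in orders:
        if (delivery_date.year, delivery_date.month) != (year, month):
            continue
        days = (end_of_month(delivery_date) - order_date).days
        totals[days][0] += eur
        totals[days][1] += 1
    return [(days, rev, n) for days, (rev, n) in sorted(totals.items())]


summary = dbeom_summary(ORDERS, 2015, 3)
print(summary)
```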