I have a few fields in a database that look like this:
trip_id
start_date
end_date
start_station_name
end_station_name
I need to write a query that shows all the stations with no activity on a particular day in the year 2015. I wrote the following query but it's not giving the right output:
select
start_station_name,
extract(date from start_date) as dt,
count(*)
from
trips_table
where
(
start_date >= timestamp('2015-01-01')
and
start_date < timestamp('2016-01-01')
)
group by
start_station_name,
dt
order by
count(*)
Can someone help come up with the right query? Thanks in advance!
The query below is for BigQuery Standard SQL.
It assumes start_date and end_date are of DATE type.
It also assumes that every day between start_date and end_date counts as activity for the station in start_station_name. That is most likely not what you expect, but the question is missing details, hence the assumption.
#standardSQL
WITH days AS (
SELECT day
FROM UNNEST(GENERATE_DATE_ARRAY('2015-01-01', '2015-12-31')) AS day
),
stations AS (
SELECT DISTINCT start_station_name AS station
FROM `trips_table`
)
SELECT s.*
FROM (SELECT * FROM stations CROSS JOIN days) AS s
LEFT JOIN (SELECT * FROM `trips_table`,
UNNEST(GENERATE_DATE_ARRAY(start_date, end_date)) AS day) AS a
ON s.day = a.day AND s.station = a.start_station_name
WHERE a.day IS NULL
You can test it with the simple dummy data below:
#standardSQL
WITH `trips_table` AS (
SELECT 1 AS trip_id, DATE '2015-01-01' AS start_date, DATE '2015-12-01' AS end_date, '111' AS start_station_name UNION ALL
SELECT 2, DATE '2015-12-10', DATE '2015-12-31', '111'
),
days AS (
SELECT day
FROM UNNEST(GENERATE_DATE_ARRAY('2015-01-01', '2015-12-31')) AS day
),
stations AS (
SELECT DISTINCT start_station_name AS station
FROM `trips_table`
)
SELECT s.*
FROM (SELECT * FROM stations CROSS JOIN days) AS s
LEFT JOIN (SELECT * FROM `trips_table`,
UNNEST(GENERATE_DATE_ARRAY(start_date, end_date)) AS day) AS a
ON s.day = a.day AND s.station = a.start_station_name
WHERE a.day IS NULL
ORDER BY station, day
The output looks like this:
station day
111 2015-12-02
111 2015-12-03
111 2015-12-04
111 2015-12-05
111 2015-12-06
111 2015-12-07
111 2015-12-08
111 2015-12-09
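For checking small cases by hand, the same idea (build every station/day combination, then keep the ones no trip covers) can be sketched outside the database. This is only an illustration in Python; the `trips` list mirrors the dummy data above, and the helper names `days_between` and `idle_station_days` are made up for this sketch.

```python
from datetime import date, timedelta

# Hypothetical sample rows: (trip_id, start_date, end_date, start_station_name),
# mirroring the dummy data used in the query above.
trips = [
    (1, date(2015, 1, 1), date(2015, 12, 1), "111"),
    (2, date(2015, 12, 10), date(2015, 12, 31), "111"),
]


def days_between(first, last):
    """Yield every date from first to last, inclusive."""
    for offset in range((last - first).days + 1):
        yield first + timedelta(days=offset)


def idle_station_days(trips, first, last):
    """Return sorted (station, day) pairs in [first, last] that no trip covers."""
    stations = {station for _, _, _, station in trips}
    # Every (station, day) pair covered by at least one trip.
    active = {
        (station, day)
        for _, start, end, station in trips
        for day in days_between(start, end)
    }
    # All station/day combinations minus the active ones (the anti-join).
    return sorted(
        (station, day)
        for station in stations
        for day in days_between(first, last)
        if (station, day) not in active
    )


idle = idle_station_days(trips, date(2015, 1, 1), date(2015, 12, 31))
for station, day in idle:
    print(station, day)
```

Like the query, this treats every day between start_date and end_date as activity for the start station, so the same caveat applies.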
Use a recursive CTE for this. For SQL Server, try:
WITH sample AS (
SELECT CAST('2015-01-01' AS DATETIME) AS dt
UNION ALL
SELECT DATEADD(dd, 1, dt)
FROM sample s
WHERE DATEADD(dd, 1, dt) < CAST('2016-01-01' AS DATETIME)
)
SELECT * FROM sample
Where CAST(sample.dt as date) NOT IN (
SELECT CAST(start_date as date)
FROM tablename
WHERE start_date >= '2015-01-01 00:00:00'
AND start_date < '2016-01-01 00:00:00'
)
Option(maxrecursion 0)
If you want the station data as well, use a LEFT JOIN:
WITH sample AS (
SELECT CAST('2015-01-01' AS DATETIME) AS dt
UNION ALL
SELECT DATEADD(dd, 1, dt)
FROM sample s
WHERE DATEADD(dd, 1, dt) < CAST('2016-01-01' AS DATETIME)
)
SELECT * FROM sample
left join tablename
on CAST(sample.dt as date) = CAST(tablename.start_date as date)
where sample.dt>= '2015-01-01 00:00:00' and sample.dt< '2016-01-01 00:00:00'
Option(maxrecursion 0)
For MySQL, see this SQL Fiddle demo; I think it will help you.
I have a variable called WeekBeginDate and I want to pull data for that week only. For example, if the beginning-of-week date is 07/21/2014, which is a Monday in this case, then I want to pull only the data from 07/21/2014 to 07/27/2014.
The variable will always contain the date for the beginning of the week only but I don’t have the date for the end of the week.
The week begins on Monday and ends on Sunday. I can’t figure out how to calculate or sum the number of hours if I only have the date for the beginning of week.
SELECT DT, sum (TOT_HOURS)as TOT_HOURS FROM MYTABLE where DT >= @WeekBeginDate and DT <= @WeekEndDate group by DT
Note that I only have the variable for the WeekBeginDate.
Adapt the column names in this CTE to your table; it may work:
-- The week boundaries depend on SET DATEFIRST (use SET DATEFIRST 1 for Monday-based weeks)
;WITH workhours AS
(
SELECT DATEADD(DAY
, -(DATEPART(dw, DT) -1)
, DT) AS week_start
, DATEADD(DAY
, 7 - (DATEPART(dw, DT))
, DT) AS week_end
, TOT_HOURS -- needed by the SUM in the outer query
FROM MYTABLE
)
SELECT week_start
, week_end
, SUM(TOT_HOURS) total_hrs_per_week
FROM workhours
GROUP BY week_start
, week_end
You may need to add 6 days to the beginning of the week.
If you need total weekly hours, group by something other than DT (I'm calling it "id" here), or don't group at all if you want a total for the whole table:
SELECT id, sum (TOT_HOURS)as TOT_HOURS FROM MYTABLE
where DT BETWEEN @WeekBeginDate and DATEADD(d,6,@WeekBeginDate)
GROUP BY id
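The arithmetic behind "add 6 days" is small enough to check by hand. Here is a minimal sketch in Python, assuming the rows are (date, hours) pairs; the sample rows and the function names `week_bounds` and `weekly_hours` are invented for illustration.

```python
from datetime import date, timedelta


def week_bounds(week_begin):
    """Return (first_day, last_day) of the 7-day week that starts on week_begin."""
    return week_begin, week_begin + timedelta(days=6)


def weekly_hours(rows, week_begin):
    """Sum the hours of every row whose date falls inside that week, inclusive."""
    first, last = week_bounds(week_begin)
    return sum(hours for day, hours in rows if first <= day <= last)


# Hypothetical rows: (DT, TOT_HOURS)
rows = [
    (date(2014, 7, 20), 5),  # Sunday before the week: excluded
    (date(2014, 7, 21), 1),  # Monday, first day of the week
    (date(2014, 7, 27), 2),  # Sunday, last day of the week
    (date(2014, 7, 28), 4),  # following Monday: excluded
]

total = weekly_hours(rows, date(2014, 7, 21))
print(week_bounds(date(2014, 7, 21)), total)
```

Both ends are inclusive, which matches BETWEEN. If DT carries a time of day, compare on the date part only, otherwise rows late on the last day fall outside the range.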
This should be of some use to you. I am casting to DATE so that the full 24 hours of each day are included.
DECLARE @WeekBeginDate DATETIME
SET @WeekBeginDate = '2014-07-28 12:08:31.633';
WITH MYTABLE (DT,TOT_HOURS)
AS (
SELECT '2014-06-27 00:08:31.633',5 UNION ALL
SELECT '2014-07-27 00:08:31.633',5 UNION ALL
SELECT '2014-07-28 00:08:31.633',1 UNION ALL
SELECT '2014-07-29 00:08:31.633',1 UNION ALL
SELECT '2014-07-30 00:08:31.633',1 UNION ALL
SELECT '2014-07-31 00:08:31.633',1 UNION ALL
SELECT '2014-08-01 00:08:31.633',1 UNION ALL
SELECT '2014-08-02 00:08:31.633',1 UNION ALL
SELECT '2014-08-03 00:08:31.633',1
)
SELECT CAST(@WeekBeginDate AS DATE) AS StartDate,
DATEADD(d, 6, CAST(@WeekBeginDate AS DATE)) AS EndDate,
SUM (TOT_HOURS)AS TOT_HOURS
FROM MYTABLE
WHERE CAST(DT AS DATE) BETWEEN CAST(@WeekBeginDate AS DATE) AND DATEADD(d, 6, CAST(@WeekBeginDate AS DATE))
Just add 6 (or 7) days...
SELECT DT, sum (TOT_HOURS)as TOT_HOURS FROM MYTABLE
where DT BETWEEN @WeekBeginDate and @WeekBeginDate + 6 group by DT
select @WeekBeginDate = DATEADD(wk, DATEDIFF(wk,0,GETDATE()), 0)
select @WeekEndDate = DATEADD(dd, 6, DATEADD(wk, DATEDIFF(wk,0,GETDATE()), 0))
SELECT DT, sum (TOT_HOURS)as TOT_HOURS FROM MYTABLE where DT >= @WeekBeginDate and DT <= @WeekEndDate group by DT
Here is where having a calendar table would be very useful,
especially if your logic needs to change if Monday is a holiday.
Basically create a table with pre-calculated values for weeks and just join to it.
http://www.made2mentor.com/2011/04/calendar-tables-why-you-need-one
I have 2 date columns called Start_date and End_date in my table. I need to first find out how many weeks are in between those 2 dates and split the data.
--For e.g. if data is as given below,
ID Start_date End_date No_Of_Weeks
1 25-Apr-11 8-May-11 2
2 23-Apr-11 27-May-11 6
--I need the result like this:
ID Start_date End_date
1 25-Apr-2011 01-May-2011
1 02-May-2011 08-May-2011
2 23-Apr-2011 24-Apr-2011
2 25-Apr-2011 01-May-2011
2 02-May-2011 08-May-2011
2 09-May-2011 15-May-2011
2 16-May-2011 22-May-2011
2 23-May-2011 27-May-2011
Please help me out with the query. My week start date is Monday.
You can use a Calendar table that defines the weeks and join it to your data.
I've created a sql fiddle for the following:
CREATE TABLE Calendar_Weeks (
week_start_date date,
week_end_date date )
CREATE TABLE Sample_Data (
id int,
start_date date,
end_date date )
INSERT Calendar_Weeks (week_start_date, week_end_date) VALUES ('2011-04-18','2011-04-24')
INSERT Calendar_Weeks (week_start_date, week_end_date) VALUES ('2011-04-25','2011-05-01')
INSERT Calendar_Weeks (week_start_date, week_end_date) VALUES ('2011-05-02','2011-05-08')
INSERT Calendar_Weeks (week_start_date, week_end_date) VALUES ('2011-05-09','2011-05-15')
INSERT Calendar_Weeks (week_start_date, week_end_date) VALUES ('2011-05-16','2011-05-22')
INSERT Calendar_Weeks (week_start_date, week_end_date) VALUES ('2011-05-23','2011-05-29')
INSERT Sample_Data (id, start_date, end_date) VALUES (1, '2011-04-25','2011-05-08')
INSERT Sample_Data (id, start_date, end_date) VALUES (2, '2011-04-23','2011-05-27')
SELECT id, week_start_date, week_end_date
FROM Sample_Data CROSS JOIN Calendar_Weeks
WHERE week_start_date BETWEEN start_date AND end_date
UNION
SELECT id, week_start_date, week_end_date
FROM Sample_Data CROSS JOIN Calendar_Weeks
WHERE week_end_date BETWEEN start_date AND end_date
I have to admit the UNION of the queries feels a bit of a hack to include rows at the start or end of the set, so you might prefer to use Ravi Singh's solution.
You can also use INNER JOIN if you like:
SELECT id, week_start_date, week_end_date
FROM Sample_Data INNER JOIN Calendar_Weeks
ON week_start_date BETWEEN start_date AND end_date
UNION
SELECT id, week_start_date, week_end_date
FROM Sample_Data INNER JOIN Calendar_Weeks
ON week_end_date BETWEEN start_date AND end_date
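To see what the split should produce without a calendar table, here is a rough sketch of the same Monday-to-Sunday splitting in Python. It is not a translation of the queries above, only a way to check expected rows; the function name `split_into_weeks` is made up.

```python
from datetime import date, timedelta


def split_into_weeks(start, end):
    """Split [start, end] into pieces that each end on a Sunday or on `end`."""
    pieces = []
    piece_start = start
    while piece_start <= end:
        # date.weekday(): Monday is 0 and Sunday is 6.
        sunday = piece_start + timedelta(days=6 - piece_start.weekday())
        piece_end = min(sunday, end)
        pieces.append((piece_start, piece_end))
        piece_start = piece_end + timedelta(days=1)
    return pieces


# The two sample rows from the question.
for row_id, start, end in [
    (1, date(2011, 4, 25), date(2011, 5, 8)),
    (2, date(2011, 4, 23), date(2011, 5, 27)),
]:
    for piece_start, piece_end in split_into_weeks(start, end):
        print(row_id, piece_start, piece_end)
```

The first and last pieces are clipped to the original start and end dates, so partial weeks at either edge come out as shorter rows.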
As I now understand the question, this should work:
with demo_cte as
(select id,
start_date,
dateadd(day,6,DATEADD(wk, DATEDIFF(wk,0,start_date), 0)) end_date,
end_date last_end_date,
no_of_weeks no_of_weeks from demo
union all
select id,dateadd(day,1,end_date),
dateadd(day,7,end_date),
last_end_date
,no_of_weeks-1 from demo_cte
where no_of_weeks-1>0)
select id, start_date,
case
when end_date<=last_end_date then end_date
else
last_end_date
end
end_date
from demo_cte order by id,no_of_weeks desc
SQL Fiddle
And if the number of weeks is not available, use this:
with demo_cte as
(select id,
start_date,
dateadd(day,6,DATEADD(wk, DATEDIFF(wk,0,start_date), 0)) end_date,
end_date last_end_date
--,no_of_weeks no_of_weeks
from demo
union all
select id,dateadd(day,1,end_date),
dateadd(day,7,end_date),
last_end_date
--,no_of_weeks-1
from demo_cte
where --no_of_weeks-1>0
dateadd(day,1,end_date)<=last_end_date -- keep going while the next week starts inside the range, so a partial last week is kept
)
select id, start_date,
case
when end_date<=last_end_date then end_date
else
last_end_date
end
end_date
from demo_cte order by id,start_date
--,no_of_weeks desc
Setting up the test data:
declare @dt table (ID int,Start_date datetime,
End_date datetime,No_Of_Weeks int)
insert into @dt (ID,Start_date,End_date,No_Of_Weeks)
select 1, '25-Apr-11', '8-May-11', 2
union all
select 2, '23-Apr-11' , '27-MAy-11' , 6;
Try this:
with cte as (select d.ID
,d.Start_date
,(select MIN([end]) from (values(d.End_date),(DATEADD(day,-1,DATEADD(week,DATEDIFF(week,0,d.Start_date)+1,0))))V([end])) as End_date
,d.End_date as end_of_period
from @dt d
union all select d.ID
,DATEADD(day,1,d.End_date) as Start_date
, case when d.end_of_period < DATEADD(week,1,d.End_date) then d.end_of_period else DATEADD(week,1,d.End_date) end as End_date
,d.end_of_period as end_of_period
from cte d
where end_of_period <> End_date
)
select ID
,cast(Start_date as DATE) Start_date
,cast(End_date as date) End_date
from cte
order by cte.ID,cte.Start_date
option(maxrecursion 0)
The result set:
ID Start_date End_date
1 2011-04-25 2011-05-01
1 2011-05-02 2011-05-08
2 2011-04-23 2011-04-24
2 2011-04-25 2011-05-01
2 2011-05-02 2011-05-08
2 2011-05-09 2011-05-15
2 2011-05-16 2011-05-22
2 2011-05-23 2011-05-27
Try this query; I hope it works.
If your week starts on Sunday, use the following:
set datefirst 7
declare @FromDate datetime = '20130110'
declare @ToDate datetime = '20130206'
select datepart(week, @ToDate) - datepart(week, @FromDate) + 1
If your week starts on Monday, use the following:
set datefirst 1
declare @FromDate datetime = '20100201'
declare @ToDate datetime = '20100228'
select datepart(week, @ToDate) - datepart(week, @FromDate) + 1
Note: the two queries will yield different results, since their week start days differ.
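One thing to keep in mind: week numbers restart every year, so subtracting them only works when both dates fall in the same year. A sketch of a count that does not depend on week numbers, in Python and assuming Monday-based weeks (the helper names `monday_of` and `weeks_spanned` are made up), is to snap both dates back to their Monday and divide the gap by 7:

```python
from datetime import date, timedelta


def monday_of(day):
    """Return the Monday of the week containing `day` (Monday-based weeks)."""
    return day - timedelta(days=day.weekday())


def weeks_spanned(start, end):
    """Count the Monday-to-Sunday weeks that the range [start, end] touches."""
    return (monday_of(end) - monday_of(start)).days // 7 + 1


print(weeks_spanned(date(2011, 4, 25), date(2011, 5, 8)))
print(weeks_spanned(date(2011, 4, 23), date(2011, 5, 27)))
```

For the sample data in the question this gives the No_Of_Weeks values shown there, and it keeps working when the range crosses a year boundary.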
You should look at using the DATEDIFF function.
I'm not sure what you're asking for in the second part of your question, but once you've got the difference between the dates, you may want to have a look at using CASE on the result of your DATEDIFF.
You should use a date tally table similar to the type that Jeff Moden suggests in The "Numbers" or "Tally" Table: What it is and how it replaces a loop (required login).
A Tally table is nothing more than a table with a single column of very well indexed sequential numbers starting at 0 or 1 (mine start at 1) and going up to some number. The largest number in the Tally table should not be just some arbitrary choice. It should be based on what you think you'll use it for. I split VARCHAR(8000)'s with mine, so it has to be at least 8000 numbers. Since I occasionally need to generate 30 years of dates, I keep most of my production Tally tables at 11,000 or more which is more than 365.25 days times 30 years.
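As a rough illustration of how a tally of sequential numbers turns into a run of dates, here is a sketch in Python. The range stands in for the Tally table, 11,000 is the size quoted above, and the function name `dates_from_tally` is made up.

```python
from datetime import date, timedelta

# Stand-in for a Tally table: sequential numbers starting at 1.
tally = range(1, 11001)


def dates_from_tally(start, numbers):
    """Map tally numbers 1, 2, 3, ... to start, start + 1 day, start + 2 days, ..."""
    return [start + timedelta(days=n - 1) for n in numbers]


# 11,000 numbers cover a little more than 30 years of daily dates.
print(len(tally) > 365.25 * 30)

# The first three dates of a range starting on 2011-04-23.
print(dates_from_tally(date(2011, 4, 23), range(1, 4)))
```

In SQL the same mapping is a DATEADD of the tally number onto a start date, filtered to the numbers that fit inside the range.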
I started with Tony's SQL Fiddle but implemented a DateInformation table to be a little more generic. This could be something that you can reuse.
--Build Test Data, For production set the date
--range large enough to handle all cases.
CREATE TABLE DateInformation (
[Date] date,
WeekDayNumber int,
)
--From Tony
CREATE TABLE Sample_Data (
id int,
start_date date,
end_date date )
DECLARE @CurrentDate Date = '2010-12-27'
While @CurrentDate < '2014-12-31'
BEGIN
INSERT DateInformation VALUES (@CurrentDate,DatePart(dw,@CurrentDate))
SET @CurrentDate = DATEADD(DAY,1,@CurrentDate)
END
--From Tony
INSERT Sample_Data VALUES (1, '2011-04-25','2011-05-08')
INSERT Sample_Data VALUES (2, '2011-04-23','2011-05-27')
Here's the solution using CTE to join the sample data to the DateInformation table.
--Solution using CTE
with Week (WeekStart,WeekEnd) as
(
select d.Date
,dateadd(day,6,d.date) as WeekEnd
from DateInformation d
where d.WeekDayNumber = 2
)
select
s.ID
,case when s.Start_date > w.WeekStart then s.Start_Date
else w.WeekStart end as Start_Date
,case when s.End_Date < w.WeekEnd then s.End_Date
else w.WeekEnd end as End_Date
from Sample_Data s
join Week w on w.WeekStart > dateadd(day,-6,s.start_date)
and w.WeekEnd <= dateadd(day,6,s.end_date);
See solution in SQL Fiddle
set datefirst 1
GO
with cte as (
select ID, Start_date, End_date, Start_date as Week_Start_Date, (case when datepart(weekday, Start_date) = 7 then Start_Date else cast(null as datetime) end) as Week_End_Date, datepart(weekday, Start_date) as start_weekday, cast(0 as int) as week_id
from (
values (1, cast('25-Apr-2011' as datetime), cast('8-May-2011' as datetime)),
(2, cast('23-Apr-2011' as datetime), cast('27-May-2011' as datetime))
) t(ID, Start_date, End_date)
union all
select ID, Start_date, End_date, dateadd(day, 1, Week_Start_date) as Week_Start_Date, (case when start_weekday + 1 = 7 then dateadd(day, 1, Week_Start_date) else null end) as Week_End_date, (case when start_weekday = 7 then 1 else start_weekday + 1 end) as start_weekday, (case when start_weekday = 7 then week_id + 1 else week_id end) as week_id
from cte
where Week_Start_Date != End_date
)
select ID, min(Week_Start_Date), isnull(max(Week_End_Date), max(End_Date))
from cte
group by ID, Week_id
order by ID, 2
option (maxrecursion 0)
If you wanted to get the number of weeks, you could change the select after the cte to be:
select ID, Start_date, End_date, count(distinct week_id) as Number_Of_Weeks
from cte
group by ID, Start_date, End_date
option (maxrecursion 0)
Obviously to change the data used, the anchor (first part) of the cte where values() is being used would be changed.
This uses Monday as the first day of the week. To use a different day, change the set datefirst statement at the top: http://msdn.microsoft.com/en-gb/library/ms181598.aspx
Enjoy!
WITH D AS (
SELECT id
, start_date
, end_date
, start_date AS WEEK_START
, start_date + 7 - DATEPART(weekday,start_date) + 1
AS week_end
FROM DATA
), W AS (
SELECT id
, start_date
, end_date
, WEEK_START
, WEEK_END
FROM D
UNION ALL
SELECT id
, start_date
, end_date
, WEEK_END + 1 AS WEEK_START
, WEEK_END + 7 AS WEEK_END
FROM W
WHERE WEEK_END < END_DATE
)
SELECT ID
, WEEK_START AS START_DATE
, WEEK_END AS END_DATE
FROM W
ORDER BY 1, 2;
Here's a solution that uses the datepart function to account for the fact that your weeks start on Mondays:
with demo_normalized as
(
select id,
start_date,
(datepart(dw,start_date) + 5) % 7 as test,
dateadd(d,
0 - ((datepart(dw,start_date) + 5) % 7),
start_date
) as start_date_firstofweek,
dateadd(d,
6 - ((datepart(dw,start_date) + 5) % 7),
start_date
) as start_date_lastofweek,
end_date,
dateadd(d,
0 - ((datepart(dw,end_date) + 5) % 7),
end_date
) as end_date_firstofweek,
dateadd(d,
6 - ((datepart(dw,end_date) + 5) % 7),
end_date
) as end_date_lastofweek,
datediff(week,
dateadd(d,
0 - ((datepart(dw,start_date) + 5) % 7),
start_date
),
dateadd(d,
6 - ((datepart(dw,end_date) + 5) % 7),
end_date
)
) as no_of_weeks
from demo
),
demo_cte as
(
select
id,
dateadd(day,7,start_date_firstofweek) as start_date,
dateadd(day,7,start_date_lastofweek) as end_date,
end_date_firstofweek,
no_of_weeks
from demo_normalized
where no_of_weeks >= 3
UNION ALL select
id,
dateadd(day,7,start_date) as start_date,
dateadd(day,7,end_date) as end_date,
end_date_firstofweek,
no_of_weeks
from demo_cte
where
(dateadd(day,8,start_date) < end_date_firstofweek)
),
demo_union as
(
select id, start_date, end_date, no_of_weeks from demo_normalized where no_of_weeks = 1
union all
select id, start_date, start_date_lastofweek as end_date, no_of_weeks
from demo_normalized where no_of_weeks >= 2
union all
select id, start_date, end_date, no_of_weeks from demo_cte
union all
select id, end_date_firstofweek as start_date, end_date, no_of_weeks
from demo_normalized where no_of_weeks >= 2
)
select
d0.id,
d0.no_of_weeks,
convert(varchar, d0.start_date, 106) as start_date,
convert(varchar, d0.end_date, 106) as end_date
from demo_union d0
order by d0.id, d0.start_date
EDIT (added CTE for the weeks in between):
Here is the link to sqlfiddle.
Note: This solution requires no additional DDL - no additional entities have to be created and maintained. In short, it doesn't reinvent the Calendar.
All the calendar logic is contained in the query.
I have a bunch of queries that take data with a timestamp and spit out sums based on the last few weeks, months, and year to date. It looks like this:
Week1 Sum for most recent week
Week2 Sum for second most recent week
WeekN Sum for N most recent week
Jan-Dec Sum for January-December
YTD Sum for everything this year
This is how the query currently does this
SELECT TIME_PERIOD, INDEX, SUM(ITEM)
FROM (SELECT
INDEX ,
(CASE
WHEN ACTIVITY_DAY>=(TO_DATE( :end_day,
'yyyy-mm-dd' )-6)
AND ACTIVITY_DAY<=(TO_DATE( :end_day,
'yyyy-mm-dd' )-0) THEN 'WEEK1'
WHEN ACTIVITY_DAY>=(TO_DATE( :end_day,
'yyyy-mm-dd' )-13)
AND ACTIVITY_DAY<=(TO_DATE( :end_day,
'yyyy-mm-dd' )-7) THEN 'WEEK2'
ELSE NULL
END) AS TIME_PERIOD,
MAX(ITEMS) AS ITEM
FROM
SOURCE
GROUP BY
INDEX ,
DAY
UNION ALL
SELECT
INDEX ,
(CASE
WHEN ACTIVITY_DAY>=TO_DATE( :year||'-01-01',
'yyyy-mm-dd' )
AND ACTIVITY_DAY<=TO_DATE( :year||'-01-31',
'yyyy-mm-dd' ) THEN 'Jan'
WHEN ACTIVITY_DAY>=TO_DATE( :year||'-02-01',
'yyyy-mm-dd' )
AND ACTIVITY_DAY<TO_DATE( :year||'-03-01',
'yyyy-mm-dd' ) THEN 'Feb'
ELSE NULL
END) AS TIME_PERIOD ,
MAX(ITEMS) AS ITEM
FROM
SOURCE
GROUP BY
INDEX ,
DAY
UNION ALL
SELECT
INDEX ,
(CASE
WHEN ACTIVITY_DAY>=TO_DATE( :year||'-01-01',
'yyyy-mm-dd' )
AND ACTIVITY_DAY<=TO_DATE( :end_day,
'yyyy-mm-dd' ) THEN 'YTD'
ELSE NULL
END) AS TIME_PERIOD,
MAX(ITEMS) AS ITEM
FROM
SOURCE
GROUP BY
INDEX ,
DAY)
GROUP BY INDEX, TIME_PERIOD
Is there a better way in Oracle?
I think you are looking for something like this:
with data as
(
select sysdate - floor(dbms_random.value(1,400)) dt, floor(dbms_random.value(1,100)) val
from dual
connect by level <= 100
)
select
time_period,
sum(val) period_sum
from
(
select -- weeks
'Week'||(to_char(sysdate, 'WW') - to_char(dt, 'WW') + 1) time_period,
val,
(to_char(sysdate, 'WW') - to_char(dt, 'WW') + 1) ord
from data
where dt >= trunc(sysdate,'YY')
union all
select -- months
to_char(dt, 'Mon') time_period,
val,
100+to_char(dt,'MM') ord
from data
where dt >= trunc(sysdate,'YY')
union all
select -- year to date
'YTD' time_period,
val,
200
from data
where dt >= trunc(sysdate,'YY')
)
group by
time_period, ord
order by
ord;
Note that you won't need the WITH block; I was only using it to create some dummy data. The ord column might be unnecessary for you; I only used it to order the output logically.
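The bucketing itself can be checked outside Oracle. Below is a small sketch in Python of the labelling used in the question (rolling 7-day windows counted back from an end day, plus month and YTD buckets). The function names, the two-week default and the sample rows are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def period_labels(day, end_day, weeks=2):
    """Return every period label that `day` belongs to, relative to end_day."""
    labels = []
    age = (end_day - day).days
    # WEEK1 is the 7 days ending on end_day, WEEK2 the 7 days before that, etc.
    if 0 <= age < 7 * weeks:
        labels.append(f"WEEK{age // 7 + 1}")
    # Month and YTD buckets only cover the current year up to end_day.
    if day.year == end_day.year and day <= end_day:
        labels.append(MONTHS[day.month - 1])
        labels.append("YTD")
    return labels


def period_sums(rows, end_day):
    """Sum values per period label; one row can count toward several labels."""
    totals = defaultdict(int)
    for day, value in rows:
        for label in period_labels(day, end_day):
            totals[label] += value
    return dict(totals)


# Hypothetical rows: (ACTIVITY_DAY, ITEMS)
rows = [(date(2015, 3, 10), 4), (date(2015, 3, 3), 2), (date(2015, 1, 15), 1)]
sums = period_sums(rows, date(2015, 3, 10))
print(sums)
```

Each row is read once and tagged with all the periods it belongs to, instead of scanning the source once per kind of period.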