How can I group by an arbitrary time period with SQL

This is similar to, but not the same as, my previous question.
That one was about how to summarize log items per day.
I use this SQL:
SELECT
[DateLog] = CONVERT(DATE, LogDate),
[Sum] = COUNT(*)
FROM PerfRow
GROUP BY CONVERT(DATE, LogDate)
ORDER BY [DateLog];
Now I want to extend that to summarize over an arbitrary time period.
So instead of a sum per day, a sum per hour or per 5 minutes.
Is this possible?
I use SQL Server 2008 R2.

You can round LogDate using DATEADD and DATEPART and then group by that.
Example (groups by five second intervals):
SELECT
[DateLog] = DATEADD(ms,((DATEPART(ss, LogDate)/5)*5000)-(DATEPART(ss, LogDate)*1000)-DATEPART(ms, LogDate), LogDate),
[Sum] = COUNT(*)
FROM
(
SELECT LogDate = '2013-01-01 00:00:00' UNION ALL
SELECT LogDate = '2013-01-01 00:00:04' UNION ALL
SELECT LogDate = '2013-01-01 00:00:06' UNION ALL
SELECT LogDate = '2013-01-01 00:00:08' UNION ALL
SELECT LogDate = '2013-01-01 00:00:10'
) a
GROUP BY DATEADD(ms,((DATEPART(ss, LogDate)/5)*5000)-(DATEPART(ss, LogDate)*1000)-DATEPART(ms, LogDate), LogDate)
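The rounding trick generalizes to any interval width: subtract a fixed origin, integer-divide by the bucket width, and multiply back. The following is a minimal Python sketch of that arithmetic only, not of the T-SQL above; the function name floor_to_bucket and the 1970-01-01 origin are assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta


def floor_to_bucket(ts, width, origin=datetime(1970, 1, 1)):
    """Round ts down to the start of its bucket of size `width`.

    The origin is an arbitrary fixed point (assumed here); buckets are
    aligned to it.
    """
    # timedelta // timedelta gives the whole number of buckets since origin.
    return origin + (ts - origin) // width * width


# Same sample rows as the SQL example above.
log_dates = [
    datetime(2013, 1, 1, 0, 0, 0),
    datetime(2013, 1, 1, 0, 0, 4),
    datetime(2013, 1, 1, 0, 0, 6),
    datetime(2013, 1, 1, 0, 0, 8),
    datetime(2013, 1, 1, 0, 0, 10),
]

# Equivalent of GROUP BY <rounded LogDate> with COUNT(*).
counts = Counter(floor_to_bucket(t, timedelta(seconds=5)) for t in log_dates)

for bucket_start, n in sorted(counts.items()):
    print(bucket_start, n)
```

Passing timedelta(minutes=5) or timedelta(hours=1) as the width gives five-minute or hourly buckets with the same function.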

Related

How to merge SQL Select queries?

I have three queries executed consecutively:
SELECT TOP 1 max(value) FROM tableA
where site = 18
and (CAST(DATEADD(s,t_stamp/1000,'1970-01-01 00:00:00') as DATE) >= '2017-2-1'
and CAST(DATEADD(s,t_stamp/1000,'1970-01-01 00:00:00') as DATE) <= '2017-2-28')
Group by CAST(DATEADD(s,t_stamp/1000,'1970-01-01 00:00:00') as DATE)
order by CAST(DATEADD(s,t_stamp/1000,'1970-01-01 00:00:00') as DATE) DESC;
SELECT TOP 1 max(value) FROM tableA
where site = 3
and (CAST(DATEADD(s,stamp/1000,'1970-01-01 00:00:00') as DATE) >= '2017-2-1'
and CAST(DATEADD(s,stamp/1000,'1970-01-01 00:00:00') as DATE) <= '2017-2-28')
Group by CAST(DATEADD(s,stamp/1000,'1970-01-01 00:00:00') as DATE)
order by CAST(DATEADD(s,stamp/1000,'1970-01-01 00:00:00') as DATE) DESC;
SELECT TOP 1 max(value) FROM tableA
where site = 4
and (CAST(DATEADD(s,stamp/1000,'1970-01-01 00:00:00') as DATE) >= '2017-2-1'
and CAST(DATEADD(s,stamp/1000,'1970-01-01 00:00:00') as DATE) <= '2017-2-28')
Group by CAST(DATEADD(s,stamp/1000,'1970-01-01 00:00:00') as DATE)
order by CAST(DATEADD(s,stamp/1000,'1970-01-01 00:00:00') as DATE) DESC;
I want to combine these three queries into one and query sites 18, 3 and 4 with a single select, but I don't see how. Please advise how to merge these three queries into one.
Any help will be appreciated!
You seem to want the maximum value for each of three sites on the last day in February for which that site has data.
If so, this is simpler:
select site_id, max(value) as max_value
from (select t.*,
dense_rank() over (partition by site_id order by tstamp / (1000 * 24 * 60 * 60) desc) as seqnum
from t
-- cast to bigint before multiplying; int * 1000 would overflow here
where tstamp >= cast(datediff(second, '1970-01-01', '2020-02-01') as bigint) * 1000 and
tstamp < cast(datediff(second, '1970-01-01', '2020-02-29') as bigint) * 1000 and
site_id in (18, 3, 4)
) t
where seqnum = 1
group by site_id;
Actually, February in 2020 has 29 days. Perhaps you want the entire month; if so, then use '2020-03-01' for the second comparison.
Note that the manipulations on the date/time values are only on the "constant" side. This allows the query to use an index on tstamp if an appropriate index is available.
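To make the boundary arithmetic concrete, here is a small Python sketch of the same idea: convert the date constants to epoch milliseconds once, filter on the raw timestamp with a half-open range, and rank days by integer division. The sample rows and the helper names to_epoch_ms and day_number are invented for illustration, and the timestamps are assumed to be UTC.

```python
from datetime import datetime, timezone

MS_PER_DAY = 24 * 60 * 60 * 1000


def to_epoch_ms(year, month, day):
    """Milliseconds since 1970-01-01 for midnight UTC of the given date."""
    dt = datetime(year, month, day, tzinfo=timezone.utc)
    return int(dt.timestamp()) * 1000


def day_number(tstamp_ms):
    """Whole days since the epoch, like tstamp / (1000 * 24 * 60 * 60)."""
    return tstamp_ms // MS_PER_DAY


lower = to_epoch_ms(2020, 2, 1)
upper = to_epoch_ms(2020, 3, 1)  # exclusive bound covers all of February

# Hypothetical rows: (site_id, tstamp in ms, value).
rows = [
    (18, lower + 5 * MS_PER_DAY + 1000, 7.0),
    (18, lower + 27 * MS_PER_DAY + 2000, 3.0),
    (18, lower + 27 * MS_PER_DAY + 9000, 4.5),
    (3, lower + 10 * MS_PER_DAY, 1.0),
    (3, upper + 1, 99.0),  # falls in March, so it is filtered out
]

# Only the constants were converted; the column value is compared as-is.
in_range = [r for r in rows if lower <= r[1] < upper]

# Latest day with data per site (the rows where seqnum = 1).
latest_day = {}
for site, ts, _ in in_range:
    latest_day[site] = max(latest_day.get(site, -1), day_number(ts))

# Maximum value on that latest day per site.
result = {}
for site, ts, value in in_range:
    if day_number(ts) == latest_day[site]:
        result[site] = max(result.get(site, value), value)

print(result)
```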
You can use the analytic function row_number in your existing query as follows:
Select * from
(SELECT max(value) as max_value, site,
Row_number() over (partition by site order by CAST(DATEADD(s,stamp/1000,'1970-01-01 00:00:00') as DATE) desc) as rn FROM tableA
where site in (4,18,3)
and (CAST(DATEADD(s,stamp/1000,'1970-01-01 00:00:00') as DATE) >= '2017-2-1'
and CAST(DATEADD(s,stamp/1000,'1970-01-01 00:00:00') as DATE) <= '2017-2-28')
Group by CAST(DATEADD(s,stamp/1000,'1970-01-01 00:00:00') as DATE), site) t
Where rn = 1

How to filter database data in SQL Server per hour

If I have data like in the picture above, say one day of data with one row per minute, and I want to filter it down to one row per hour, I should end up with 24 rows. How can I do that?
I have tried some queries, like this one using GROUP BY, but the result is not what I want; the rows are not grouped.
SELECT myDatetime FROM datates
WHERE myDatetime >= '2020-03-01 05:30:00'
AND myDatetime < DATEADD(DAY,1,'2020-03-01 07:30:00')
group by myDatetime ,DATEPART(hour,myDatetime )
You can use row_number(). For instance, if you want the first value per hour:
select d.*
from (select d.*,
row_number() over (partition by convert(date, myDatetime), datepart(hour, myDatetime) order by mydatetime) as seqnum
from datates d
) d
where seqnum = 1;
If you want to sum the values, use aggregation:
select dateadd(hour, datepart(hour, myDatetime), convert(datetime, convert(date, myDatetime))),
sum(nilai)
from datates d
group by dateadd(hour, datepart(hour, myDatetime), convert(datetime, convert(date, myDatetime)));
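The two variants above (first row per hour versus a sum per hour) differ only in what is kept for each hour key. A rough Python sketch of both, using made-up per-minute readings; the column name nilai comes from the answer, everything else is assumed for illustration:

```python
from datetime import datetime, timedelta

# Hypothetical per-minute readings: (myDatetime, nilai), 05:30 to 08:29.
start = datetime(2020, 3, 1, 5, 30)
readings = [(start + timedelta(minutes=m), m) for m in range(180)]

first_per_hour = {}  # like row_number() ... where seqnum = 1
sum_per_hour = {}    # like sum(nilai) ... group by <hour start>

for ts, nilai in sorted(readings):
    # Truncate to the start of the hour; this is the grouping key.
    hour_start = ts.replace(minute=0, second=0, microsecond=0)
    # setdefault keeps only the earliest row seen for each hour.
    first_per_hour.setdefault(hour_start, (ts, nilai))
    sum_per_hour[hour_start] = sum_per_hour.get(hour_start, 0) + nilai

for hour_start in sorted(first_per_hour):
    print(hour_start, first_per_hour[hour_start], sum_per_hour[hour_start])
```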

SQL Show 0 Values when no values between date range

I have some problems with my SQL query.
I'm selecting some values within a specific date range.
For example, for 2018-05-01 to 2018-05-13 this is the output:
SUM CalendarWeek
8 18
5 19
If the user now selects a date range from 2018-04-01 to 2018-05-13, I want to show a 0 for the weeks that have no values.
For example:
SUM CalendarWeek
0 13
0 14
0 15
0 16
0 17
8 18
5 19
My Query:
SELECT SUM(Codes) AS 'Sum', CW FROM(
SELECT Count(*) AS 'Codes', DATEPART(wk, ScanDate) AS 'CW',
FROM [Table]
WHERE CONVERT(date, ScanDate, 102) >= '2018-01-01' AND CONVERT(date, ScanDate, 102) <= '2018-05-13'
GROUP BY ScanDate, DATEPART(wk, ScanDate)
UNION ALL
SELECT Count(*) AS 'Codes', DATEPART(wk, ScanDate) AS 'CW', ScanDate
FROM [Table_Archive]
WHERE CONVERT(date, ScanDate, 102) >= '2018-01-01' AND CONVERT(date, ScanDate, 102) <= '2018-05-13'
GROUP BY ScanDate, DATEPART(wk, ScanDate)) test
GROUP BY CW, ScanDate
ORDER BY CW ASC
any ideas how to solve this?
Thanks
First, it is best to maintain a calendar table for this kind of task. If you have one, use it; otherwise you need a recursive CTE:
declare @startdate date, @enddate date
set @startdate = '2018-04-01'
set @enddate = '2018-05-13';
with t as (
select DATEPART(ISO_WEEK, @startdate) as CalendarWeek
union all
select CalendarWeek+1
from t
where CalendarWeek < DATEPART(ISO_WEEK, @enddate)
)
-- [table] stands for the aggregated result of your existing query
select coalesce(t1.[Sum], 0) as [Sum], t.CalendarWeek
from t
left join [table] t1 on t1.CalendarWeek = t.CalendarWeek
SELECT ISNULL(SUM, 0), CalendarWeek
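The zero-filling idea is the same in any language: generate every week number in the range first, then look up each one in the aggregated result and default to 0. A minimal Python sketch, assuming ISO week numbering and a range that stays within one year (the same limitation as a simple week counter); the sums dictionary stands in for the output of the original query.

```python
from datetime import date


def iso_weeks_between(start, end):
    """All ISO week numbers from start's week to end's week, inclusive.

    Assumes both dates fall in the same ISO year.
    """
    first = start.isocalendar()[1]
    last = end.isocalendar()[1]
    return list(range(first, last + 1))


# Stand-in for the aggregated query result: CalendarWeek -> Sum.
sums = {18: 8, 19: 5}

weeks = iso_weeks_between(date(2018, 4, 1), date(2018, 5, 13))

# Left join against the generated weeks, replacing missing sums with 0.
filled = [(sums.get(week, 0), week) for week in weeks]

for total, week in filled:
    print(total, week)
```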
You can use IF in MySQL (note that a comparison with NULL needs IS NULL, not =):
Select IF(sum IS NULL, 0, sum) as sum from table_name
If this does not solve it, leave a comment and I will give the exact query.

SQL - How to find missing activity days using start_date and end_date

I have a few fields in a database that look like this:
trip_id
start_date
end_date
start_station_name
end_station_name
I need to write a query that shows all the stations with no activity on a particular day in the year 2015. I wrote the following query but it's not giving the right output:
select
start_station_name,
extract(date from start_date) as dt,
count(*)
from
trips_table
where
(
start_date >= timestamp('2015-01-01')
and
start_date < timestamp('2016-01-01')
)
group by
start_station_name,
dt
order by
count(*)
Can someone help come up with the right query? Thanks in advance!
Below is for BigQuery Standard SQL
It assumes start_date and end_date are of DATE type
It also assumes that every day between start_date and end_date counts as activity for the station in start_station_name. That is most likely not what is expected, but the question is missing details here, hence the assumption.
#standardSQL
WITH days AS (
SELECT day
FROM UNNEST(GENERATE_DATE_ARRAY('2015-01-01', '2015-12-31')) AS day
),
stations AS (
SELECT DISTINCT start_station_name AS station
FROM `trips_table`
)
SELECT s.*
FROM (SELECT * FROM stations CROSS JOIN days) AS s
LEFT JOIN (SELECT * FROM `trips_table`,
UNNEST(GENERATE_DATE_ARRAY(start_date, end_date)) AS day) AS a
ON s.day = a.day AND s.station = a.start_station_name
WHERE a.day IS NULL
You can test it with the simple dummy data below:
#standardSQL
WITH `trips_table` AS (
SELECT 1 AS trip_id, DATE '2015-01-01' AS start_date, DATE '2015-12-01' AS end_date, '111' AS start_station_name UNION ALL
SELECT 2, DATE '2015-12-10', DATE '2015-12-31', '111'
),
days AS (
SELECT day
FROM UNNEST(GENERATE_DATE_ARRAY('2015-01-01', '2015-12-31')) AS day
),
stations AS (
SELECT DISTINCT start_station_name AS station
FROM `trips_table`
)
SELECT s.*
FROM (SELECT * FROM stations CROSS JOIN days) AS s
LEFT JOIN (SELECT * FROM `trips_table`,
UNNEST(GENERATE_DATE_ARRAY(start_date, end_date)) AS day) AS a
ON s.day = a.day AND s.station = a.start_station_name
WHERE a.day IS NULL
ORDER BY station, day
the output is like below
station day
111 2015-12-02
111 2015-12-03
111 2015-12-04
111 2015-12-05
111 2015-12-06
111 2015-12-07
111 2015-12-08
111 2015-12-09
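The query is an anti-join: build every (station, day) pair, then keep the pairs with no matching activity. A short Python sketch of the same logic on the dummy data above; the helper date_range is a stand-in written for this example, and the same assumption applies that every day between start_date and end_date counts for the start station.

```python
from datetime import date, timedelta


def date_range(start, end):
    """Yield every date from start to end, inclusive."""
    day = start
    while day <= end:
        yield day
        day += timedelta(days=1)


# Same dummy rows as above: (trip_id, start_date, end_date, start_station_name).
trips = [
    (1, date(2015, 1, 1), date(2015, 12, 1), "111"),
    (2, date(2015, 12, 10), date(2015, 12, 31), "111"),
]

# Every (station, day) pair that has activity.
active = {
    (station, day)
    for _, start, end, station in trips
    for day in date_range(start, end)
}

stations = sorted({station for _, _, _, station in trips})

# Cross join stations with all days of 2015, keep pairs without activity.
missing = [
    (station, day)
    for station in stations
    for day in date_range(date(2015, 1, 1), date(2015, 12, 31))
    if (station, day) not in active
]

for station, day in missing:
    print(station, day)
```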
Use recursion for this purpose. Try this in SQL Server:
WITH sample AS (
SELECT CAST('2015-01-01' AS DATETIME) AS dt
UNION ALL
SELECT DATEADD(dd, 1, dt)
FROM sample s
WHERE DATEADD(dd, 1, dt) < CAST('2016-01-01' AS DATETIME)
)
SELECT * FROM sample
Where CAST(sample.dt as date) NOT IN (
SELECT CAST(start_date as date)
FROM tablename
WHERE start_date >= '2015-01-01 00:00:00'
AND start_date < '2016-01-01 00:00:00'
)
Option(maxrecursion 0)
If you want the station data with it then you can use left join as :
WITH sample AS (
SELECT CAST('2015-01-01' AS DATETIME) AS dt
UNION ALL
SELECT DATEADD(dd, 1, dt)
FROM sample s
WHERE DATEADD(dd, 1, dt) < CAST('2016-01-01' AS DATETIME)
)
SELECT * FROM sample
left join tablename
on CAST(sample.dt as date) = CAST(tablename.start_date as date)
where sample.dt>= '2015-01-01 00:00:00' and sample.dt< '2016-01-01 00:00:00'
Option(maxrecursion 0)
For MySQL, see the SQL Fiddle demo. I think this would help you.

Find the busiest period

I have a large table in MS SQL 2012 (40m records) containing call data. I would like to find the peak volume of calls, and the time that it occurred. If possible, I would also like to find the next 4 busiest periods.
I plan to use 3 columns:
CallID
DialTime
EndTime
The only way I can think to do this would be to do this:
Select '2013-07-01 00:00:01' as [Period], count([CallID]) as [Calls]
from [Table]
where DialTime <= '2013-07-01 00:00:01'
and EndTime >= '2013-07-01 00:00:01'
union
Select '2013-07-01 00:00:02' as [Period], count([CallID]) as [Calls]
from [Table]
where DialTime <= '2013-07-01 00:00:02'
and EndTime >= '2013-07-01 00:00:02'
union
etc
Can anyone suggest a better/more efficient way of doing this?
Try something like this. @time_begin and @time_end are parameters that define the interval of time for which you want to get the results.
with time_items (time_item) as
(
select @time_begin as time_item
union all
select dateadd(second,1,t.time_item) as time_item from time_items t where t.time_item<@time_end
)
select
time_items.time_item as [Period],
sum(case when [Table].DialTime<=time_items.time_item and [Table].EndTime>=time_items.time_item then 1 else 0 end) as [Calls]
from time_items
left outer join [Table] on 1=1
group by
time_items.time_item
order by
[Calls] desc
-- the default recursion limit is 100 levels, i.e. only 100 seconds
option (maxrecursion 0);
You can use VALUES as Table Source
SELECT DialTime, EndTime, o.Calls
FROM (VALUES ('20130701 00:00:01', '20130701 00:00:01'),
('20130701 00:00:02', '20130701 00:00:02'))x(DialTime, EndTime)
CROSS APPLY(
SELECT COUNT(CallID) AS Calls
FROM [Table] t
WHERE DialTime <= x.DialTime
AND EndTime >= x.EndTime
) o
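A different technique from the answers above, shown here only as an illustration, is a sweep over start and end events: sort all DialTime and EndTime values and keep a running count, so each call is touched twice instead of being tested against every second. A minimal Python sketch with invented sample calls; the function name peak_concurrency is hypothetical.

```python
from datetime import datetime


def peak_concurrency(calls):
    """Return (peak number of simultaneous calls, time the peak was first reached).

    calls is a list of (dial_time, end_time) pairs. A call counts as active
    at both its dial time and its end time, matching
    DialTime <= t AND EndTime >= t.
    """
    events = []
    for dial, end in calls:
        events.append((dial, 0, +1))  # 0 sorts starts before ends at the same instant
        events.append((end, 1, -1))
    events.sort()

    current = peak = 0
    peak_time = None
    for when, _, delta in events:
        current += delta
        if current > peak:
            peak, peak_time = current, when
    return peak, peak_time


# Invented sample calls on 2013-07-01.
calls = [
    (datetime(2013, 7, 1, 0, 0, 1), datetime(2013, 7, 1, 0, 0, 5)),
    (datetime(2013, 7, 1, 0, 0, 3), datetime(2013, 7, 1, 0, 0, 4)),
    (datetime(2013, 7, 1, 0, 0, 4), datetime(2013, 7, 1, 0, 0, 10)),
    (datetime(2013, 7, 1, 0, 0, 20), datetime(2013, 7, 1, 0, 0, 30)),
]

peak, peak_time = peak_concurrency(calls)
print(peak, peak_time)
```

The sketch reports only the single highest peak; finding the next busiest periods would need the running count recorded per event rather than just its maximum.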