Find the busiest period - sql

I have a large table in MS SQL 2012 (40m records) containing call data. I would like to find the peak volume of calls, and the time that it occurred. If possible, I would also like to find the next 4 busiest periods.
I plan to use 3 columns:
CallID
DialTime
EndTime
The only way I can think of is to run one query per second and combine them with UNION:
Select '2013-07-01 00:00:01' as [Period], count([CallID]) as [Calls]
from [Table]
where DialTime <= '2013-07-01 00:00:01'
and EndTime >= '2013-07-01 00:00:01'
union
Select '2013-07-01 00:00:02' as [Period], count([CallID]) as [Calls]
from [Table]
where DialTime <= '2013-07-01 00:00:02'
and EndTime >= '2013-07-01 00:00:02'
union
etc
Can anyone suggest a better/more efficient way of doing this?

Try something like this. @time_begin and @time_end are parameters that define the interval for which you want results.
with time_items (time_item) as
(
select @time_begin as time_item
union all
select dateadd(second,1,t.time_item) as time_item from time_items t where t.time_item<@time_end
)
select
time_items.time_item as [Period],
sum(case when [Table].DialTime<=time_items.time_item and [Table].EndTime>=time_items.time_item then 1 else 0 end) as [Calls]
from time_items
left outer join [Table] on 1=1
group by
time_items.time_item
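order by
[Calls] desc
option (maxrecursion 0); -- the default limit of 100 recursion levels would fail for intervals longer than 100 seconds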

You can use VALUES as Table Source
SELECT DialTime, EndTime, o.Calls
FROM (VALUES ('20130701 00:00:01', '20130701 00:00:01'),
('20130701 00:00:02', '20130701 00:00:02'))x(DialTime, EndTime)
CROSS APPLY(
SELECT COUNT(CallID) AS Calls
FROM [Table] t
WHERE DialTime <= x.DialTime
AND EndTime >= x.EndTime
) o
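If probing one second at a time is too slow on 40 million rows, another common approach is a sweep over events: each call contributes +1 at DialTime and -1 at EndTime, and a running total of those deltas gives the number of calls in progress at every change point. Below is a minimal sketch of that idea, run against SQLite through Python's standard sqlite3 module so it is self-contained. The table name calls and the sample rows are made up for illustration, and window functions need SQLite 3.25 or later. One assumption differs from the queries in the question: a call is treated as finished at its EndTime, so it is no longer counted at that exact second.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the real call table.
conn.execute("CREATE TABLE calls (CallID INTEGER, DialTime TEXT, EndTime TEXT)")
conn.executemany(
    "INSERT INTO calls VALUES (?, ?, ?)",
    [
        (1, "2013-07-01 00:00:01", "2013-07-01 00:00:10"),
        (2, "2013-07-01 00:00:05", "2013-07-01 00:00:07"),
        (3, "2013-07-01 00:00:06", "2013-07-01 00:00:20"),
        (4, "2013-07-01 00:00:15", "2013-07-01 00:00:16"),
    ],
)

query = """
WITH events AS (
    -- +1 when a call starts, -1 when it ends
    SELECT DialTime AS ts, 1 AS delta FROM calls
    UNION ALL
    SELECT EndTime AS ts, -1 AS delta FROM calls
),
per_ts AS (
    -- collapse simultaneous events into one net change per timestamp
    SELECT ts, SUM(delta) AS net FROM events GROUP BY ts
)
SELECT ts, SUM(net) OVER (ORDER BY ts) AS active
FROM per_ts
ORDER BY active DESC, ts
LIMIT 5
"""
rows = conn.execute(query).fetchall()
for ts, active in rows:
    print(ts, active)
```

The rows returned are the five change points with the highest concurrency, not five disjoint periods, so neighbouring timestamps may describe the same busy stretch. SQL Server 2012 also supports SUM(...) OVER (ORDER BY ...), so the query should carry over with TOP 5 in place of LIMIT 5, although that is untested here.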

How to merge SQL Select queries?

I have three queries executed consistently:
SELECT TOP 1 max(value) FROM tableA
where site = 18
and (CAST(DATEADD(s,t_stamp/1000,'1970-01-01 00:00:00') as DATE) >= '2017-2-1'
and CAST(DATEADD(s,t_stamp/1000,'1970-01-01 00:00:00') as DATE) <= '2017-2-28')
Group by CAST(DATEADD(s,t_stamp/1000,'1970-01-01 00:00:00') as DATE)
order by CAST(DATEADD(s,t_stamp/1000,'1970-01-01 00:00:00') as DATE) DESC;
SELECT TOP 1 max(value) FROM tableA
where site = 3
and (CAST(DATEADD(s,stamp/1000,'1970-01-01 00:00:00') as DATE) >= '2017-2-1'
and CAST(DATEADD(s,stamp/1000,'1970-01-01 00:00:00') as DATE) <= '2017-2-28')
Group by CAST(DATEADD(s,stamp/1000,'1970-01-01 00:00:00') as DATE)
order by CAST(DATEADD(s,stamp/1000,'1970-01-01 00:00:00') as DATE) DESC;
SELECT TOP 1 max(value) FROM tableA
where site = 4
and (CAST(DATEADD(s,stamp/1000,'1970-01-01 00:00:00') as DATE) >= '2017-2-1'
and CAST(DATEADD(s,stamp/1000,'1970-01-01 00:00:00') as DATE) <= '2017-2-28')
Group by CAST(DATEADD(s,stamp/1000,'1970-01-01 00:00:00') as DATE)
order by CAST(DATEADD(s,stamp/1000,'1970-01-01 00:00:00') as DATE) DESC;
I want to combine these three queries into one and query sites 18, 3 and 4 with a single SELECT, but I don't see how. Please advise how to merge them.
Any help will be appreciated!
You seem to want the maximum value for three different sites on the last day in February that has their data.
If so, this is simpler:
select site_id, max(value)
from (select t.*,
dense_rank() over (partition by site_id order by tstamp / (1000 * 24 * 60 * 60) desc) as seqnum
from t
where tstamp >= cast(datediff(second, '1970-01-01', '2020-02-01') as bigint) * 1000 and
tstamp < cast(datediff(second, '1970-01-01', '2020-02-29') as bigint) * 1000 and
site_id in (18, 3, 4)
) t
where seqnum = 1
group by site_id;
Actually, February in 2020 has 29 days. Perhaps you want the entire month; if so, then use '2020-03-01' for the second comparison.
Note that the manipulations on the date/time values are only on the "constant" side. This allows the query to use an index on tstamp if an appropriate index is available.
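To make the index remark concrete, here is a small sketch that compares query plans for the two styles of filter. It uses SQLite through Python's sqlite3 module as a stand-in, so the plan text is SQLite's and not SQL Server's, but the principle is the same in both. The table t, its columns and the index name idx_t_tstamp are assumptions for illustration only.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
# Hypothetical table: tstamp holds milliseconds since 1970-01-01 UTC.
conn.execute("CREATE TABLE t (site_id INTEGER, tstamp INTEGER, value REAL)")
conn.execute("CREATE INDEX idx_t_tstamp ON t (tstamp)")

# Convert the date bounds once, on the "constant" side.
lo = int(datetime(2017, 2, 1, tzinfo=timezone.utc).timestamp()) * 1000
hi = int(datetime(2017, 3, 1, tzinfo=timezone.utc).timestamp()) * 1000

def plan(sql, params=()):
    """Return the detail column of EXPLAIN QUERY PLAN as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)

# The column is compared as stored, so the index can be used for a range seek.
sargable = plan("SELECT value FROM t WHERE tstamp >= ? AND tstamp < ?", (lo, hi))

# The column is wrapped in a function, so every row has to be evaluated.
wrapped = plan(
    "SELECT value FROM t "
    "WHERE date(tstamp / 1000, 'unixepoch') >= '2017-02-01' "
    "AND date(tstamp / 1000, 'unixepoch') < '2017-03-01'"
)

print(sargable)
print(wrapped)
```

The first plan should report a search using idx_t_tstamp, and the second a scan of the table.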
You can use the analytic function row_number() in your existing query as follows:
Select * from
(SELECT max(value) as max_value, site,
Row_number() over (partition by site order by CAST(DATEADD(s,stamp/1000,'1970-01-01 00:00:00') as DATE) desc) as rn FROM tableA
where site in (4,18,3)
and (CAST(DATEADD(s,stamp/1000,'1970-01-01 00:00:00') as DATE) >= '2017-2-1'
and CAST(DATEADD(s,stamp/1000,'1970-01-01 00:00:00') as DATE) <= '2017-2-28')
Group by CAST(DATEADD(s,stamp/1000,'1970-01-01 00:00:00') as DATE), site) t
Where rn = 1

How to join two tables with different condition

I'm wondering how I can combine these two queries:
SELECT SUM(TOTAL_SALES) FROM TBL_SALES WHERE LOGIN_DATE >= '2020-01-03' AND LOGOUT_DATE < '2020-01-04'
SELECT SUM(TOTAL_MONEY) FROM TBL_SALES_INFO WHERE LOGIN_DATE >= '2020-01-03' AND LOGOUT_DATE < '2020-01-04'
Here's my attempt, but it returns the wrong result:
SELECT SUM(A.TOTAL_SALES) AS TOTAL_SALES,
SUM(B.TOTAL_MONEY) AS TOTAL_MONEY
FROM TBL_SALES A RIGHT JOIN
TBL_SALES_INFO B
ON A.SALES_NO= B.SALES_NO
WHERE A.LOGIN_DATE>= '2020-01-01 09:00:00' AND A.LOGOUT_DATE < '2020-01-02 09:00:00' AND
B.LOGIN_DATE>= '2020-01-01 09:00:00' AND B.LOGOUT_DATE < '2020-01-02 09:00:00'
Your two queries run against entirely separate tables. The best I can suggest is to leave them as they are and combine them as scalar subqueries in a top-level SELECT:
SELECT
(SELECT SUM(TOTAL_SALES) FROM TBL_SALES
WHERE LOGIN_DATE >= '2020-01-03' AND LOGOUT_DATE < '2020-01-04') AS TOTAL_SALES,
(SELECT SUM(TOTAL_MONEY) FROM TBL_SALES_INFO
WHERE LOGIN_DATE >= '2020-01-03' AND LOGOUT_DATE < '2020-01-04') AS TOTAL_MONEY;
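One common reason a join gives wrong totals here is row multiplication: if a SALES_NO appears more than once in TBL_SALES_INFO, the matching TBL_SALES row is repeated and its TOTAL_SALES is summed several times. Whether that is what happens with your data is a guess. The sketch below reproduces the effect with made-up rows in SQLite via Python's sqlite3 module; the date columns are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Made-up data: sale 1 has two detail rows in TBL_SALES_INFO.
conn.executescript("""
CREATE TABLE TBL_SALES (SALES_NO INTEGER, TOTAL_SALES INTEGER);
CREATE TABLE TBL_SALES_INFO (SALES_NO INTEGER, TOTAL_MONEY INTEGER);
INSERT INTO TBL_SALES VALUES (1, 100), (2, 50);
INSERT INTO TBL_SALES_INFO VALUES (1, 60), (1, 40), (2, 50);
""")

# Joining first repeats sale 1 once per detail row, inflating TOTAL_SALES.
joined = conn.execute("""
    SELECT SUM(A.TOTAL_SALES), SUM(B.TOTAL_MONEY)
    FROM TBL_SALES A
    JOIN TBL_SALES_INFO B ON A.SALES_NO = B.SALES_NO
""").fetchone()

# Aggregating each table on its own avoids the duplication.
separate = conn.execute("""
    SELECT (SELECT SUM(TOTAL_SALES) FROM TBL_SALES),
           (SELECT SUM(TOTAL_MONEY) FROM TBL_SALES_INFO)
""").fetchone()

print("joined:  ", joined)
print("separate:", separate)
```

With these rows the joined query counts the 100 from sale 1 twice, while the scalar subqueries return the true total for each table.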
You might be over-complicating things with the A. and B. aliases. Also, an inner join may work better than a right join. Then you wouldn't need to filter on the timestamps of both tables; filtering on one table's login and logout times would be enough.
It would help if you posted the results you were getting, even if they were incorrect.

converting time in sql

I am trying to count rows where the gap between DT and TME is between 15 minutes and 1 hour. The SQL below is what I came up with using TIMESTAMPDIFF, but I am getting the error "'TIMESTAMPDIFF' is not a recognized built-in function name".
My SQL
SELECT Name, count(*)
FROM [test.database]
where TME between '2018-10-01 00:00:00.000' and '2019-01-31 00:00:00.999'
and TIMESTAMPDIFF(SECOND, date_trunc('SECOND', DT), date_trunc('SECOND', TME)) >= 900
and TIMESTAMPDIFF(SECOND, date_trunc('SECOND', DT), date_trunc('SECOND', TME)) < 3600
group by Name
order by Name
Could someone help me make my SQL work, please?
Thanks
Presumably, you are using SQL Server and want datediff(). However, I would use dateadd().
The initial date comparisons look quite cumbersome. I don't see why you would want the ending to be up to one second into Jan 31. So I'm guessing you want something like this:
select Name, count(*)
from [test.database]
where TME >= '2018-10-01' and
TME < '2019-01-31' and
TME >= dateadd(SECOND, 900, DT) and
TME < dateadd(SECOND, 3600, DT)
group by Name
order by Name
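To check the boundaries of a filter like this, it helps to run it on a few rows that sit right at the edges. The sketch below does that in SQLite through Python's sqlite3 module, where datetime(DT, '+900 seconds') plays the role of dateadd(SECOND, 900, DT). The table name events and the four rows are invented for the test.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Invented rows whose gap between DT and TME sits on each boundary.
conn.execute("CREATE TABLE events (Name TEXT, DT TEXT, TME TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        ("a", "2018-10-01 10:00:00", "2018-10-01 10:14:59"),  # 899 s: too short
        ("b", "2018-10-01 10:00:00", "2018-10-01 10:15:00"),  # 900 s: kept
        ("c", "2018-10-01 10:00:00", "2018-10-01 10:59:59"),  # 3599 s: kept
        ("d", "2018-10-01 10:00:00", "2018-10-01 11:00:00"),  # 3600 s: too long
    ],
)

# Shift DT instead of computing a difference, as in the answer above.
kept = [
    row[0]
    for row in conn.execute("""
        SELECT Name
        FROM events
        WHERE TME >= datetime(DT, '+900 seconds')
          AND TME <  datetime(DT, '+3600 seconds')
        ORDER BY Name
    """)
]
print(kept)
```

Only the rows with a gap of at least 900 seconds and less than 3600 seconds survive, which matches the >= 900 and < 3600 conditions in the question.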
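Alternatively, keep your original structure and replace TIMESTAMPDIFF with datediff():
SELECT Name, count(*)
FROM [test.database]
where (TME between '2018-10-01 00:00:00.000' and '2019-01-31 00:00:00.999')
and (datediff(SECOND, DT, TME) >= 900 and datediff(SECOND, DT, TME) < 3600)
group by Name
order by Name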

SQL - How to find missing activity days using start_date and end_date

I have a few fields in a database that look like this:
trip_id
start_date
end_date
start_station_name
end_station_name
I need to write a query that shows all the stations with no activity on a particular day in the year 2015. I wrote the following query but it's not giving the right output:
select
start_station_name,
extract(date from start_date) as dt,
count(*)
from
trips_table
where
(
start_date >= timestamp('2015-01-01')
and
start_date < timestamp('2016-01-01')
)
group by
start_station_name,
dt
order by
count(*)
Can someone help come up with the right query? Thanks in advance!
Below is for BigQuery Standard SQL.
It assumes start_date and end_date are of DATE type.
It also assumes that every day between start_date and end_date counts as activity for the station in start_station_name. That is most likely not what you expect, but the question does not give enough detail to do better.
#standardSQL
WITH days AS (
SELECT day
FROM UNNEST(GENERATE_DATE_ARRAY('2015-01-01', '2015-12-31')) AS day
),
stations AS (
SELECT DISTINCT start_station_name AS station
FROM `trips_table`
)
SELECT s.*
FROM (SELECT * FROM stations CROSS JOIN days) AS s
LEFT JOIN (SELECT * FROM `trips_table`,
UNNEST(GENERATE_DATE_ARRAY(start_date, end_date)) AS day) AS a
ON s.day = a.day AND s.station = a.start_station_name
WHERE a.day IS NULL
You can test it with the dummy data below:
#standardSQL
WITH `trips_table` AS (
SELECT 1 AS trip_id, DATE '2015-01-01' AS start_date, DATE '2015-12-01' AS end_date, '111' AS start_station_name UNION ALL
SELECT 2, DATE '2015-12-10', DATE '2015-12-31', '111'
),
days AS (
SELECT day
FROM UNNEST(GENERATE_DATE_ARRAY('2015-01-01', '2015-12-31')) AS day
),
stations AS (
SELECT DISTINCT start_station_name AS station
FROM `trips_table`
)
SELECT s.*
FROM (SELECT * FROM stations CROSS JOIN days) AS s
LEFT JOIN (SELECT * FROM `trips_table`,
UNNEST(GENERATE_DATE_ARRAY(start_date, end_date)) AS day) AS a
ON s.day = a.day AND s.station = a.start_station_name
WHERE a.day IS NULL
ORDER BY station, day
The output looks like this:
station day
111 2015-12-02
111 2015-12-03
111 2015-12-04
111 2015-12-05
111 2015-12-06
111 2015-12-07
111 2015-12-08
111 2015-12-09
You can use a recursive CTE for this. For SQL Server, try:
WITH sample AS (
SELECT CAST('2015-01-01' AS DATETIME) AS dt
UNION ALL
SELECT DATEADD(dd, 1, dt)
FROM sample s
WHERE DATEADD(dd, 1, dt) < CAST('2016-01-01' AS DATETIME)
)
SELECT * FROM sample
Where CAST(sample.dt as date) NOT IN (
SELECT CAST(start_date as date)
FROM tablename
WHERE start_date >= '2015-01-01 00:00:00'
AND start_date < '2016-01-01 00:00:00'
)
Option(maxrecursion 0)
If you want the station data as well, you can use a left join:
WITH sample AS (
SELECT CAST('2015-01-01' AS DATETIME) AS dt
UNION ALL
SELECT DATEADD(dd, 1, dt)
FROM sample s
WHERE DATEADD(dd, 1, dt) < CAST('2016-01-01' AS DATETIME)
)
SELECT * FROM sample
left join tablename
on CAST(sample.dt as date) = CAST(tablename.start_date as date)
where sample.dt>= '2015-01-01 00:00:00' and sample.dt< '2016-01-01 00:00:00'
Option(maxrecursion 0)
For mysql, see this fiddle. I think this would help you....
SQL Fiddle Demo

How can I group by arbitrary time period with SQL

This is similar but not equal to my previous question
That was about how to summarize log-items per day.
I use this SQL.
SELECT
[DateLog] = CONVERT(DATE, LogDate),
[Sum] = COUNT(*)
FROM PerfRow
GROUP BY CONVERT(DATE, LogDate)
ORDER BY [DateLog];
Now I want to extend that to summarize over an arbitrary time period.
So instead of a sum per day, a sum per hour or per 5 minutes.
Is this possible?
I use SQL Server 2008 R2
You can round LogDate using DATEADD and DATEPART and then group by that.
Example (groups by five second intervals):
SELECT
[DateLog] = DATEADD(ms,((DATEPART(ss, LogDate)/5)*5000)-(DATEPART(ss, LogDate)*1000)-DATEPART(ms, LogDate), LogDate),
[Sum] = COUNT(*)
FROM
(
SELECT LogDate = '2013-01-01 00:00:00' UNION ALL
SELECT LogDate = '2013-01-01 00:00:04' UNION ALL
SELECT LogDate = '2013-01-01 00:00:06' UNION ALL
SELECT LogDate = '2013-01-01 00:00:08' UNION ALL
SELECT LogDate = '2013-01-01 00:00:10'
) a
GROUP BY DATEADD(ms,((DATEPART(ss, LogDate)/5)*5000)-(DATEPART(ss, LogDate)*1000)-DATEPART(ms, LogDate), LogDate)
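The same rounding idea extends to the 5-minute buckets asked about: turn the timestamp into a count of seconds, integer-divide by the bucket size, and multiply back. Below is a minimal sketch in SQLite via Python's sqlite3 module, with the bucket size as a parameter; the sample rows are invented. In SQL Server the analogous expression is commonly written as DATEADD(minute, (DATEDIFF(minute, 0, LogDate) / 5) * 5, 0), which is not exercised here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PerfRow (LogDate TEXT)")
conn.executemany(
    "INSERT INTO PerfRow VALUES (?)",
    [
        ("2013-01-01 00:00:00",),
        ("2013-01-01 00:04:59",),
        ("2013-01-01 00:05:00",),
        ("2013-01-01 00:07:30",),
        ("2013-01-01 00:10:00",),
    ],
)

bucket_seconds = 300  # 5 minutes; use 3600 for hourly buckets

# Integer division drops the remainder, so every timestamp falls back
# to the start of its bucket before grouping.
rows = conn.execute(
    """
    SELECT datetime((CAST(strftime('%s', LogDate) AS INTEGER) / ?) * ?,
                    'unixepoch') AS bucket,
           COUNT(*) AS n
    FROM PerfRow
    GROUP BY bucket
    ORDER BY bucket
    """,
    (bucket_seconds, bucket_seconds),
).fetchall()

for bucket, n in rows:
    print(bucket, n)
```

Buckets with no rows do not appear in the result; if you need them as zeros, join against a generated list of bucket start times.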