Count the number of minutes with datediff and substring - sql

I have data like this
availabilities
[{"starts_at":"09:00","ends_at":"17:00"}]
I have the query below and it works:
select COALESCE(availabilities,'Total') as availabilities,
SUM(DATEDIFF(minute,start_at,end_at)) as 'Total Available Hours in Minutes'
from (
select cast(availabilities as NVARCHAR) as availabilities,
cast(SUBSTRING(availabilities,16,5) as time) as start_at,
cast(SUBSTRING(availabilities,34,5) as time) as end_at
from alfy.dbo.daily_availabilities
)x
GROUP by ROLLUP(availabilities);
Result
availabilities Total Available Hours in Minutes
[{"starts_at":"09:00","ends_at":"17:00"}] 480
What if the data looks like this instead?
availabilities
[{"starts_at":"10:00","ends_at":"13:30"},{"starts_at":"14:00","ends_at":"18:00"}]
[{"starts_at":"09:00","ends_at":"12:30"},{"starts_at":"13:00","ends_at":"15:30"},{"starts_at":"16:00","ends_at":"18:00"}]
How to count the number of minutes over two or more time ranges?

Since you have JSON data use OPENJSON (Transact-SQL) to parse it, e.g.:
create table dbo.daily_availabilities (
id int,
availabilities nvarchar(max) --JSON
);
insert dbo.daily_availabilities (id, availabilities) values
(1, N'[{"starts_at":"09:00","ends_at":"17:00"}]'),
(2, N'[{"starts_at":"10:00","ends_at":"13:30"},{"starts_at":"14:00","ends_at":"18:00"}]'),
(3, N'[{"starts_at":"09:00","ends_at":"12:30"},{"starts_at":"13:00","ends_at":"15:30"},{"starts_at":"16:00","ends_at":"18:00"}]');
select id, sum(datediff(mi, starts_at, ends_at)) as total_minutes
from dbo.daily_availabilities
cross apply openjson(availabilities) with (
starts_at time,
ends_at time
) av
group by id
id total_minutes
1 480
2 450
3 480
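As a cross-check of the totals above, the same per-row sum can be reproduced outside the database. This is only a minimal Python sketch, not part of the answer: the function name total_minutes is made up for illustration, and it assumes every range starts and ends on the same day.

```python
import json
from datetime import datetime

def total_minutes(availabilities: str) -> int:
    """Sum the minutes covered by every starts_at/ends_at pair in a JSON array."""
    total = 0
    for slot in json.loads(availabilities):
        start = datetime.strptime(slot["starts_at"], "%H:%M")
        end = datetime.strptime(slot["ends_at"], "%H:%M")
        total += int((end - start).total_seconds() // 60)
    return total

# Same three rows as the sample table in the answer.
rows = {
    1: '[{"starts_at":"09:00","ends_at":"17:00"}]',
    2: '[{"starts_at":"10:00","ends_at":"13:30"},{"starts_at":"14:00","ends_at":"18:00"}]',
    3: '[{"starts_at":"09:00","ends_at":"12:30"},{"starts_at":"13:00","ends_at":"15:30"},'
       '{"starts_at":"16:00","ends_at":"18:00"}]',
}
totals = {row_id: total_minutes(doc) for row_id, doc in rows.items()}
print(totals)  # {1: 480, 2: 450, 3: 480}
```

Like the OPENJSON query, this parses the JSON instead of relying on fixed character positions, so it keeps working when a row holds more than one range.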

Divide monthly spend into daily spend in BigQuery

I have monthly data in BigQuery in the following form:
CREATE TABLE if not EXISTS spend (
id int,
created_at DATE,
value float
);
INSERT INTO spend VALUES
(1, '2020-01-01', 100),
(2, '2020-02-01', 200),
(3, '2020-03-01', 100),
(4, '2020-04-01', 100),
(5, '2020-05-01', 50);
I would like a query to translate it into daily data in the following way:
One row per day.
The value of each day should be the monthly value divided by the number of days of the month.
What's the simplest way of doing this in BigQuery?
You can make use of GENERATE_DATE_ARRAY() in order to get an array between the desired dates (in your case, between 2020-01-01 and 2020-05-31) and create a calendar table, and then divide the value of a given month among the days in the month :)
Try this and let me know if it worked:
with calendar_table as (
select
calendar_date
from
unnest(generate_date_array('2020-01-01', '2020-05-31', interval 1 day)) as calendar_date
),
final as (
select
ct.calendar_date,
s.value,
s.value / extract(day from last_day(ct.calendar_date)) as daily_value
from
spend as s
cross join
calendar_table as ct
where
format_date('%Y-%m', date(ct.calendar_date)) = format_date('%Y-%m', date(s.created_at))
)
select * from final
My recommendation is to do this "locally". That is, run generate_date_array() for each row in the original table. This is much faster than a join across rows. BigQuery also makes this easy with the last_day() function:
select t.id, calendar_date,
t.value / extract(day from last_day(t.created_at)) as daily_value
from `table` t cross join
unnest(generate_date_array(t.created_at,
last_day(t.created_at, month)
)
) as calendar_date;
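To make the arithmetic concrete, here is a rough Python sketch of the same "expand each monthly row locally" idea. It is not BigQuery code; the helper name spread_month is hypothetical, and it assumes, as in the sample data, that created_at is always the first day of the month.

```python
import calendar
from datetime import date, timedelta

def spread_month(month_start: date, value: float):
    """Yield (day, daily_value) for every day in month_start's month."""
    days_in_month = calendar.monthrange(month_start.year, month_start.month)[1]
    for offset in range(days_in_month):
        yield month_start + timedelta(days=offset), value / days_in_month

# Sample rows from the question: (created_at, value).
spend = [
    (date(2020, 1, 1), 100.0),
    (date(2020, 2, 1), 200.0),
    (date(2020, 3, 1), 100.0),
    (date(2020, 4, 1), 100.0),
    (date(2020, 5, 1), 50.0),
]
daily = [row for month_start, value in spend for row in spread_month(month_start, value)]
print(len(daily))  # 152 rows: 31 + 29 + 31 + 30 + 31
```

February 2020 has 29 days, so each of its rows carries 200 / 29, and the daily values still add up to the original monthly totals.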

SQLite group datetime by 168 hours (7 days) gives NULL back

I have an Orders table in my SQLite database. What I want to do is group the data by every 168 hours (7 days), and count the total Orders per 168 hours.
What I did was create an in memory "calendar table" and I joined my Orders table to that calendar set.
This works fine when I group by 12, 24, 48 or even 120 hours (5 days). But for some reason it doesn't work when I group by 168 hours (7 days). I get NULL values back instead of what count() should really return.
The following sql code is an example that groups by every 120 hours (5 days).
CREATE TABLE IF NOT EXISTS Orders (
Id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
Key TEXT,
Timestamp TEXT NOT NULL
);
INSERT INTO Orders (Key, Timestamp) VALUES ('k1', '2019-10-01 10:00:23');
INSERT INTO Orders (Key, Timestamp) VALUES ('k2', '2019-10-01 15:45:19');
INSERT INTO Orders (Key, Timestamp) VALUES ('k3', '2019-10-02 17:05:19');
INSERT INTO Orders (Key, Timestamp) VALUES ('k4', '2019-10-03 20:12:19');
INSERT INTO Orders (Key, Timestamp) VALUES ('k5', '2019-10-04 08:49:19');
INSERT INTO Orders (Key, Timestamp) VALUES ('k6', '2019-10-05 11:24:19');
INSERT INTO Orders (Key, Timestamp) VALUES ('k7', '2019-10-07 11:24:19');
WITH RECURSIVE dates(date1) AS (
VALUES('2019-10-01 00:00:00')
UNION ALL
SELECT datetime(date1, '+120 hours')
FROM dates
WHERE date1 <= '2019-10-29 00:00:00'
)
SELECT date1 as __ddd, d2.* FROM dates AS d1
LEFT JOIN (
SELECT count(Key) AS OrderKey,
datetime((strftime('%s', timestamp) / 432000) * 432000, 'unixepoch') as __interval
FROM `Orders`
WHERE `Timestamp` >= '2019-09-29T00:00:00.000'
GROUP BY __interval LIMIT 10
) d2 ON d1.date1 = d2.__interval
Important note:
If you want to update this code to test it with 168 hours (7 days), then you should do the following:
Change +120 hours to +168 hours
Change 432000 (432000 == 120 hours) to 604800 (604800 == 168 hours)
note that this number occurs twice, both should be replaced
Does anyone have any idea why it stops working properly when I change the SQL code to 168 hours?
Your problem is that when you change to a 7-day interval, the values in your dates CTE don't align with the intervals generated from your Orders table. You can fix that by making the dates CTE start on a similarly aligned date:
WITH RECURSIVE dates(date1) AS (
SELECT datetime((strftime('%s', '2019-10-01 00:00:00') / 604800) * 604800, 'unixepoch')
UNION ALL
SELECT datetime(date1, '+168 hours')
FROM dates
WHERE date1 <= '2019-10-29 00:00:00'
)
Output:
__ddd OrderKey __interval
2019-09-26 00:00:00 3 2019-09-26 00:00:00
2019-10-03 00:00:00 4 2019-10-03 00:00:00
2019-10-10 00:00:00 null null
2019-10-17 00:00:00 null null
2019-10-24 00:00:00 null null
2019-10-31 00:00:00 null null
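The alignment issue can be checked with plain integer arithmetic. The Python sketch below is only an illustration; bucket_start is a made-up helper, and the timestamps are treated as UTC, which is how strftime('%s', ...) interprets them in the queries above.

```python
from datetime import datetime, timezone

FIVE_DAYS = 120 * 3600   # 432000 seconds
SEVEN_DAYS = 168 * 3600  # 604800 seconds

def bucket_start(ts: datetime, width: int) -> datetime:
    """Floor a timestamp to a multiple of `width` seconds since the Unix epoch."""
    epoch = int(ts.timestamp())
    return datetime.fromtimestamp(epoch // width * width, tz=timezone.utc)

start = datetime(2019, 10, 1, tzinfo=timezone.utc)

# 2019-10-01 happens to be an exact multiple of 5 days since the epoch ...
print(int(start.timestamp()) % FIVE_DAYS)   # 0
# ... but it is not a multiple of 7 days, so the two sides never line up.
print(int(start.timestamp()) % SEVEN_DAYS)  # 432000
print(bucket_start(start, SEVEN_DAYS))      # 2019-09-26 00:00:00+00:00
```

So the 120-hour version only worked because the hard-coded start date was already on a bucket boundary; flooring the start date with the same formula removes that dependency.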

Irregular grouping of timestamp variable

I have a table organized as follows:
id lateAt
1231235 2019/09/14
1242123 2019/09/13
3465345 NULL
5676548 2019/09/28
8986475 2019/09/23
Where lateAt is a timestamp of when a certain loan's payment became late. So, for each current date - I need to look at these numbers daily - there's a certain amount of entries which are late for 0-15, 15-30, 30-45, 45-60, 60-90 and 90+ days.
This is my desired output:
lateGroup Count
0-15 20
15-30 22
30-45 25
45-60 32
60-90 47
90+ 57
This is something I can easily calculate in R, but to get the results back to my BI dashboard I'd have to create a new table in my database, which I don't think is a good practice. What is the SQL-native approach to this problem?
I would define the "late groups" using a range, then join against the number of days:
with groups (grp) as (
values
(int4range(0,15, '[)')),
(int4range(15,30, '[)')),
(int4range(30,45, '[)')),
(int4range(45,60, '[)')),
(int4range(60,90, '[)')),
(int4range(90,null, '[)'))
)
select grp, count(t.user_id)
from groups g
left join the_table t on g.grp @> current_date - t.late_at
group by grp
order by grp;
int4range(0,15, '[)') creates a range from 0 (inclusive) to 15 (exclusive)
Online example: https://rextester.com/QJSN89445
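For readers who want to see the half-open bucketing spelled out, here is a small Python sketch of the same idea. It is not the query above: the names late_group and count_late_groups are hypothetical, and the "current date" of 2019-10-15 is an assumption chosen only so the sample rows from the question produce a stable result.

```python
from bisect import bisect_right
from datetime import date

# Lower bounds of the half-open groups [0,15), [15,30), ..., [90, infinity).
LOWER_BOUNDS = [0, 15, 30, 45, 60, 90]
LABELS = ["0-15", "15-30", "30-45", "45-60", "60-90", "90+"]

def late_group(days_late: int) -> str:
    """Return the label of the group whose range contains days_late (>= 0)."""
    return LABELS[bisect_right(LOWER_BOUNDS, days_late) - 1]

def count_late_groups(late_dates, today: date) -> dict:
    """Count rows per group; NULL (None) and future dates are skipped."""
    counts = {label: 0 for label in LABELS}
    for late_at in late_dates:
        if late_at is None:
            continue
        days_late = (today - late_at).days
        if days_late >= 0:
            counts[late_group(days_late)] += 1
    return counts

sample = [date(2019, 9, 14), date(2019, 9, 13), None, date(2019, 9, 28), date(2019, 9, 23)]
counts = count_late_groups(sample, today=date(2019, 10, 15))  # assumed "today"
print(counts["15-30"], counts["30-45"])  # 2 2
```

Because every group includes its lower bound and excludes its upper bound, a loan that is exactly 15 days late lands in 15-30 and nowhere else, and empty groups still show up with a count of 0, as with the left join.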
The quick and dirty way to do this in SQL is:
SELECT '0-15' AS lateGroup,
COUNT(*) AS lateGroupCount
FROM my_table t
WHERE (CURRENT_DATE - t.lateAt) >= 0
AND (CURRENT_DATE - t.lateAt) < 15
UNION
SELECT '15-30' AS lateGroup,
COUNT(*) AS lateGroupCount
FROM my_table t
WHERE (CURRENT_DATE - t.lateAt) >= 15
AND (CURRENT_DATE - t.lateAt) < 30
UNION
SELECT '30-45' AS lateGroup,
COUNT(*) AS lateGroupCount
FROM my_table t
WHERE (CURRENT_DATE - t.lateAt) >= 30
AND (CURRENT_DATE - t.lateAt) < 45
-- Etc...
For production code, you would want to do something more like Ross' answer.
You didn't mention which DBMS you're using, but nearly all of them will have a construct known as a "value constructor" like this:
select bins.lateGroup, bins.minVal, bins.maxVal FROM
(VALUES
('0-15',0,15),
('15-30',15.0001,30), -- increase by a small fraction so bins don't overlap
('30-45',30.0001,45),
('45-60',45.0001,60),
('60-90',60.0001,90),
('90-99999',90.0001,99999)
) AS bins(lateGroup,minVal,maxVal)
If your DBMS doesn't have it, then you can probably use UNION ALL:
SELECT '0-15' as lateGroup, 0 as minVal, 15 as maxVal
union all SELECT '15-30',15,30
union all SELECT '30-45',30,45
Then your complete query, with the sample data you provided, would look like this:
--- example from SQL Server 2012 SP1
--- first let's set up some sample data
create table #temp (id int, lateAt datetime);
INSERT #temp (id, lateAt) values
(1231235,'2019-09-14'),
(1242123,'2019-09-13'),
(3465345,NULL),
(5676548,'2019-09-28'),
(8986475,'2019-09-23');
--- here's the actual query
select lateGroup, count(*) as Count
from #temp as T,
(VALUES
('0-15',0,15),
('15-30',15.0001,30), -- increase by a small fraction so bins don't overlap
('30-45',30.0001,45),
('45-60',45.0001,60),
('60-90',60.0001,90),
('90-99999',90.0001,99999)
) AS bins(lateGroup,minVal,maxVal)
where datediff(day,lateAt,getdate()) between minVal and maxVal
group by lateGroup
order by lateGroup
--- remove our sample data
drop table #temp;
Here's the output:
lateGroup Count
15-30 2
30-45 2
Note: rows with null lateAt are not counted.
I think you can do it all in one clear query :
with cte_lategroup as
(
select *
from (values(0,15,'0-15'),(15,30,'15-30'),(30,45,'30-45')) as t (mini, maxi, designation)
)
select
t2.designation
, count(*)
from test t
left outer join cte_lategroup t2
on current_date - t.lateat >= t2.mini
and current_date - lateat < t2.maxi
group by t2.designation;
With sample data like yours:
create table test
(
id int
, lateAt date
);
insert into test
values (1231235, to_date('2019/09/14', 'yyyy/mm/dd'))
,(1242123, to_date('2019/09/13', 'yyyy/mm/dd'))
,(3465345, null)
,(5676548, to_date('2019/09/28', 'yyyy/mm/dd'))
,(8986475, to_date('2019/09/23', 'yyyy/mm/dd'));

SQL - How to ignore seconds and round down minutes in DateTime data type

At work we did a project that required a team to count students 8 times a day over 5 days at specific time periods. They are, as follows :-
09:00, 10:00, 11:00, 13:15, 14:15, 14:50, 15:50, 16:20.
Now, the data collected was put directly into a database via a web app. The problem is that database recorded each record using the standard YYYY-MM-DD HH:MM:SS.MIL, but if I were to order the records by date and then by student count it would cause the following problem;
e.g.:-
if the students counted in a room was 5 at 09:00:12, but another room had a count of 0 at 09:02:20 and I did the following:
select student_count, audit_date
from table_name
order by audit_date, student_count;
The query will return:
5 09:00:12
0 09:02:20
but I want:
0 09:00:00
5 09:00:00
because we're looking for the number of students in each room for the period 09:00, but unfortunately to collect the data it required us to do so within that hour and obviously the database will pick up on that accuracy. Furthermore, this issue becomes more problematic when it gets to the periods 14:15 and 14:50, where we will need to be able to distinguish between the two periods.
Is there a way to ignore the seconds part of the DateTime, and then round the minutes down to the nearest ten minutes?
I'm using SQL Server Management Studio 2012. If none of this made sense, I'm sorry!
You may want some sort of Period table to store your segments. Then you can use that to join to your counts table.
CREATE TABLE [Periods]
( -- maybe [id] INT,
[start_time] TIME,
[end_time] TIME
);
INSERT INTO [Periods]
VALUES ('09:00','10:00'),
('10:00','11:00'),
('11:00','13:15'),
('13:15','14:15'),
('14:15','14:50'),
('14:50','15:50'),
('15:50','16:20'),
('16:20','17:00')
SELECT
student_count, [start_time]
FROM table_name A
INNER JOIN [Periods] B
ON CAST(A.[audit_date] AS TIME) >= B.[start_time]
AND CAST(A.[audit_date] AS TIME) < B.[end_time]
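The lookup that the join performs can be sketched in a few lines of Python. This is only an illustration of the idea; period_start is a hypothetical helper, and the 17:00 cutoff is taken from the last row of the Periods table above.

```python
from bisect import bisect_right
from datetime import time

# Start times of the eight counting periods, in ascending order.
PERIOD_STARTS = [
    time(9, 0), time(10, 0), time(11, 0), time(13, 15),
    time(14, 15), time(14, 50), time(15, 50), time(16, 20),
]
DAY_END = time(17, 0)  # end of the last period, as in the Periods table

def period_start(audit_time: time):
    """Return the start of the period containing audit_time, or None if outside."""
    if audit_time < PERIOD_STARTS[0] or audit_time >= DAY_END:
        return None
    return PERIOD_STARTS[bisect_right(PERIOD_STARTS, audit_time) - 1]

print(period_start(time(9, 0, 12)))    # 09:00:00
print(period_start(time(9, 2, 20)))    # 09:00:00
print(period_start(time(14, 16, 12)))  # 14:15:00
```

Both 09:00:12 and 09:02:20 map to the 09:00 period, and 14:15 and 14:50 stay distinct, which a plain "round to the hour" rule would not give you.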
You can use the DATEADD and DATEPART functions together with a CASE expression to accomplish this. If you want more precise cutoffs between the .14 and .50 periods you can easily adjust the case statement, and the same goes if you want the minutes to be .10 or .15.
-- some test data
declare @t table (the_time time)
insert @t values ('09:00:12')
insert @t values ('14:16:12')
insert @t values ('09:02:12')
insert @t values ('14:22:12')
insert @t values ('15:49:12')
insert @t values ('15:50:08')
select
the_time,
case
when datepart(minute,the_time) < 15 then
dateadd(second, -datepart(second,the_time),dateadd(minute, -datepart(minute,the_time),the_time))
when datepart(minute,the_time) >= 15 and datepart(minute,the_time) < 50 then
dateadd(second, -datepart(second,the_time),dateadd(minute, -datepart(minute,the_time)+10,the_time))
else
dateadd(second, -datepart(second,the_time),dateadd(minute, -datepart(minute,the_time)+50,the_time))
end as WithoutSeconds
from @t
Results:
the_time WithoutSeconds
---------------- ----------------
09:00:12.0000000 09:00:00.0000000
14:16:12.0000000 14:10:00.0000000
09:02:12.0000000 09:00:00.0000000
14:22:12.0000000 14:10:00.0000000
15:49:12.0000000 15:10:00.0000000
15:50:08.0000000 15:50:00.0000000
Try this:
SELECT
CAST(
DATEADD(SECOND, - (CONVERT(INT, RIGHT(CONVERT(CHAR(2),
DATEPART(MINUTE, GETDATE())),1))*60) - (DATEPART(SECOND,GETDATE())), GETDATE())
AS SMALLDATETIME);
You can try ORDER BY this formula
DATEADD(minute, floor((DATEDIFF(minute, '20000101', audit_date) + 5)/10)*10, '20000101')
e.g.
WITH tbl AS(
SELECT * FROM ( VALUES (5,'2014-03-28 09:00:09.793'),(0,'2014-03-28 09:02:20.123')) a (student_count, audit_date)
)
SELECT *,DATEADD(minute, floor((DATEDIFF(minute, '20000101', audit_date) + 5)/10)*10, '20000101') as ORDER_DT
FROM tbl
ORDER BY ORDER_DT,student_count
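Note that the "+ 5" in that formula rounds to the nearest ten minutes instead of always rounding down. The Python sketch below mirrors the integer arithmetic to show the difference; both function names are made up, and it assumes timestamps after the 2000-01-01 reference date.

```python
from datetime import datetime, timedelta

REFERENCE = datetime(2000, 1, 1)  # same anchor as '20000101' in the formula

def whole_minutes(ts: datetime) -> int:
    """Whole minutes since REFERENCE, with seconds dropped."""
    return int((ts - REFERENCE).total_seconds() // 60)

def round_to_ten_minutes(ts: datetime) -> datetime:
    """Nearest ten minutes, like floor((minutes + 5) / 10) * 10."""
    return REFERENCE + timedelta(minutes=(whole_minutes(ts) + 5) // 10 * 10)

def floor_to_ten_minutes(ts: datetime) -> datetime:
    """Always round down: the same formula without the + 5."""
    return REFERENCE + timedelta(minutes=whole_minutes(ts) // 10 * 10)

print(round_to_ten_minutes(datetime(2014, 3, 28, 9, 2, 20)))  # 2014-03-28 09:00:00
print(round_to_ten_minutes(datetime(2014, 3, 28, 9, 7, 0)))   # 2014-03-28 09:10:00
print(floor_to_ten_minutes(datetime(2014, 3, 28, 9, 7, 0)))   # 2014-03-28 09:00:00
```

If a count taken at 09:07 should still belong to 09:00, drop the + 5 from the SQL formula.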

Calculate Sum of times

Hi, I'm creating an RDLC report for an attendance management system. In my report I want to show the sum of the total WorkedHours.
I want exactly this output:
EmployeeId EmployeeName WorkedHours
1 ABC 04:00:25
2 XYZ 07:23:01
3 PQR 11:02:15
So I want to display the total for all 3 employees at the end of the report in RDLC,
like Total: 22:25:41
Try this
select EmpId,In_duration,out_duration,
datediff(mi,in_duration,out_duration) as Diff_mins
from table
The first parameter of DATEDIFF is mi (minutes). If you look up DATEDIFF in the documentation, you'll see the other options. The example below shows the difference in hours, minutes, and seconds as a string.
select
LTRIM(STR(DATEDIFF(mi,in_duration,out_duration) /60))+':'+
LTRIM(STR(DATEDIFF(mi,in_duration,out_duration) %60))+':'+
LTRIM(STR(DATEDIFF(ss,in_duration,out_duration) -
(DATEDIFF(ss,in_duration,out_duration)/60)*60))
as diff
declare @T table
(
EmployeeId int,
EmployeeName varchar(10),
WorkedHours varchar(8)
)
insert into @T
select 1, 'ABC', '04:00:02' union all
select 2, 'XYZ', '07:23:01' union all
select 3, 'PQR', '11:02:15'
select right(100+T2.hh, 2)+':'+right(100+T2.mm, 2)+':'+right(100+T2.ss, 2) as TotalHours
from (
select dateadd(second, sum(datediff(second, 0, cast(WorkedHours as datetime))), 0) as WorkedSum
from @T
) as T1
cross apply (select datediff(hour, 0, T1.WorkedSum) as hh,
datepart(minute, T1.WorkedSum) as mm,
datepart(second, T1.WorkedSum) as ss) as T2
Convert WorkedHours to datetime
Calculate the number of seconds since 1900-01-01T00:00:00
Sum the seconds for all rows
Convert seconds to datetime
Use datediff to calculate the number of hours
Use datepart to get minutes and seconds
Build the resulting string and use right(100+T...) to add 0 before value if necessary.
The cross apply ... part is not necessary. I added that to make the code clearer. You can use the expressions in the cross apply directly in the field list if you like.
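The steps listed above (convert to seconds, sum, convert back, format) can also be sketched in Python to check the expected total from the question. This is an illustration only; to_seconds and format_hms are hypothetical helper names, and the input values are the WorkedHours shown in the question.

```python
def to_seconds(hms: str) -> int:
    """Convert an 'HH:MM:SS' string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds

def format_hms(total_seconds: int) -> str:
    """Format seconds as HH:MM:SS; hours are allowed to exceed 24."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"

worked_hours = ["04:00:25", "07:23:01", "11:02:15"]
total = format_hms(sum(to_seconds(value) for value in worked_hours))
print(total)  # 22:25:41
```

Summing in seconds and formatting only at the end avoids carrying minutes and seconds by hand, which is the same reason the SQL goes through datediff(second, ...) first.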