Processing a data set into 30-minute values - SQL

I have a data set in the following format -
ID START_TIME END_TIME VAL
1 30-APR-2018 00:00:00 01-MAY-2018 00:00:00 423
2 01-MAY-2018 00:00:00 01-MAY-2018 17:15:00 455
3 01-MAY-2018 17:15:00 03-MAY-2018 00:00:00 455
Expected Output -
This data set should be broken down into 30-minute interval values; however, records whose start or end is not on a '00' or '30' minute boundary should still be handled as part of this process, keeping that odd boundary (as shown for the records with START_TIME/END_TIME = '17:15:00').
ID START_TIME END_TIME VAL
1 30-APR-2018 00:00:00 30-APR-2018 00:30:00 423
1 30-APR-2018 00:30:00 30-APR-2018 01:00:00 423
1 30-APR-2018 01:00:00 30-APR-2018 01:30:00 423
..
..
..
1 30-APR-2018 23:00:00 30-APR-2018 23:30:00 423
1 30-APR-2018 23:30:00 01-MAY-2018 00:00:00 423
2 01-MAY-2018 00:00:00 01-MAY-2018 00:30:00 455
2 01-MAY-2018 00:30:00 01-MAY-2018 01:00:00 455
..
..
..
..
2 01-MAY-2018 16:30:00 01-MAY-2018 17:00:00 455
2 01-MAY-2018 17:00:00 01-MAY-2018 17:15:00 455
3 01-MAY-2018 17:15:00 01-MAY-2018 17:30:00 455
3 01-MAY-2018 17:30:00 01-MAY-2018 18:00:00 455
..
..
..
3 02-MAY-2018 23:00:00 02-MAY-2018 23:30:00 455
3 02-MAY-2018 23:30:00 03-MAY-2018 00:00:00 455
What I have tried so far -
CREATE TABLE TESTT
(
ID NUMBER(8,3),
START_TIME DATE,
END_TIME DATE,
VAL NUMBER(8,3)
);
INSERT INTO TESTT VALUES (1, TO_DATE('30-APR-2018 00:00:00','DD-MON-YYYY HH24:MI:SS'), TO_DATE('01-MAY-2018 00:00:00','DD-MON-YYYY HH24:MI:SS'), 423);
INSERT INTO TESTT VALUES (2, TO_DATE('01-MAY-2018 00:00:00','DD-MON-YYYY HH24:MI:SS'), TO_DATE('01-MAY-2018 17:15:00','DD-MON-YYYY HH24:MI:SS'), 455);
INSERT INTO TESTT VALUES (3, TO_DATE('01-MAY-2018 17:15:00','DD-MON-YYYY HH24:MI:SS'), TO_DATE('03-MAY-2018 00:00:00','DD-MON-YYYY HH24:MI:SS'), 455);
COMMIT;
CREATE TABLE TESTT_OUTPUT AS
SELECT * FROM TESTT WHERE 1=2;
CREATE SEQUENCE TESTT_SEQ MINVALUE 1 MAXVALUE 9999999999999999999999999999 INCREMENT BY 1 START WITH 1 NOCACHE NOORDER NOCYCLE NOPARTITION;
BEGIN
FOR R IN (SELECT * FROM TESTT)
LOOP
INSERT INTO TESTT_OUTPUT(id, START_TIME, END_TIME, VAL)
SELECT TESTT_SEQ.nextval, R.START_TIME + (LEVEL - 1)/48 AS START_TIME, R.START_TIME + LEVEL/48 AS END_TIME, R.VAL FROM
DUAL
CONNECT BY LEVEL <= ROUND((R.END_TIME - R.START_TIME)*48);
COMMIT;
END LOOP;
END;
/
SELECT * FROM TESTT_OUTPUT;
1 30-APR-2018 00:00:00 30-APR-2018 00:30:00 423
2 30-APR-2018 00:30:00 30-APR-2018 01:00:00 423
3 30-APR-2018 01:00:00 30-APR-2018 01:30:00 423
..
..
..
47 30-APR-2018 23:00:00 30-APR-2018 23:30:00 423
48 30-APR-2018 23:30:00 01-MAY-2018 00:00:00 423
49 01-MAY-2018 00:00:00 01-MAY-2018 00:30:00 455
50 01-MAY-2018 00:30:00 01-MAY-2018 01:00:00 455
..
..
..
82 01-MAY-2018 16:30:00 01-MAY-2018 17:00:00 455
83 01-MAY-2018 17:00:00 01-MAY-2018 17:30:00 455
84 01-MAY-2018 17:15:00 01-MAY-2018 17:45:00 455
85 01-MAY-2018 17:45:00 01-MAY-2018 18:15:00 455
86 01-MAY-2018 18:15:00 01-MAY-2018 18:45:00 455
87 01-MAY-2018 18:45:00 01-MAY-2018 19:15:00 455
..
..
..
141 02-MAY-2018 21:45:00 02-MAY-2018 22:15:00 455
142 02-MAY-2018 22:15:00 02-MAY-2018 22:45:00 455
143 02-MAY-2018 22:45:00 02-MAY-2018 23:15:00 455
144 02-MAY-2018 23:15:00 02-MAY-2018 23:45:00 455
145 02-MAY-2018 23:45:00 03-MAY-2018 00:15:00 455
With this approach, a record whose start time is not on a '00' or '30' minute boundary (such as 17:15) just keeps having 30 minutes added to it, so none of its generated rows line up with the '00'/'30' boundaries, the short point-in-time row (17:00:00 - 17:15:00) is lost, and the last generated row overruns the real END_TIME (23:45:00 - 00:15:00 instead of stopping at 03-MAY-2018 00:00:00).
Hope this makes sense.
Any inputs on how to transform the data into the expected format will be extremely helpful. Thanks!

It seems rather inelegant, but this:
select id,
greatest(start_time,
adj_start_time + numtodsinterval(30 * (level - 1), 'MINUTE')) as start_time,
least(end_time,
adj_start_time + numtodsinterval(30 * level, 'MINUTE')) as end_time
from (
select id,
start_time,
end_time,
trunc(start_time, 'HH')
+ numtodsinterval(
case when extract(minute from cast(start_time as timestamp)) < 30 then 0
else 30
end, 'MINUTE') as adj_start_time
from testt
)
connect by level <= ceil((end_time - start_time - 1/86400) / (30/1440))
and prior id = id
and prior dbms_random.value is not null
order by id, start_time;
seems to get the result you want, generating 145 rows:
ID START_TIME END_TIME
---------- ------------------- -------------------
1 2018-04-30 00:00:00 2018-04-30 00:30:00
1 2018-04-30 00:30:00 2018-04-30 01:00:00
1 2018-04-30 01:00:00 2018-04-30 01:30:00
...
1 2018-04-30 22:30:00 2018-04-30 23:00:00
1 2018-04-30 23:00:00 2018-04-30 23:30:00
1 2018-04-30 23:30:00 2018-05-01 00:00:00
2 2018-05-01 00:00:00 2018-05-01 00:30:00
2 2018-05-01 00:30:00 2018-05-01 01:00:00
2 2018-05-01 01:00:00 2018-05-01 01:30:00
...
2 2018-05-01 16:00:00 2018-05-01 16:30:00
2 2018-05-01 16:30:00 2018-05-01 17:00:00
2 2018-05-01 17:00:00 2018-05-01 17:15:00
3 2018-05-01 17:15:00 2018-05-01 17:30:00
3 2018-05-01 17:30:00 2018-05-01 18:00:00
3 2018-05-01 18:00:00 2018-05-01 18:30:00
...
3 2018-05-02 22:30:00 2018-05-02 23:00:00
3 2018-05-02 23:00:00 2018-05-02 23:30:00
3 2018-05-02 23:30:00 2018-05-03 00:00:00
The inline view gets the real columns plus the nominal 30-minute window start - i.e. for 17:15 it gets 17:00 - as adj_start_time. The hierarchical query adds 30-minute intervals to that, and uses least and greatest to fall back to the original start/end time when they are not exactly on the half-hour. The prior id = id condition keeps each hierarchy within a single source row, and prior dbms_random.value is not null stops Oracle from treating that as a connect-by loop.
For your insert you can replace the original ID with an analytic row_number() rather than using a sequence, and include the val:
insert into testt_output(id, start_time, end_time, val)
select row_number() over (order by id, level),
greatest(start_time,
adj_start_time + numtodsinterval(30 * (level - 1), 'MINUTE')) as start_time,
least(end_time,
adj_start_time + numtodsinterval(30 * level, 'MINUTE')) as end_time,
val
from (
select id,
start_time,
end_time,
val,
trunc(start_time, 'HH')
+ numtodsinterval(
case when extract(minute from cast(start_time as timestamp)) < 30 then 0
else 30
end, 'MINUTE') as adj_start_time
from testt
)
connect by level <= ceil((end_time - start_time - 1/86400) / (30/1440))
and prior id = id
and prior dbms_random.value is not null;
145 rows inserted.
select * from testt_output;
ID START_TIME END_TIME VAL
---------- ------------------- ------------------- ----------
1 2018-04-30 00:00:00 2018-04-30 00:30:00 423
2 2018-04-30 00:30:00 2018-04-30 01:00:00 423
...
47 2018-04-30 23:00:00 2018-04-30 23:30:00 423
48 2018-04-30 23:30:00 2018-05-01 00:00:00 423
49 2018-05-01 00:00:00 2018-05-01 00:30:00 455
50 2018-05-01 00:30:00 2018-05-01 01:00:00 455
...
82 2018-05-01 16:30:00 2018-05-01 17:00:00 455
83 2018-05-01 17:00:00 2018-05-01 17:15:00 455
84 2018-05-01 17:15:00 2018-05-01 17:30:00 455
85 2018-05-01 17:30:00 2018-05-01 18:00:00 455
...
144 2018-05-02 23:00:00 2018-05-02 23:30:00 455
145 2018-05-02 23:30:00 2018-05-03 00:00:00 455
db<>fiddle demo.

Related

resample dataset with one irregular datetime

I have a dataframe like the following. I wanted to check the values for each 15 minutes, but I see that there is a reading at 09:05:51. How can I resample the dataframe to 15 minutes?
hour_min value
06:30:00 0.0
06:45:00 0.0
07:00:00 0.0
07:15:00 0.0
07:30:00 102.754717
07:45:00 130.599057
08:00:00 154.117925
08:15:00 189.061321
08:30:00 214.924528
08:45:00 221.382075
09:00:00 190.839623
09:05:51 428.0
09:15:00 170.973995
09:30:00 0.0
09:45:00 0.0
10:00:00 174.448113
10:15:00 174.900943
10:30:00 182.976415
10:45:00 195.783019
11:00:00 200.337292
11:14:00 80.0
11:15:00 206.280952
11:30:00 218.87886
11:45:00 238.251781
12:00:00 115.5
12:15:00 85.5
12:30:00 130.0
12:45:00 141.0
13:00:00 267.353774
13:15:00 257.061321
13:21:00 8.0
13:27:19 80.0
13:30:00 258.761905
13:45:00 254.703088
13:53:52 278.0
14:00:00 254.790476
14:15:00 247.165094
14:30:00 250.061321
14:45:00 264.014151
15:00:00 132.0
15:15:00 108.0
15:30:00 158.5
15:45:00 457.0
16:00:00 273.745283
16:15:00 273.962264
16:30:00 279.089623
16:45:00 280.264151
17:00:00 296.061321
17:15:00 296.481132
17:30:00 282.957547
17:45:00 279.816038
I have tried this line, but I get a TypeError:
res = s.resample('15T').sum()
I tried converting the index to a date, but that did not work either.
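The TypeError is most likely because the index holds plain time-of-day values (strings or datetime.time objects) rather than a DatetimeIndex, which resample needs. A minimal sketch, assuming the data sits in a frame with the hour_min and value columns shown above: attach an arbitrary date to the clock times, set that as the index and resample, so the 09:05:51 reading gets folded into the 09:00 bucket.
import pandas as pd

# a few of the rows above, just to illustrate; hour_min is assumed to be a string column
df = pd.DataFrame(
    {"hour_min": ["09:00:00", "09:05:51", "09:15:00", "09:30:00"],
     "value": [190.839623, 428.0, 170.973995, 0.0]}
)

# resample needs a DatetimeIndex, so glue an arbitrary date onto the clock times
df["hour_min"] = pd.to_datetime("2000-01-01 " + df["hour_min"])

res = df.set_index("hour_min")["value"].resample("15min").sum()
print(res)
# the 09:05:51 reading is summed into the 09:00:00 bucket (190.839623 + 428.0)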

Why do I get different values when I extract data from netCDF files using CDO and ArcGIS for a same grid point?

details of the raw data (Mnth.nc)
netcdf Mnth {
dimensions:
time = UNLIMITED ; // (480 currently)
bnds = 2 ;
longitude = 25 ;
latitude = 33 ;
variables:
double time(time) ;
time:standard_name = "time" ;
time:long_name = "verification time generated by wgrib2 function verftime()" ;
time:bounds = "time_bnds" ;
time:units = "seconds since 1970-01-01 00:00:00.0 0:00" ;
time:calendar = "standard" ;
time:axis = "T" ;
double time_bnds(time, bnds) ;
double longitude(longitude) ;
longitude:standard_name = "longitude" ;
longitude:long_name = "longitude" ;
longitude:units = "degrees_east" ;
longitude:axis = "X" ;
double latitude(latitude) ;
latitude:standard_name = "latitude" ;
latitude:long_name = "latitude" ;
latitude:units = "degrees_north" ;
latitude:axis = "Y" ;
float APCP_sfc(time, latitude, longitude) ;
APCP_sfc:long_name = "Total Precipitation" ;
APCP_sfc:units = "kg/m^2" ;
APCP_sfc:_FillValue = 9.999e+20f ;
APCP_sfc:missing_value = 9.999e+20f ;
APCP_sfc:cell_methods = "time: sum" ;
APCP_sfc:short_name = "APCP_surface" ;
APCP_sfc:level = "surface" ;
}
Detail information of the raw data (Mnth.nc)
File format : NetCDF4 classic
-1 : Institut Source T Steptype Levels Num Points Num Dtype : Parameter ID
1 : unknown unknown v instant 1 1 825 1 F32 : -1
Grid coordinates :
1 : lonlat : points=825 (25x33)
longitude : 87 to 89.88 by 0.12 degrees_east
latitude : 25.08 to 28.92 by 0.12 degrees_north
Vertical coordinates :
1 : surface : levels=1
Time coordinate : 480 steps
RefTime = 1970-01-01 00:00:00 Units = seconds Calendar = standard Bounds = true
YYYY-MM-DD hh:mm:ss YYYY-MM-DD hh:mm:ss YYYY-MM-DD hh:mm:ss YYYY-MM-DD hh:mm:ss
1980-01-16 12:30:00 1980-02-15 12:30:00 1980-03-16 12:30:00 1980-04-16 00:30:00
1980-05-16 12:30:00 1980-06-16 00:30:00 1980-07-16 12:30:00 1980-08-16 12:30:00
1980-09-16 00:30:00 1980-10-16 12:30:00 1980-11-16 00:30:00 1980-12-16 12:30:00
1981-01-16 12:30:00 1981-02-15 00:30:00 1981-03-16 12:30:00 1981-04-16 00:30:00
1981-05-16 12:30:00 1981-06-16 00:30:00 1981-07-16 12:30:00 1981-08-16 12:30:00
1981-09-16 00:30:00 1981-10-16 12:30:00 1981-11-16 00:30:00 1981-12-16 12:30:00
1982-01-16 12:30:00 1982-02-15 00:30:00 1982-03-16 12:30:00 1982-04-16 00:30:00
1982-05-16 12:30:00 1982-06-16 00:30:00 1982-07-16 12:30:00 1982-08-16 12:30:00
1982-09-16 00:30:00 1982-10-16 12:30:00 1982-11-16 00:30:00 1982-12-16 12:30:00
1983-01-16 12:30:00 1983-02-15 00:30:00 1983-03-16 12:30:00 1983-04-16 00:30:00
1983-05-16 12:30:00 1983-06-16 00:30:00 1983-07-16 12:30:00 1983-08-16 12:30:00
1983-09-16 00:30:00 1983-10-16 12:30:00 1983-11-16 00:30:00 1983-12-16 12:30:00
1984-01-16 12:30:00 1984-02-15 12:30:00 1984-03-16 12:30:00 1984-04-16 00:30:00
1984-05-16 12:30:00 1984-06-16 00:30:00 1984-07-16 12:30:00 1984-08-16 12:30:00
1984-09-16 00:30:00 1984-10-16 12:30:00 1984-11-16 00:30:00 1984-12-16 12:30:00
................................................................................
............................
2016-01-16 12:30:00 2016-02-15 12:30:00 2016-03-16 12:30:00 2016-04-16 00:30:00
2016-05-16 12:30:00 2016-06-16 00:30:00 2016-07-16 12:30:00 2016-08-16 12:30:00
2016-09-16 00:30:00 2016-10-16 12:30:00 2016-11-16 00:30:00 2016-12-16 12:30:00
2017-01-16 12:30:00 2017-02-15 00:30:00 2017-03-16 12:30:00 2017-04-16 00:30:00
2017-05-16 12:30:00 2017-06-16 00:30:00 2017-07-16 12:30:00 2017-08-16 12:30:00
2017-09-16 00:30:00 2017-10-16 12:30:00 2017-11-16 00:30:00 2017-12-16 12:30:00
2018-01-16 12:30:00 2018-02-15 00:30:00 2018-03-16 12:30:00 2018-04-16 00:30:00
2018-05-16 12:30:00 2018-06-16 00:30:00 2018-07-16 12:30:00 2018-08-16 12:30:00
2018-09-16 00:30:00 2018-10-16 12:30:00 2018-11-16 00:30:00 2018-12-16 12:30:00
2019-01-16 12:30:00 2019-02-15 00:30:00 2019-03-16 12:30:00 2019-04-16 00:30:00
2019-05-16 12:30:00 2019-06-16 00:30:00 2019-07-16 12:30:00 2019-08-16 12:30:00
2019-09-16 00:30:00 2019-10-16 12:30:00 2019-11-16 00:30:00 2019-12-16 12:30:00
2020-01-16 12:30:00 2020-02-15 12:30:00 2020-03-16 12:30:00 2020-04-16 00:30:00
2020-05-16 12:30:00 2020-06-16 00:30:00 2020-07-16 12:30:00 2020-08-16 12:30:00
2020-09-16 00:30:00 2020-10-16 12:30:00 2020-11-16 00:30:00 2020-12-16 12:30:00
cdo sinfo: Processed 1 variable over 480 timesteps [0.50s 30MB].
I extracted monthly rainfall values from the Mnth.nc file for a location (lon: 88.44; lat: 27.12) using the following commands
cdo remapnn,lon=88.44-lat=27.12 Mnth.nc Mnth1.nc
cdo outputtab,year,month,value Mnth1.nc > Mnth.csv
The output is as follows -
Year month Value
1980 1 31.74219
1980 2 54.60938
1980 3 66.94531
1980 4 149.4062
1980 5 580.7227
1980 6 690.1328
1980 7 1146.305
1980 8 535.8164
1980 9 486.4688
1980 10 119.5391
1980 11 82.10547
1980 12 13.95703
Then I extracted the rainfall values from the same data (Mnth.nc) for the same location (lon: 88.44; lat: 27.12) using the features of the multidimensional toolbox provided in ArcGIS. The result is as follows -
year month Value
1980 1 38.8125
1980 2 58.6542969
1980 3 71.7382813
1980 4 148.6367188
1980 5 564.7070313
1980 6 653.0390625
1980 7 1026.832031
1980 8 501.3164063
1980 9 458.5429688
1980 10 113.078125
1980 11 74.0976563
1980 12 24.2265625
Why am I getting different results from two different software packages for the same location and the same variable? Any help will be highly appreciated.
Thanks in advance.
The question is perhaps misleading, in that you are not "extracting" the data in either case; you are interpolating it. The method used by CDO is nearest neighbour. ArcGIS is probably simply using a different method, so slightly different results are to be expected.
The results look very similar, so both are almost certainly working as advertised.
I ran into the same issue: I used CDO to extract a point and ArcGIS to cross-check, and the values were different.
Just to be sure, I recorded the extent of one particular cell and extracted values for several locations within that cell's boundary. CDO gave the same result everywhere inside the cell, as expected, because it uses nearest-neighbour resampling.
Then I tried the same with ArcGIS. Interestingly, ArcGIS sometimes returned the same value within the cell boundary and sometimes a different one. Cross-checking with Panoply showed that CDO's values were accurate, while ArcGIS was sometimes returning offset results, i.e. the values of neighbouring cells. Although @Robert Wilson suggested ArcGIS must be using a different resampling method, the results of the 'NetCDF to Table View' tool indicate that it also uses nearest neighbour. This is not an answer to your question, just something I found.
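For a third, independent cross-check outside both tools, something like the xarray sketch below reads the value of the nearest grid cell directly (this assumes xarray is installed; the variable and coordinate names are taken from the ncdump header above). It mirrors what cdo remapnn does, i.e. nearest neighbour with no interpolation.
import xarray as xr

# open the file described above and pick the grid cell nearest to the point
ds = xr.open_dataset("Mnth.nc")
pt = ds["APCP_sfc"].sel(longitude=88.44, latitude=27.12, method="nearest")

# monthly values for 1980, to compare against the CDO and ArcGIS tables above
print(pt.sel(time=slice("1980-01-01", "1980-12-31")).to_series())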

Aggregate time from 15-minute interval to single hour in SQL

I have the below table structure in SQL Server:
StartDate Start End Sales
==============================================
2020-08-25 00:00:00 00:15:00 291.4200
2020-08-25 00:15:00 00:30:00 401.1700
2020-08-25 00:30:00 00:45:00 308.3300
2020-08-25 00:45:00 01:00:00 518.3200
2020-08-25 01:00:00 01:15:00 247.3700
2020-08-25 01:15:00 01:30:00 115.4700
2020-08-25 01:30:00 01:45:00 342.3800
2020-08-25 01:45:00 02:00:00 233.0900
2020-08-25 02:00:00 02:15:00 303.3400
2020-08-25 02:15:00 02:30:00 11.9000
2020-08-25 02:30:00 02:45:00 115.2400
2020-08-25 02:45:00 03:00:00 199.5200
2020-08-25 06:00:00 06:15:00 0.0000
2020-08-25 06:15:00 06:30:00 45.2400
2020-08-25 06:30:00 06:45:00 30.4800
2020-08-25 06:45:00 07:00:00 0.0000
2020-08-25 07:00:00 07:15:00 0.0000
2020-08-25 07:15:00 07:30:00 69.2800
Is there a way to group the above data into one-hour intervals instead of 15-minute intervals?
It has to be based on the start and end columns.
Thanks,
Maybe something like the following using datepart?
select startdate, DatePart(hour,start) [Hour], Sum(sales) SalesPerHour
from t
group by startdate, DatePart(hour,start)

Overlap in seconds between datetime range and a time range

I have a dataframe like this:
df11 = pd.DataFrame(
{
"Start_date": ["2018-01-31 12:00:00", "2018-02-28 16:00:00", "2018-02-27 22:00:00"],
"End_date": ["2019-01-31 21:45:00", "2019-03-24 22:00:00", "2018-02-28 01:00:00"],
}
)
Start_date End_date
0 2018-01-31 12:00:00 2019-01-31 21:45:00
1 2018-02-28 16:00:00 2019-03-24 22:00:00
2 2018-02-27 22:00:00 2018-02-28 01:00:00
I need to compute the overlap duration, in seconds, with specific time-of-day periods. My expected results are like this:
Start_date End_date 12h-16h 16h-22h 22h-00h 00h-02h30
0 2018-01-31 12:00:00 2019-01-31 21:45:00 14400 20700 0 0
1 2018-02-28 16:00:00 2019-03-24 22:00:00 0 21600 0 0
2 2018-02-27 22:00:00 2018-02-28 01:00:00 0 0 7200 3600
I know it's completely wrong and I've tried other solutions. This is one of my attempts:
df11['12h-16h']=np.where(df11['Start_date']<timedelta(hours=16, minutes=0, seconds=0) & df11['End_date']>timedelta(hours=12, minutes=0, seconds=0),(np.minimum(df11['End_date'],timedelta(hours=16, minutes=0, seconds=0)))-(np.maximum(df11['Start_date'],timedelta(hours=12, minutes=0, seconds=0)))
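One way to reproduce the expected table above is the sketch below. It is only a sketch under the assumption that table implies: only the clock times of Start_date and End_date matter (so the rows spanning many months still contribute a single day), and the 00h-02h30 window belongs to the night following the start date.
import pandas as pd

df11 = pd.DataFrame(
    {
        "Start_date": ["2018-01-31 12:00:00", "2018-02-28 16:00:00", "2018-02-27 22:00:00"],
        "End_date": ["2019-01-31 21:45:00", "2019-03-24 22:00:00", "2018-02-28 01:00:00"],
    }
)
df11["Start_date"] = pd.to_datetime(df11["Start_date"])
df11["End_date"] = pd.to_datetime(df11["End_date"])

# daily windows as offsets from the start day's midnight; 24h-26h30 is 00h-02h30 of the next day
windows = {
    "12h-16h":   (pd.Timedelta(hours=12), pd.Timedelta(hours=16)),
    "16h-22h":   (pd.Timedelta(hours=16), pd.Timedelta(hours=22)),
    "22h-00h":   (pd.Timedelta(hours=22), pd.Timedelta(hours=24)),
    "00h-02h30": (pd.Timedelta(hours=24), pd.Timedelta(hours=26, minutes=30)),
}

def window_overlap(row, lo, hi):
    day = row["Start_date"].normalize()                           # midnight of the start day
    start = row["Start_date"]
    end = day + (row["End_date"] - row["End_date"].normalize())   # end's clock time on the start day
    if end <= start:                                               # end clock time is past midnight
        end += pd.Timedelta(days=1)
    overlap = min(end, day + hi) - max(start, day + lo)
    return max(overlap, pd.Timedelta(0)).total_seconds()

for col, (lo, hi) in windows.items():
    df11[col] = df11.apply(window_overlap, axis=1, args=(lo, hi))

print(df11)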

Find SUM of DATEDIFF on distinct pairs grouped by UserID?

So I have a query that looks like this:
SELECT
UserID,
FacilityMMXID,
ScheduleDate,
StartTime,
EndTime
FROM TblPASchedule
WHERE UserID = 244 AND MONTH(ScheduleDate) = 03 AND Year(ScheduleDate) = 2017
The output looks like this
UserID FacilityMMXID ScheduleDate StartTime EndTime
----------- ------------- ------------ ---------------- ----------------
244 1 2017-03-17 01:00:00 05:00:00
244 2 2017-03-17 01:00:00 05:00:00
244 3 2017-03-17 01:00:00 05:00:00
244 4 2017-03-17 01:00:00 05:00:00
244 5 2017-03-17 01:00:00 05:00:00
244 6 2017-03-17 01:00:00 05:00:00
244 7 2017-03-17 01:00:00 05:00:00
244 8 2017-03-17 01:00:00 05:00:00
244 9 2017-03-17 01:00:00 05:00:00
244 10 2017-03-17 01:00:00 05:00:00
244 11 2017-03-17 01:00:00 05:00:00
244 12 2017-03-17 01:00:00 05:00:00
244 13 2017-03-17 01:00:00 05:00:00
244 14 2017-03-17 01:00:00 05:00:00
244 15 2017-03-17 01:00:00 05:00:00
244 1 2017-03-17 05:00:00 22:00:00
244 2 2017-03-17 05:00:00 22:00:00
244 3 2017-03-17 05:00:00 22:00:00
244 4 2017-03-17 05:00:00 22:00:00
244 5 2017-03-17 05:00:00 22:00:00
244 6 2017-03-17 05:00:00 22:00:00
244 7 2017-03-17 05:00:00 22:00:00
244 8 2017-03-17 05:00:00 22:00:00
244 9 2017-03-17 05:00:00 22:00:00
244 10 2017-03-17 05:00:00 22:00:00
244 11 2017-03-17 05:00:00 22:00:00
244 12 2017-03-17 05:00:00 22:00:00
244 13 2017-03-17 05:00:00 22:00:00
244 14 2017-03-17 05:00:00 22:00:00
244 15 2017-03-17 05:00:00 22:00:00
I left out the ID column as it really isn't important in this case.
Also, yes, I realize that this table is very redundant. It isn't something I can currently fix, as I am not allowed to; I can only work on getting the aforementioned summing working.
The end goal is to pair off the distinct StartTime and EndTime values, take the date difference of each pair, and then sum those differences over the entire month.
This is as far as I have gotten:
Using:
SELECT
UserID,
DATEDIFF(HOUR, StartTime, EndTime) AS 'Hours Worked'
FROM TblPASchedule WHERE UserID = 244 AND MONTH(ScheduleDate) = 03 AND Year(ScheduleDate) = 2017
GROUP BY UserId, StartTime, EndTime
I get the output to be:
UserID Hours Worked
----------- ------------
244 4
244 17
But I am not too sure about where I should go from here.
I eventually need to make it group these sums based on the UserIDs, but one step at a time I suppose. I am using a where clause to work with a single id for now...
This query gets all the distinct sets of UserID, StartTime and EndTime:
;WITH CTE AS
(SELECT DISTINCT UserID, StartTime, EndTime FROM [dbo].[TblPASchedule])
SELECT SUM(DATEDIFF(MINUTE, StartTime, EndTime))/60.0 AS 'Hours Worked', UserID
FROM CTE GROUP BY UserID
The results look like this:
Hours Worked UserID
1.666666 19
1.233333 37
0.500000 38
Have you tried wrapping an additional subquery on top of your groups?
SELECT UserId, SUM([Hours Worked]) AS [Hours Worked] FROM (
SELECT
UserID,
DATEDIFF(HOUR, StartTime, EndTime) AS 'Hours Worked'
FROM TblPASchedule WHERE UserID = 244 AND MONTH(ScheduleDate) = 03 AND Year(ScheduleDate) = 2017
GROUP BY UserId, StartTime, EndTime
) AS temp
GROUP BY UserId