I've subtracted two time variables (sleeptime = waketime - bedtime) and, although I get the correct result, I need to categorize sleeptime into 2 categories (sleep = 0 if sleeptime >= 7 hours, or sleep = 1 if sleeptime < 7 hours).
The problem is that when I categorize the variable, I don't get the classification right. This is what I get:
bedtime waketime sleeptime sleep
22:00:00 07:00:00 09:00:00 1
22:30:00 06:30:00 08:00:00 1
00:55:00 08:10:00 07:15:00 0
02:30:00 08:30:00 06:00:00 1
Here's the code I've used:
data have; set want;
sleeptime = waketime - bedtime;
if sleeptime => '07:00:00't then sleep=0;
if sleeptime < '07:00:00't then sleep=1; run;
I've been thinking about converting sleeptime into a plain number of hours so that it's easier to categorize, for example:
bedtime waketime sleeptime sleeptime1
22:00:00 07:00:00 09:00:00 9
22:30:00 06:30:00 08:00:00 8
02:30:00 08:30:00 06:00:00 6
Any thoughts? Thanks for the help!
Time variables are numeric, so you're fine leaving it alone... but you're forgetting about midnight!
Either keep your variables as datetime (which keeps the date, so it lets you do this sort of thing just as you did it), or fudge it:
data have;
input bedtime :time8. waketime :time8.;
datalines;
22:00:00 07:00:00
22:30:00 06:30:00
00:55:00 08:10:00
02:30:00 08:30:00
;;;;
run;
data want;
set have;
sleeptime = waketime-bedtime + (86400*(bedtime gt waketime));
format bedtime waketime sleeptime time8.;
run;
This only works if you're sure it's always going to be true that waketime should be after bedtime. Seems likely, but worth pointing out. (And, 86400 is the number of seconds in 24 hours - you can also use '24:00:00't if you want.)
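For readers who want to see the midnight adjustment together with the 7-hour categorization the question asks for, here is a minimal sketch in Python rather than SAS. The helper names (`parse_hms`, `sleep_duration`, `sleep_flag`) are made up for illustration, and it assumes, as above, that a waketime earlier on the clock than bedtime means the next day.

```python
from datetime import timedelta

DAY = timedelta(hours=24)  # same quantity as 86400 seconds


def parse_hms(text):
    """Turn 'HH:MM:SS' into a timedelta since midnight."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def sleep_duration(bedtime, waketime):
    """Waketime minus bedtime, adding 24 hours when the night crosses midnight."""
    duration = parse_hms(waketime) - parse_hms(bedtime)
    if duration < timedelta(0):
        duration += DAY
    return duration


def sleep_flag(duration, threshold=timedelta(hours=7)):
    """0 when sleep lasted at least the threshold, 1 otherwise."""
    return 0 if duration >= threshold else 1


rows = [
    ("22:00:00", "07:00:00"),
    ("22:30:00", "06:30:00"),
    ("00:55:00", "08:10:00"),
    ("02:30:00", "08:30:00"),
]

results = []
for bedtime, waketime in rows:
    duration = sleep_duration(bedtime, waketime)
    results.append((bedtime, waketime, duration, sleep_flag(duration)))

for bedtime, waketime, duration, flag in results:
    print(bedtime, waketime, duration, flag)
```

With the four sample rows, only the last one (6 hours) ends up with sleep = 1.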
Details of the raw data (Mnth.nc):
netcdf Mnth {
dimensions:
time = UNLIMITED ; // (480 currently)
bnds = 2 ;
longitude = 25 ;
latitude = 33 ;
variables:
double time(time) ;
time:standard_name = "time" ;
time:long_name = "verification time generated by wgrib2 function verftime()" ;
time:bounds = "time_bnds" ;
time:units = "seconds since 1970-01-01 00:00:00.0 0:00" ;
time:calendar = "standard" ;
time:axis = "T" ;
double time_bnds(time, bnds) ;
double longitude(longitude) ;
longitude:standard_name = "longitude" ;
longitude:long_name = "longitude" ;
longitude:units = "degrees_east" ;
longitude:axis = "X" ;
double latitude(latitude) ;
latitude:standard_name = "latitude" ;
latitude:long_name = "latitude" ;
latitude:units = "degrees_north" ;
latitude:axis = "Y" ;
float APCP_sfc(time, latitude, longitude) ;
APCP_sfc:long_name = "Total Precipitation" ;
APCP_sfc:units = "kg/m^2" ;
APCP_sfc:_FillValue = 9.999e+20f ;
APCP_sfc:missing_value = 9.999e+20f ;
APCP_sfc:cell_methods = "time: sum" ;
APCP_sfc:short_name = "APCP_surface" ;
APCP_sfc:level = "surface" ;
}
Detailed information on the raw data (Mnth.nc):
File format : NetCDF4 classic
-1 : Institut Source T Steptype Levels Num Points Num Dtype : Parameter ID
1 : unknown unknown v instant 1 1 825 1 F32 : -1
Grid coordinates :
1 : lonlat : points=825 (25x33)
longitude : 87 to 89.88 by 0.12 degrees_east
latitude : 25.08 to 28.92 by 0.12 degrees_north
Vertical coordinates :
1 : surface : levels=1
Time coordinate : 480 steps
RefTime = 1970-01-01 00:00:00 Units = seconds Calendar = standard Bounds = true
YYYY-MM-DD hh:mm:ss YYYY-MM-DD hh:mm:ss YYYY-MM-DD hh:mm:ss YYYY-MM-DD hh:mm:ss
1980-01-16 12:30:00 1980-02-15 12:30:00 1980-03-16 12:30:00 1980-04-16 00:30:00
1980-05-16 12:30:00 1980-06-16 00:30:00 1980-07-16 12:30:00 1980-08-16 12:30:00
1980-09-16 00:30:00 1980-10-16 12:30:00 1980-11-16 00:30:00 1980-12-16 12:30:00
1981-01-16 12:30:00 1981-02-15 00:30:00 1981-03-16 12:30:00 1981-04-16 00:30:00
1981-05-16 12:30:00 1981-06-16 00:30:00 1981-07-16 12:30:00 1981-08-16 12:30:00
1981-09-16 00:30:00 1981-10-16 12:30:00 1981-11-16 00:30:00 1981-12-16 12:30:00
1982-01-16 12:30:00 1982-02-15 00:30:00 1982-03-16 12:30:00 1982-04-16 00:30:00
1982-05-16 12:30:00 1982-06-16 00:30:00 1982-07-16 12:30:00 1982-08-16 12:30:00
1982-09-16 00:30:00 1982-10-16 12:30:00 1982-11-16 00:30:00 1982-12-16 12:30:00
1983-01-16 12:30:00 1983-02-15 00:30:00 1983-03-16 12:30:00 1983-04-16 00:30:00
1983-05-16 12:30:00 1983-06-16 00:30:00 1983-07-16 12:30:00 1983-08-16 12:30:00
1983-09-16 00:30:00 1983-10-16 12:30:00 1983-11-16 00:30:00 1983-12-16 12:30:00
1984-01-16 12:30:00 1984-02-15 12:30:00 1984-03-16 12:30:00 1984-04-16 00:30:00
1984-05-16 12:30:00 1984-06-16 00:30:00 1984-07-16 12:30:00 1984-08-16 12:30:00
1984-09-16 00:30:00 1984-10-16 12:30:00 1984-11-16 00:30:00 1984-12-16 12:30:00
................................................................................
............................
2016-01-16 12:30:00 2016-02-15 12:30:00 2016-03-16 12:30:00 2016-04-16 00:30:00
2016-05-16 12:30:00 2016-06-16 00:30:00 2016-07-16 12:30:00 2016-08-16 12:30:00
2016-09-16 00:30:00 2016-10-16 12:30:00 2016-11-16 00:30:00 2016-12-16 12:30:00
2017-01-16 12:30:00 2017-02-15 00:30:00 2017-03-16 12:30:00 2017-04-16 00:30:00
2017-05-16 12:30:00 2017-06-16 00:30:00 2017-07-16 12:30:00 2017-08-16 12:30:00
2017-09-16 00:30:00 2017-10-16 12:30:00 2017-11-16 00:30:00 2017-12-16 12:30:00
2018-01-16 12:30:00 2018-02-15 00:30:00 2018-03-16 12:30:00 2018-04-16 00:30:00
2018-05-16 12:30:00 2018-06-16 00:30:00 2018-07-16 12:30:00 2018-08-16 12:30:00
2018-09-16 00:30:00 2018-10-16 12:30:00 2018-11-16 00:30:00 2018-12-16 12:30:00
2019-01-16 12:30:00 2019-02-15 00:30:00 2019-03-16 12:30:00 2019-04-16 00:30:00
2019-05-16 12:30:00 2019-06-16 00:30:00 2019-07-16 12:30:00 2019-08-16 12:30:00
2019-09-16 00:30:00 2019-10-16 12:30:00 2019-11-16 00:30:00 2019-12-16 12:30:00
2020-01-16 12:30:00 2020-02-15 12:30:00 2020-03-16 12:30:00 2020-04-16 00:30:00
2020-05-16 12:30:00 2020-06-16 00:30:00 2020-07-16 12:30:00 2020-08-16 12:30:00
2020-09-16 00:30:00 2020-10-16 12:30:00 2020-11-16 00:30:00 2020-12-16 12:30:00
cdo sinfo: Processed 1 variable over 480 timesteps [0.50s 30MB].
I extracted monthly rainfall values from the Mnth.nc file for a location (lon: 88.44; lat: 27.12) using the following commands:
cdo remapnn,lon=88.44_lat=27.12 Mnth.nc Mnth1.nc
cdo outputtab,year,month,value Mnth1.nc > Mnth.csv
The output is as follows:
Year month Value
1980 1 31.74219
1980 2 54.60938
1980 3 66.94531
1980 4 149.4062
1980 5 580.7227
1980 6 690.1328
1980 7 1146.305
1980 8 535.8164
1980 9 486.4688
1980 10 119.5391
1980 11 82.10547
1980 12 13.95703
Then I extracted the rainfall values from the same data (Mnth.nc) for the same location (lon: 88.44; lat: 27.12) using the multidimensional toolbox provided in ArcGIS. The result is as follows:
year month Value
1980 1 38.8125
1980 2 58.6542969
1980 3 71.7382813
1980 4 148.6367188
1980 5 564.7070313
1980 6 653.0390625
1980 7 1026.832031
1980 8 501.3164063
1980 9 458.5429688
1980 10 113.078125
1980 11 74.0976563
1980 12 24.2265625
Why am I getting different results from two different software packages for the same location and the same variable? Any help will be highly appreciated.
Thanks in advance.
The question is perhaps misleading, in that you are not "extracting" the data in either case. Instead you are interpolating it. The method used by CDO here is nearest neighbour. ArcGIS is probably simply using a different method, so you should expect slightly different results.
The results look very similar, so both are almost certainly working as advertised.
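To make the nearest-neighbour idea concrete, here is a small sketch in plain Python of how a point maps to a single cell on a regular grid. It only uses the axis description printed in the file summary above (longitude 87 to 89.88 and latitude 25.08 to 28.92, both by 0.12); the function name `nearest_index` is hypothetical and this is not CDO's actual implementation.

```python
def nearest_index(start, step, count, value):
    """Index of the coordinate closest to `value` on a regular axis."""
    index = round((value - start) / step)
    # Clamp so points outside the axis fall back to the edge cell.
    return min(max(index, 0), count - 1)


# Axis description taken from the grid summary of Mnth.nc.
LON_START, LAT_START, STEP = 87.0, 25.08, 0.12
N_LON, N_LAT = 25, 33

lon_idx = nearest_index(LON_START, STEP, N_LON, 88.44)
lat_idx = nearest_index(LAT_START, STEP, N_LAT, 27.12)

print("indices:", lon_idx, lat_idx)
print("cell centre:",
      round(LON_START + lon_idx * STEP, 2),
      round(LAT_START + lat_idx * STEP, 2))
```

Every point that rounds to the same pair of indices gets the value of that one cell, whereas a method that blends neighbouring cells would generally return a different number. With the axes as printed, the requested point (88.44, 27.12) coincides with a grid coordinate (indices 12 and 17).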
I think I ran into the same issue. I used CDO to extract a point and also used ArcGIS for cross-checking, and I found that the values were different.
Just to be sure, I recorded the extent of one particular cell and tried extracting values for different locations within that cell's boundary. CDO gave the same result for each of them, as expected, because it uses the nearest neighbour resampling method.
Then I tried the same with ArcGIS. Interestingly, in my case, ArcGIS sometimes gave the same result within the same cell boundary and sometimes a different one. I also checked the values using Panoply and realised that CDO gave accurate results, while ArcGIS was sometimes giving offset results, i.e., the values of nearby cells. As @Robert Wilson mentioned that ArcGIS must be using a different resampling method, I checked the results section after running the 'Netcdf to table view' tool and found that it also uses the nearest neighbour method. This is not an answer to your question, just something I found.
What would be the best way to get datetime ranges between records in SQL Server? I think it would be easiest to explain with an example.
I have the following data. The start and end datetime ranges of these records never overlap:
ID  Start               End
1   1/27/2021 06:00:00  1/27/2021 09:00:00
2   1/27/2021 10:00:00  1/27/2021 14:00:00
3   1/27/2021 21:00:00  1/28/2021 04:00:00
4   1/28/2021 06:00:00  1/28/2021 09:00:00
I need to get the date time range between records. So the resulting SQL query would return the following result set (ID doesn't matter):
ID  Start               End
1   1/27/2021 09:00:00  1/27/2021 10:00:00
2   1/27/2021 14:00:00  1/27/2021 21:00:00
3   1/28/2021 04:00:00  1/28/2021 06:00:00
Thanks for any help in advance.
Use lead():
select t.*
from (select id, [end] as start, lead(start) over (order by start) as [end]
      from t
     ) t
where [end] is not null;
Note: end is a lousy name for a column, given that it is a SQL keyword. I assume it is for illustrative purposes only.
Here is a SQL Fiddle.
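The idea behind lead() is simply "pair each row's end with the next row's start". A rough Python sketch of that pairing, using the sample rows from the question, might look like this. The function name `gaps_between` is made up for illustration, and like the query it assumes the ranges never overlap.

```python
from datetime import datetime


def gaps_between(ranges):
    """Pair each range's end with the start of the following range."""
    ordered = sorted(ranges)  # order by start, like the window's ORDER BY
    return [
        (prev_end, next_start)
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:])
    ]


ranges = [
    (datetime(2021, 1, 27, 6), datetime(2021, 1, 27, 9)),
    (datetime(2021, 1, 27, 10), datetime(2021, 1, 27, 14)),
    (datetime(2021, 1, 27, 21), datetime(2021, 1, 28, 4)),
    (datetime(2021, 1, 28, 6), datetime(2021, 1, 28, 9)),
]

gaps = gaps_between(ranges)
for gap_start, gap_end in gaps:
    print(gap_start, gap_end)
```

The last range has no successor, so it produces no gap, which matches what the `is not null` filter does in the query.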
The goal is to compute the delta between two times, each in a separate DataFrame column in 24-hour clock format, and store it in a new column "triptime".
Here is my input data, which has no dates, just 24-hour clock strings.
df = pd.DataFrame({'DepartureTime': ['2330', '1700', '0900'], 'ArrivalTime': ['0030','1900','1100']})
Here is my attempt
df['DepartureTime'] = pd.to_datetime(df.DepartureTime, format='%H%M')
df['ArrivalTime'] = pd.to_datetime(df.ArrivalTime, format='%H%M')
df['triptime'] = df.ArrivalTime - df.DepartureTime
This gives a wrong result for trips that cross midnight, as can be seen in the first row of the output below. Unfortunately my pipeline data assumes no change in dates. Any guidance on how I can have the triptime column show the actual trip time, without the days prefix?
IIUC you can use .dt.total_seconds() and divide by 3600 to return only the difference in hours.
df['triptime'] = (df.ArrivalTime - df.DepartureTime).dt.total_seconds() / 3600
#output
DepartureTime ArrivalTime triptime
0 1900-01-01 23:30:00 1900-01-01 00:30:00 -23.0
1 1900-01-01 17:00:00 1900-01-01 19:00:00 2.0
2 1900-01-01 09:00:00 1900-01-01 11:00:00 2.0
One way to get the interval when the day rolls over is to select all values less than zero and add 24. This appears to solve the problem, but I don't like it much; it seems highly susceptible to errors.
df.loc[df['triptime'] < 0, 'triptime'] = df['triptime'] + 24
#output
DepartureTime ArrivalTime triptime
0 1900-01-01 23:30:00 1900-01-01 00:30:00 1.0
1 1900-01-01 17:00:00 1900-01-01 19:00:00 2.0
2 1900-01-01 09:00:00 1900-01-01 11:00:00 2.0
The most correct and fail-safe way would be to have the full dates in addition to the departure and arrival times.
If, after the calculations, you want to remove the dates and keep only the times, use .dt.time:
df['DepartureTime'] = df['DepartureTime'].dt.time
df['ArrivalTime'] = df['ArrivalTime'].dt.time
#output
DepartureTime ArrivalTime triptime
0 23:30:00 00:30:00 1.0
1 17:00:00 19:00:00 2.0
2 09:00:00 11:00:00 2.0
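An alternative to patching negative values afterwards is to take the difference modulo 24 hours, which wraps a negative difference around midnight in one step. The sketch below uses only the standard library rather than pandas, the function name `trip_hours` is hypothetical, and it assumes every trip is shorter than 24 hours (identical departure and arrival times come out as 0).

```python
from datetime import datetime, timedelta


def trip_hours(departure, arrival):
    """Hours between two 'HHMM' clock strings, wrapping past midnight."""
    dep = datetime.strptime(departure, "%H%M")
    arr = datetime.strptime(arrival, "%H%M")
    # A negative difference (arrival on the next day) wraps to a positive one.
    elapsed = (arr - dep) % timedelta(days=1)
    return elapsed.total_seconds() / 3600


trips = [("2330", "0030"), ("1700", "1900"), ("0900", "1100")]
hours = [trip_hours(dep, arr) for dep, arr in trips]
print(hours)  # [1.0, 2.0, 2.0]
```

This still cannot tell a 1-hour trip from a 25-hour one, so full dates remain the safer input when they are available.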
I am trying to write a code in SAS. I have a dataset as follows:
data one;
input CLI date date9. time time8. ;
format date date9. time hhmm8. ;
cards;
5 01apr2014 10:00:00
6 01apr2014 11:00:00
10 01apr2014 12:00:00
4 02Apr2014 10:00:00
20 02apr2014 11:00:00
12 02apr2014 12:00:00
;
run;
I would like to obtain a dataset as follows:
data two;
date time New_cli
01apr2014 10:00:00 1
01apr2014 10:00:00 1
01apr2014 10:00:00 1
01apr2014 10:00:00 1
01apr2014 10:00:00 1
01apr2014 11:00:00 1
01apr2014 11:00:00 1
01apr2014 11:00:00 1
01apr2014 11:00:00 1
01apr2014 11:00:00 1
01apr2014 11:00:00 1
.
.
.
02Apr2014 10:00:00 1
02Apr2014 10:00:00 1
02Apr2014 10:00:00 1
02Apr2014 10:00:00 1
.
.
That is, each observation in data "one" should be repeated CLI times in "two" (e.g. the first observation in "one", 01apr2014 at 10:00, should be repeated 5 times in "two", the second one 6 times, etc.).
Could someone help me? Many thanks!
Use a do loop from 1 to CLI, and use an output statement within the loop to output a row for every iteration of the loop. SAS will automatically resolve CLI to the value that it holds, and run the do loop exactly that many times.
data want;
set have;
do i = 1 to CLI;
new_cli = 1;
output;
end;
drop i;
run;
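For comparison, the same "output one row per loop iteration" pattern can be sketched in Python with the sample rows from the question. The function name `expand` is made up for illustration; the inner loop plays the role of the do loop with the output statement.

```python
rows = [
    (5, "01apr2014", "10:00:00"),
    (6, "01apr2014", "11:00:00"),
    (10, "01apr2014", "12:00:00"),
    (4, "02apr2014", "10:00:00"),
    (20, "02apr2014", "11:00:00"),
    (12, "02apr2014", "12:00:00"),
]


def expand(rows):
    """Repeat each (date, time) pair CLI times, with new_cli fixed at 1."""
    expanded = []
    for cli, date, time in rows:
        for _ in range(cli):  # one output row per iteration
            expanded.append((date, time, 1))
    return expanded


expanded = expand(rows)
print(len(expanded))  # 57 rows in total
for row in expanded[:6]:
    print(*row)
```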
How can I get the overall start and end time for each employee from this list? I could add the date to the times and take the min and max, but as you can see, row 3 falls on the next calendar day yet is recorded under the same shift date because it is a night shift.
I have also added a normal day-shift employee to get the logic right.
EmployeeId ShiftDate ShiftStartTime ShiftEndTime
-----------------------------------------------------
20040 2017-11-01 21:00:00 23:00:00
20040 2017-11-01 23:00:00 00:30:00
20040 2017-11-01 00:30:00 06:00:00
20124 2017-11-01 09:00:00 16:30:00
20124 2017-11-01 16:30:00 22:00:00
20124 2017-11-01 22:00:00 22:30:00
I need it like below:
EmployeeId ShiftDate ShiftStartTime ShiftEndTime
----------------------------------------------------
20040 2017-11-01 21:00:00 06:00:00
20124 2017-11-01 09:00:00 22:30:00
In a commercial environment we solved this by attaching a FLAG to each shift. The flag indicates the 'Reporting Date' of the shift: it has a value of 1 if the 'Reporting / Administrative date' is the 'next' day, 0 for the same day, and -1 for the previous day (which we never used; it depends on your scenario).
I modified your table to show a possible SHIFTS table, which should probably also have a NAME column (like Morning, Afternoon, Day, Night shift, etc.):
ReportFlag ShiftStartTime ShiftEndTime
1 21:00:00 23:00:00
1 23:00:00 00:30:00
0 00:30:00 06:00:00
0 09:00:00 16:30:00
0 16:30:00 22:00:00
1 22:00:00 22:30:00
Notice how I added 1 to say that 'this shift' is actually considered to be on the 'next' day.
Then you can add your flag value (0 or 1) to the DATE functions in your queries too.
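One possible reading of the flag approach, sketched in Python: treat ShiftDate as the reporting date, so a row flagged 1 really starts on the previous calendar day, and an end time that is not later than the start time rolls over to the next day. The function names and the per-row flags below are assumptions chosen to reproduce the output requested in the question (in particular, the 22:00 row of the day-shift employee is flagged 0 here).

```python
from datetime import date, datetime, time, timedelta


def shift_bounds(shift_date, start, end, report_flag):
    """Real start/end datetimes for one row; flag 1 = starts the day before."""
    start_dt = datetime.combine(shift_date - timedelta(days=report_flag), start)
    end_dt = datetime.combine(start_dt.date(), end)
    if end_dt <= start_dt:  # the row itself crosses midnight
        end_dt += timedelta(days=1)
    return start_dt, end_dt


def summarize(shifts):
    """Earliest start and latest end per (employee, reporting date)."""
    summary = {}
    for employee, shift_date, start, end, flag in shifts:
        start_dt, end_dt = shift_bounds(shift_date, start, end, flag)
        key = (employee, shift_date)
        if key in summary:
            first, last = summary[key]
            summary[key] = (min(first, start_dt), max(last, end_dt))
        else:
            summary[key] = (start_dt, end_dt)
    return summary


D = date(2017, 11, 1)
shifts = [  # (EmployeeId, ShiftDate, start, end, hypothetical flag)
    (20040, D, time(21, 0), time(23, 0), 1),
    (20040, D, time(23, 0), time(0, 30), 1),
    (20040, D, time(0, 30), time(6, 0), 0),
    (20124, D, time(9, 0), time(16, 30), 0),
    (20124, D, time(16, 30), time(22, 0), 0),
    (20124, D, time(22, 0), time(22, 30), 0),
]

summary = summarize(shifts)
for (employee, shift_date), (first, last) in summary.items():
    print(employee, shift_date, first.time(), last.time())
```

Because the min and max are taken over full datetimes rather than bare times, the night shift comes out as 21:00:00 to 06:00:00 and the day shift as 09:00:00 to 22:30:00.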