mktime() returns an incorrect value right after entering DST

The following code snippet is from rtc.c in busybox-1.22.1.
In my case utc is always 0, so this function simply converts a struct tm to a time_t.
time_t FAST_FUNC rtc_tm2time(struct tm *ptm, int utc)
{
//fprintf(stdout, "ptm->tm_hour: %d\n", ptm->tm_hour);
char *oldtz = oldtz; /* for compiler */
time_t t;
if (utc) {
oldtz = getenv("TZ");
putenv((char*)"TZ=UTC0");
tzset();
}
t = mktime(ptm); //problem here
//struct tm* temp = localtime(&t);
//fprintf(stdout, "temp->tm_hour: %d\n", temp->tm_hour);
if (utc) {
unsetenv("TZ");
if (oldtz)
{
putenv(oldtz - 3);
}
tzset();
}
return t;
}
Also, there is a file /etc/TZ displaying timezone and DST information.
~ # cat /etc/TZ
LMT0:00LMT-1:00,M8.5.1/10,M12.5.1/10
Then I set the system time to 2021/8/30 09:59:30 (30 seconds before DST starts) and synced the hardware clock to it.
date -s 2021.08.30-09:59:30 >/dev/null 2>/dev/null //set system time
hwclock -w //sync RTC to system time
I then ran hwclock repeatedly while watching the output on the CLI:
~ # hwclock
ptm->tm_hour : 9
temp->tm_hour : 9
Mon Aug 30 09:59:58 2021 0.000000 seconds
~ # hwclock
ptm->tm_hour : 10
temp->tm_hour : 11 //why not 10?
Mon Aug 30 11:00:00 2021 0.000000 seconds
Why does the value returned by mktime() gain an hour when DST begins? Shouldn't it be unaffected by DST?

According to the mktime() man pages, mktime() is allowed to update the tm_isdst value. The starting value may cause the mktime() algorithm to branch differently:
The value specified in the tm_isdst field informs mktime() whether or not
daylight saving time (DST) is in effect for the time supplied in the tm
structure: a positive value means DST is in effect; zero means that DST is not in
effect; and a negative value means that mktime() should (use timezone information
and system databases to) attempt to determine whether DST is in effect at the
specified time.
and then update the value of tm_isdst accordingly:
tm_isdst is set (regardless of its initial value) to a positive value or to 0,
respectively, to indicate whether DST is or is not in effect at the specified time.
In other words, I'd check the tm_isdst value before and after the mktime() call.
One way I've dealt with this in the past is to:
1. Call mktime() with a known value of tm_isdst (e.g. zero or one).
2. Call localtime() on the returned time_t value, then check tm_isdst in the struct tm that localtime() returns.
3. If tm_isdst differs from the known value, set the original struct tm's tm_isdst to the new value and call mktime() again before trusting the time_t it returns.
This is less efficient, but it lets you detect a DST change by expecting it and checking for it.
Another option is to set tm_isdst to -1 before calling mktime() and trust it to look up the timezone rules and set tm_isdst appropriately.
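The check-and-retry procedure can be sketched in Python, whose time.mktime() and time.localtime() wrap the C functions of the same name. This is only an illustration under assumptions: the POSIX TZ string below (US Eastern rules) is an example zone and not the asker's /etc/TZ, the helper name local_to_epoch is hypothetical, and time.tzset() is available on Unix only.

```python
import os
import time

# Assumed example zone (POSIX TZ string, US Eastern rules); not the asker's /etc/TZ.
os.environ["TZ"] = "EST5EDT,M3.2.0,M11.1.0"
time.tzset()  # Unix only


def local_to_epoch(year, month, day, hour, minute, second, isdst_guess=0):
    """Convert local wall-clock fields to epoch seconds, re-checking tm_isdst."""
    # The two zeros stand in for tm_wday and tm_yday, which mktime() ignores on input.
    fields = (year, month, day, hour, minute, second, 0, 0)
    # Step 1: call mktime() with a known tm_isdst value.
    t = time.mktime(fields + (isdst_guess,))
    # Step 2: ask localtime() which DST state applies at that instant.
    actual = time.localtime(t).tm_isdst
    # Step 3: if the guess was wrong, redo the conversion with the corrected flag.
    if actual != isdst_guess:
        t = time.mktime(fields + (actual,))
    return t


summer = local_to_epoch(2021, 7, 1, 12, 0, 0)                  # guess "no DST" in summer
winter = local_to_epoch(2021, 1, 15, 12, 0, 0, isdst_guess=1)  # guess "DST" in winter
print(time.strftime("%Y-%m-%d %H:%M:%S %Z", time.localtime(summer)))
print(time.strftime("%Y-%m-%d %H:%M:%S %Z", time.localtime(winter)))
```

Both calls start from a deliberately wrong guess, so the retry path is exercised in each case.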

Related

How do I convert a UNIX timestamp to ISO in CMake

I have a UNIX-style timestamp that looks like 1587405820 -0600 which I would like to convert to an ISO style format, something like YYYY-MM-DDTHH:MM:SSZ
CMake has a string(TIMESTAMP ...) command at https://cmake.org/cmake/help/v3.12/command/string.html#timestamp but this only gets me the current time in a formatted string which does not work for my application. I need to be able to convert an existing time into ISO format.
Is there a way to do this?
UPDATE
Based on @squareskittles' answer, here's the test I ended up with, which does the right thing:
# Check that we get the current timestamp
string(TIMESTAMP TIME_T UTC)
message(STATUS ">>> T1: ${TIME_T}")
# Get the ISO string from our specific timestamp
set(ENV{SOURCE_DATE_EPOCH} 1587405820)
string(TIMESTAMP TIME_T UTC)
unset(ENV{SOURCE_DATE_EPOCH})
message(STATUS ">>> T2: ${TIME_T}")
# Check that we get the current timestamp again correctly
string(TIMESTAMP TIME_T UTC)
message(STATUS ">>> T3: ${TIME_T}")
Which gives me this output:
-- >>> T1: 2020-04-22T15:08:13Z
-- >>> T2: 2020-04-20T18:03:40Z
-- >>> T3: 2020-04-22T15:08:13Z
If you want this function to use a specific time other than the current time, you can set the environment variable SOURCE_DATE_EPOCH to the UNIX-style timestamp (integer):
# Set the environment variable to a specific timestamp.
set(ENV{SOURCE_DATE_EPOCH} 1587405820)
# Convert to ISO format, and print it.
string(TIMESTAMP MY_TIME)
message(STATUS ${MY_TIME})
prints (for UTC -0600):
2020-04-20T12:03:40
If you need to adjust this time to UTC time, you can add the UTC argument:
set(ENV{SOURCE_DATE_EPOCH} 1587405820)
string(TIMESTAMP MY_TIME UTC)
message(STATUS ${MY_TIME})
prints:
2020-04-20T18:03:40Z
Note: If this SOURCE_DATE_EPOCH variable is used elsewhere in your CMake code, it is best to save the SOURCE_DATE_EPOCH value before modifying it, so it can be set back to its previous value when complete.
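As a cross-check outside CMake, the same epoch value can be converted with a few lines of Python. This is only a sketch for verifying the expected UTC string; the function name epoch_to_iso_utc is hypothetical and is not part of CMake.

```python
from datetime import datetime, timezone


def epoch_to_iso_utc(epoch_seconds):
    """Format integer epoch seconds as an ISO 8601 UTC string."""
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%dT%H:%M:%SZ")


print(epoch_to_iso_utc(1587405820))  # 2020-04-20T18:03:40Z
```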

When calling expire or pexpire in redis, is time rounded up?

I am calling expire for an existing redis key. Let's say I am passing 5 for the value. If the key already exists and is 4.75 seconds away from expiring, does it stay at 4.75 seconds or is it rounded back up to 5 seconds?
I can use pexpire to get more granularity, but there is still a rounding problem with partial milliseconds - unless milliseconds is the smallest granularity in redis...
If it helps, here is my rate limit script, which takes a key, an amount to increment and a millisecond rate limit window, which keeps decrementing until the key drops out, at which point the next call adds the key and sets a fresh expire time. The new incremented value is then returned.
local f,k,a,b,c c=ARGV[2] f=redis.call k=KEYS[1] a=f('incrby',k,ARGV[1]) b=f('pttl',k) f('pexpire',k,math.min(b<0 and c or b,c)) return a
UPDATE
New rate limit script that does not have partial time issue, it only sets expire if the key does not have an expire set at all:
local f,k,a,b f=redis.call k=KEYS[1] a=f('incrby',k,ARGV[1]) b=f('pttl',k) if b<0 then f('pexpire',k,ARGV[2]) end return a
Does it stay at 4.75 seconds or is it rounded back up to 5 seconds?
It is reset to the full 5-second TTL.
unless milliseconds is the smallest granularity in redis...
It is milliseconds, for Redis 2.6 or later.
See Expire accuracy
In Redis 2.4 the expire might not be pin-point accurate, and it could
be between zero to one seconds out. Since Redis 2.6 the expire error
is from 0 to 1 milliseconds.
And
Keys expiring information is stored as absolute Unix timestamps (in milliseconds in case of Redis version 2.6 or greater).
You can play with some Lua scripts if you want to verify
EVAL "local result = {'Time at start', 0, 'Expires in (ms)', 0, 'Time at end', 0} \n result[2] = redis.call('TIME') \n redis.call('EXPIRE', KEYS[1], ARGV[1]) \n result[4] = redis.call('PTTL', KEYS[1]) \n result[6] = redis.call('TIME') \n return result" 1 myKey 5
Friendly view of the script:
local result = {'Time at start', 0, 'Expires in (ms)', 0, 'Time at end', 0}
result[2] = redis.call('TIME')
redis.call('EXPIRE', KEYS[1], ARGV[1])
result[4] = redis.call('PTTL', KEYS[1])
result[6] = redis.call('TIME')
return result
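The reset behaviour follows from the expiry being stored as an absolute timestamp. Here is a toy in-memory model in Python. It is not Redis itself: the class and method names are hypothetical, the -1 return value is a simplification, and the clock is passed in explicitly so the example is deterministic.

```python
class ToyExpiryStore:
    """Toy model of absolute-timestamp expiry; not the Redis implementation."""

    def __init__(self):
        self._expire_at_ms = {}  # key -> absolute expiry time in milliseconds

    def pexpire(self, key, ttl_ms, now_ms):
        # The remaining TTL is never consulted: the deadline is simply overwritten.
        self._expire_at_ms[key] = now_ms + ttl_ms

    def expire(self, key, ttl_s, now_ms):
        self.pexpire(key, ttl_s * 1000, now_ms)

    def pttl(self, key, now_ms):
        # Simplification: -1 means "no expiry recorded" in this model.
        if key not in self._expire_at_ms:
            return -1
        return max(self._expire_at_ms[key] - now_ms, 0)


store = ToyExpiryStore()
store.expire("myKey", 5, now_ms=0)
remaining_before = store.pttl("myKey", now_ms=250)  # 4750 ms left
store.expire("myKey", 5, now_ms=250)                # same as calling EXPIRE 5 again
remaining_after = store.pttl("myKey", now_ms=250)   # back to 5000 ms
print(remaining_before, remaining_after)
```

Because the second call overwrites the stored deadline, the remaining 4.75 seconds is neither kept nor rounded. It is replaced by a fresh 5 seconds.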

AWK complains about number of fields when extracting variables

I have a script to parse a TeamCity directory map file. The script works, but I want to know why refactoring it into using variables breaks it with a seemingly unrelated error message and how I can still have it work using variables.
MAP=/opt/TeamCity/buildAgent/work/directory.map
sed -n -e '1,3d;1,/#/{/#/!p}' $MAP | \
awk ' {
n=split($0, array, "->");
printf(substr(array[1], 6) substr(array[2],2,16) "\n");
}
'
This prints
nicecorp::Master 652293808ace4eb5
nicecorp::Reset Database 652293808ace4eb5
nicecorp::test-single-steps 652293808ace4eb5
nicecorp::Develop 652293808ace4eb5
nicecorp::Pull Requests 652293808ace4eb5
Which is pretty much what I want.
The refactoring that breaks
But then I was trying to extract the sub strings into variables, and the script broke. I changed the last printf statement into this
proj=substr(array[1], 6);
tcdir=substr(array[2],2,16);
printf($proj" " $tcdir);
That just prints this error, although I thought it was more or less the same?
awk: program limit exceeded: maximum number of fields size=32767
FILENAME="-" FNR=1 NR=1
This error seems a bit weird, given that my total input is about 500 bytes, 60 times less than the limit they complain about with regards to fields.
AWK version: mawk (1994)
Data format ($ head -10 directory.map):
#Don't edit this file!
#Nov 5, 2019 1:49:26 PM UTC
--version=2
bt30=nicecorp::Master -> 652293808ace4eb5 |?| Oct 29, 2019 4:14:27 PM UTC |:| default
bt32=nicecorp::Reset Database -> 652293808ace4eb5 |?| Oct 30, 2019 1:01:48 PM UTC |:| default
bt33=nicecorp::test-single-steps -> b96874cc9acaf874 |?| Nov 4, 2019 4:20:13 PM UTC |:| default
bt33=nicecorp::test-single-steps -> 652293808ace4eb5 |?| Nov 5, 2019 9:00:37 AM UTC |:| default
bt28=nicecorp::Develop -> 652293808ace4eb5 |?| Nov 5, 2019 1:07:53 PM UTC |:| default
bt29=nicecorp::Pull Requests -> 652293808ace4eb5 |?| Nov 5, 2019 1:18:08 PM UTC |:| default
#
The source of the problem is that the printf statement in the refactored code uses shell notation for variables ($proj instead of proj, $tcdir instead of tcdir). In awk, $ is the field-reference operator, so $tcdir means "the field whose number is the value of tcdir".
When the value starts with digits (e.g., tcdir=652293808ace4eb5 for the first line), awk (mawk in this case) tries to print field number 652293808. Current versions of gawk do not fail here: they recognize that the record has only a few fields and return an empty string for that field (or the full line, $0, if the value is non-numeric).
Older versions may attempt to extend the field array to the requested index, resulting in the "limit exceeded" message.
Also note two minor issues: the refactored code uses the value of proj as the printf format string, so it will misbehave if the value contains '%', and the trailing newline is missing. Did you really mean printf and not print?
Fix:
proj=substr(array[1], 6);
tcdir=substr(array[2],2,16);
# Should consider print, instead of printf
printf(proj " " tcdir "\n");
# print proj, tcdir
The problem was syntax. I was using the shell-style $tcdir to insert the value of the variable instead of simply tcdir. By some means unknown to me, the tcdir portion of $tcdir is resolved to a numeric field index, meaning I was trying to print the value of a field, not the variable tcdir.
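The conversion from string to field number can be modelled roughly in Python. This is an approximation of awk's string-to-number coercion (the leading numeric prefix is used, and a string without one counts as 0), not mawk's actual code; the function name awk_field_index is hypothetical.

```python
import re

# Leading numeric prefix: optional sign, digits with optional fraction, optional exponent.
_NUMERIC_PREFIX = re.compile(r"\s*[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?")


def awk_field_index(value):
    """Approximate the field number awk derives from a string used after `$`."""
    match = _NUMERIC_PREFIX.match(value)
    if match is None:
        return 0  # no numeric prefix: treated as 0, so `$value` means `$0`
    return int(float(match.group(0)))


print(awk_field_index("652293808ace4eb5"))  # digits before the first letter
print(awk_field_index("nicecorp::Master"))  # no digits at the start
```

Under this model $tcdir asks for field 652293808, far above the 32767 limit that mawk reports, while $proj resolves to $0, the whole input line.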

Create timestamp with fractional seconds

awk can generate a timestamp with strftime function, e.g.
$ awk 'BEGIN {print strftime("%Y/%m/%d %H:%M:%S")}'
2019/03/26 08:50:42
But I need a timestamp with fractional seconds, ideally down to nanoseconds. gnu date can do this with the %N element:
$ date "+%Y/%m/%d %H:%M:%S.%N"
2019/03/26 08:52:32.753019800
But it is relatively inefficient to invoke date from within awk compared to calling strftime, and I need high performance as I'm processing many large files with awk and need to generate many timestamps while processing the files. Is there a way that awk can efficiently generate a timestamp that includes fractional seconds (ideally nanoseconds, but milliseconds would be acceptable)?
Adding an example of what I am trying to perform:
awk -v logFile="$logFile" -v outputFile="$outputFile" '
BEGIN {
print "[" strftime("%Y%m%d %H%M%S") "] Starting to process " FILENAME "." >> logFile
}
{
data[$1] += $2
}
END {
print "[" strftime("%Y%m%d %H%M%S") "] Processed " NR " records." >> logFile
for (id in data) {
print id ": " data[id] >> outputFile
}
}
' oneOfManyLargeFiles
If you really need subsecond timing, then any call to an external command such as date, or any read of an external system file such as /proc/uptime or /proc/rtc, defeats the purpose of subsecond accuracy. Both approaches require too many resources to retrieve the requested information (i.e. the time).
Since the OP already uses GNU awk, a dynamic extension is an option. Dynamic extensions add new functionality to awk through functions written in C or C++ that gawk loads dynamically. How to write these functions is documented at length in the GNU awk manual.
Luckily, GNU awk 4.2.1 comes with a set of default dynamic libraries which can be loaded at will. One of these libraries is a time library with two simple functions:
the_time = gettimeofday()
Return the time in seconds that has elapsed since 1970-01-01 UTC as a floating-point value. If the time is unavailable on this platform, return -1 and set ERRNO. The returned time should have sub-second precision, but the actual precision may vary based on the platform. If the standard C gettimeofday() system call is available on this platform, then it simply returns the value. Otherwise, if on MS-Windows, it tries to use GetSystemTimeAsFileTime().
result = sleep(seconds)
Attempt to sleep for seconds seconds. If seconds is negative, or the attempt to sleep fails, return -1 and set ERRNO. Otherwise, return zero after sleeping for the indicated amount of time. Note that seconds may be a floating-point (nonintegral) value. Implementation details: depending on platform availability, this function tries to use nanosleep() or select() to implement the delay.
source: GNU awk manual
It is now possible to call this function in a rather straightforward way:
awk '@load "time"; BEGIN{printf "%.6f", gettimeofday()}'
1553637193.575861
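The value printed is epoch seconds with a fractional part, not yet the formatted timestamp the question asks for. Getting there means formatting the whole seconds and appending the fraction separately. Here is a minimal sketch of that split in Python; the function name format_epoch is hypothetical, and UTC is used so the result does not depend on the local time zone.

```python
from datetime import datetime, timezone


def format_epoch(epoch_seconds):
    """Format float epoch seconds as 'YYYY/MM/DD HH:MM:SS.ffffff' in UTC."""
    # Work in integer microseconds so a fraction never rounds up to 1000000.
    total_us = round(epoch_seconds * 1_000_000)
    whole_seconds, microseconds = divmod(total_us, 1_000_000)
    stamp = datetime.fromtimestamp(whole_seconds, tz=timezone.utc)
    return stamp.strftime("%Y/%m/%d %H:%M:%S") + f".{microseconds:06d}"


print(format_epoch(1553637193.5))
```

A double cannot carry nanosecond resolution at current epoch values, so microseconds are about the practical limit for a float timestamp.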
To demonstrate that this method is faster than the more classic implementations, I timed all three implementations using gettimeofday():
awk '@load "time"
function get_uptime( a) {
if((getline line < "/proc/uptime") > 0)
split(line,a," ")
close("/proc/uptime")
return a[1]
}
function curtime( cmd, line, time) {
cmd = "date \047+%Y/%m/%d %H:%M:%S.%N\047"
if ( (cmd | getline line) > 0 ) {
time = line
}
else {
print "Error: " cmd " failed" | "cat>&2"
}
close(cmd)
return time
}
BEGIN{
t1=gettimeofday(); curtime(); t2=gettimeofday();
print "curtime()",t2-t1
t1=gettimeofday(); get_uptime(); t2=gettimeofday();
print "get_uptime()",t2-t1
t1=gettimeofday(); gettimeofday(); t2=gettimeofday();
print "gettimeofday()",t2-t1
}'
which outputs:
curtime() 0.00519109
get_uptime() 7.98702e-05
gettimeofday() 9.53674e-07
While it is evident that curtime() is the slowest as it loads an external binary, it is rather startling to see that awk is blazingly fast in processing an extra external /proc/ file.
If you are on Linux, you could use /proc/uptime:
$ cat /proc/uptime
123970.49 354146.84
to get some centiseconds (the first value is the uptime) and compute the time difference between the beginning and whenever something happens:
$ while true ; do echo ping ; sleep 0.989 ; done | # yes | awk got confusing
awk '
function get_uptime( a, line) {
if((getline line < "/proc/uptime") > 0)
split(line,a," ")
close("/proc/uptime")
return a[1]
}
BEGIN {
basetime=get_uptime()
}
{
if(!wut) # define here the cause
print get_uptime()-basetime # calculate time difference
}'
Output:
0
0.99
1.98
2.97
3.97

stop tcl procedure from running more regularly than once every 2 minutes, regardless of how often it is called

I have a procedure that I want to trigger at most once every 2 minutes. If it has triggered within the last 2 minutes, it should just exit. Currently it has a small delay before it executes (after 5000), but this also seems to queue any other execution requests that arrive during the quiet period (5 seconds) and then fire the queued commands all at once in a flurry of activity.
Obviously this is missing a significant portion, but
I have considered doing something with variables like:
#gets current time since epoch as var0
set timer0 clock format [clock seconds] -format %s
#gets current time since epoch as var1
set timer1 clock format [clock seconds] -format %s
#calculates elapsed time since epoch in seconds, converts seconds to minutes
set split [ expr (($timer0 - $timer1)/60)/60 ]
if { $split > 2 } {
my_proc $maybe_1var $maybe_2vars $maybe_3vars
} else {
exit super gracefully
}
I can provide snippets of my current code if you like. It is just not nearly as elegant as I imagine it could be, and I am not sure if there are better ways of doing this in Tcl.
One possible solution:
set lastExecuted ""
proc foo {} {
global lastExecuted
set currentTimestamp [clock seconds]
# Skip this call if the proc last ran less than 120 seconds ago
if {$lastExecuted ne "" && ($currentTimestamp - $lastExecuted) < 120} {
return
}
set lastExecuted $currentTimestamp
# continue execution
}
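The same "remember the last run and bail out early" idea can be sketched in Python with an injectable clock, which makes the two-minute rule easy to test without waiting. The Throttle class and its method names are hypothetical, chosen for illustration only.

```python
import time


class Throttle:
    """Allow an action at most once per min_interval_s seconds."""

    def __init__(self, min_interval_s, clock=time.monotonic):
        self._min_interval_s = min_interval_s
        self._clock = clock      # injectable so tests can supply a fake clock
        self._last_run = None    # None means the action has never run

    def try_acquire(self):
        """Return True and record the time if enough time has passed, else False."""
        now = self._clock()
        if self._last_run is not None and now - self._last_run < self._min_interval_s:
            return False
        self._last_run = now
        return True


def run_throttled(throttle, action):
    """Run action() only when the throttle allows it; otherwise return None."""
    if not throttle.try_acquire():
        return None
    return action()
```

Rejected calls are dropped rather than queued, so no backlog builds up to be released later.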