I'm struggling with a while() loop in a Conky script.
Here's what I want to do:
I'm piping a command's output to awk, extracting and formatting data.
The problem is: the output can contain 1 to n sections, and I want to get values from each one of them.
Here's the output sent to awk:
1) -----------
name: wu_1664392603_228876_0
WU name: wu_1664392603_228876
project URL: https://boinc.loda-lang.org/loda/
received: Thu Oct 6 15:31:40 2022
report deadline: Thu Oct 13 15:31:40 2022
ready to report: no
state: downloaded
scheduler state: scheduled
active_task_state: EXECUTING
app version num: 220917
resources: 1 CPU
estimated CPU time remaining: 1379.480287
elapsed task time: 5858.009798
slot: 1
PID: 2221366
CPU time at last checkpoint: 5690.500000
current CPU time: 5712.920000
fraction done: 0.809000
swap size: 1051 MB
working set size: 973 MB
2) -----------
name: wu_1664392603_228908_0
WU name: wu_1664392603_228908
project URL: https://boinc.loda-lang.org/loda/
received: Thu Oct 6 15:31:53 2022
report deadline: Thu Oct 13 15:31:53 2022
ready to report: no
state: downloaded
scheduler state: scheduled
active_task_state: EXECUTING
app version num: 220917
resources: 1 CPU
estimated CPU time remaining: 1393.925106
elapsed task time: 5849.961764
slot: 7
PID: 2221367
CPU time at last checkpoint: 5654.640000
current CPU time: 5682.160000
fraction done: 0.807000
swap size: 802 MB
working set size: 728 MB
...
And here's the final output I want :
boinc.loda wu_1664392603_2288 80.9 07/10 01h37
boinc.loda wu_1664392603_2289 80.7 07/10 02h38
I managed to get the data I want ("WU name", "project URL", "estimated CPU time remaining" and "fraction done") from one particular section using this code:
${execi 60 boinccmd --get_tasks | awk -F': |://|/' '\
/URL/ && ++i==1 {u=$3}\
/WU/ && ++j==1 {w=$2}\
/fraction/ && ++k==1 {p=$2}\
/estimated/ && ++l==1 {e=strftime("%d/%m %Hh%M",$2+systime())}\
END {printf "%.10s %.18s %3.1f %s", u, w, p*100, e}\
'}
This is quite inelegant, as I must repeat this code n times, increasing the i, j, k, l values to get the whole dataset (n is related to CPU threads; my PC has 8 threads, so I repeat the code 8 times).
I'd like the script to adapt to other CPUs, where n could be anything from 1 to ...
The obvious solution is to use a while() loop, parsing the whole dataset.
But nesting a conditional loop into an awk sequence calling an external command seems too tricky for me, and Conky scripts aren't easy to debug: if a script's syntax is bad, Conky may hang without any error output or log.
Any help will be appreciated :)
Assumptions:
the sample input shows two values for estimated that are ~14.5 seconds apart (1379.480287 and 1393.925106), but the expected output shows the estimated values as being ~61 minutes apart (07/10 01h37 and 07/10 02h38); for now I'm going to assume this is due to the OP's repeated runs of execi returning widely varying values for the estimated lines
each section of execi output always contains 4 matching lines (URL, WU, fraction, estimated), and these 4 strings only occur once within a section of execi output
I don't have execi installed on my system, so to emulate the OP's execi I've cut-n-pasted the OP's sample execi results into a local file named execi.dat.
Tweaking the OP's current awk script also lets us eliminate the need for a bash loop (that repeatedly calls execi | awk):
awk -F': |://|/' '
FNR==1      { st=systime() }                              # grab the current time once
/URL/       { found++; u=$3 }                             # project host, e.g. boinc.loda-lang.org
/WU/        { found++; w=$2 }                             # WU name
/fraction/  { found++; p=$2 }                             # fraction done
/estimated/ { found++; e=strftime("%d/%m %Hh%M",$2+st) }  # remaining seconds -> finish date/time
found==4    { printf "%.10s %.18s %3.1f %s\n", u, w, p*100, e; found=0 }
' execi.dat
This generates:
boinc.loda wu_1664392603_2288 80.9 06/10 17h47
boinc.loda wu_1664392603_2289 80.7 06/10 17h47
NOTE: the last value appears to be duplicated but that's due to the sample estimated values only differing by ~14.5 seconds
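For completeness, here's a sketch of how the adapted awk might be dropped back into the Conky object. This is untested (I only ran the awk against execi.dat); the boinccmd call, the 60-second interval and the backslash line continuations are copied from the OP's original snippet:
${execi 60 boinccmd --get_tasks | awk -F': |://|/' '\
FNR==1      {st=systime()}\
/URL/       {found++; u=$3}\
/WU/        {found++; w=$2}\
/fraction/  {found++; p=$2}\
/estimated/ {found++; e=strftime("%d/%m %Hh%M",$2+st)}\
found==4    {printf "%.10s %.18s %3.1f %s\n", u, w, p*100, e; found=0}\
'}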
Related
I need to create edges between a set of nodes, but it is not guaranteed that an edge doesn't already exist. I need to know which edges have been created so I can increment the edge counter for the two connected nodes.
I want to know the edge count for every node without querying the graph each time.
Example:
MERGE (u:user {id:999049043279872})
MERGE (g1:group {id:346709075951616})
MERGE (g2:group {id:346709075951617})
MERGE (g1)-[m1:member]->(u)
MERGE (g2)-[m2:member]->(u)
Sometimes the user is already a member of the group, so I don't want to increment the counter in this case.
I tried to use the result statistics, but they only return the number of created relationships. I also thought about using a map and then filling its content using ON CREATE SET after MERGE:
WITH {g1:0, g2:0} as res
MERGE (u:user {id:999049043279872})
MERGE (g1:group {id:346709075951616})
MERGE (g2:group {id:346709075951617})
MERGE (g1)-[m1:member]->(u)
ON CREATE SET res.g1 = 1
MERGE (g2)-[m2:member]->(u)
ON CREATE SET res.g2 = 1
RETURN res
But it does not work; the server crashes immediately after executing the query.
Exception:
------ FAST MEMORY TEST ------
17235:M 28 Feb 2022 16:56:50.016 # main thread terminated
17235:M 28 Feb 2022 16:56:50.017 # Bio thread for job type #0 terminated
17235:M 28 Feb 2022 16:56:50.017 # Bio thread for job type #1 terminated
17235:M 28 Feb 2022 16:56:50.018 # Bio thread for job type #2 terminated
Fast memory test PASSED, however your memory can still be broken.
Please run a memory test for several hours if possible.
------ DUMPING CODE AROUND EIP ------
Symbol: (null) (base: (nil))
Module: /lib/x86_64-linux-gnu/libc.so.6 (base 0x7fbfe3dcc000)
$ xxd -r -p /tmp/dump.hex /tmp/dump.bin
$ objdump --adjust-vma=(nil) -D -b binary -m i386:x86-64 /tmp/dump.bin
=== REDIS BUG REPORT END. Make sure to include from START to END. ===
Please report the crash by opening an issue on github:
http://github.com/redis/redis/issues
Suspect RAM error? Use redis-server --test-memory to verify it.
Segmentation fault
Any ideas?
Thanks in advance
Neo4j already stores a counter inside each node to track the number of relationships and provide fast count access. When you want to get the number of members in a group, you can simply do:
MATCH (g:group)
return size((g)<-[:member]-())
awk can generate a timestamp with the strftime function, e.g.
$ awk 'BEGIN {print strftime("%Y/%m/%d %H:%M:%S")}'
2019/03/26 08:50:42
But I need a timestamp with fractional seconds, ideally down to nanoseconds. GNU date can do this with the %N element:
$ date "+%Y/%m/%d %H:%M:%S.%N"
2019/03/26 08:52:32.753019800
But it is relatively inefficient to invoke date from within awk compared to calling strftime, and I need high performance as I'm processing many large files with awk and need to generate many timestamps while processing the files. Is there a way that awk can efficiently generate a timestamp that includes fractional seconds (ideally nanoseconds, but milliseconds would be acceptable)?
Adding an example of what I am trying to perform:
awk -v logFile="$logFile" -v outputFile="$outputFile" '
BEGIN {
print "[" strftime("%Y%m%d %H%M%S") "] Starting to process " FILENAME "." >> logFile
}
{
data[$1] += $2
}
END {
print "[" strftime("%Y%m%d %H%M%S") "] Processed " NR " records." >> logFile
for (id in data) {
print id ": " data[id] >> outputFile
}
}
' oneOfManyLargeFiles
If you are really in need of sub-second timing, then any call to an external command such as date, or reading an external system file such as /proc/uptime or /proc/rtc, defeats the purpose of the sub-second accuracy. Both cases require too many resources to retrieve the requested information (i.e. the time).
Since the OP already makes use of GNU awk, you could make use of a dynamic extension. Dynamic extensions are a way of adding new functionality to awk by implementing new functions written in C or C++ and dynamically loading them with gawk. How to write these functions is described extensively in the GNU awk manual.
Luckily, GNU awk 4.2.1 comes with a set of default dynamic libraries which can be loaded at will. One of these libraries is a time library with two simple functions:
the_time = gettimeofday()
Return the time in seconds that has elapsed since 1970-01-01 UTC as a floating-point value. If the time is unavailable on this platform, return -1 and set ERRNO. The returned time should have sub-second precision, but the actual precision may vary based on the platform. If the standard C gettimeofday() system call is available on this platform, then it simply returns the value. Otherwise, if on MS-Windows, it tries to use GetSystemTimeAsFileTime().
result = sleep(seconds)
Attempt to sleep for seconds seconds. If seconds is negative, or the attempt to sleep fails, return -1 and set ERRNO. Otherwise, return zero after sleeping for the indicated amount of time. Note that seconds may be a floating-point (nonintegral) value. Implementation details: depending on platform availability, this function tries to use nanosleep() or select() to implement the delay.
source: GNU awk manual
It is now possible to call this function in a rather straightforward way:
awk '@load "time"; BEGIN{printf "%.6f", gettimeofday()}'
1553637193.575861
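If you want the same date-style format the question asks for, one way (a sketch, assuming the same gawk time extension as above; gettimeofday() gives roughly microsecond resolution, not nanoseconds) is to feed the whole seconds to strftime() and append the fractional part by hand:
gawk '@load "time"
      BEGIN {
          now  = gettimeofday()                     # float seconds since the epoch
          secs = int(now)
          usec = int((now - secs) * 1000000)        # fractional part as microseconds
          printf "%s.%06d\n", strftime("%Y/%m/%d %H:%M:%S", secs), usec
      }'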
In order to demonstrate that this method is faster than the more classic implementations, I timed all 3 implementations using gettimeofday():
awk '@load "time"
function get_uptime(    a, line) {
    if ((getline line < "/proc/uptime") > 0)
        split(line, a, " ")
    close("/proc/uptime")
    return a[1]
}
function curtime(    cmd, line, time) {
    cmd = "date \047+%Y/%m/%d %H:%M:%S.%N\047"
    if ((cmd | getline line) > 0) {
        time = line
    }
    else {
        print "Error: " cmd " failed" | "cat>&2"
    }
    close(cmd)
    return time
}
BEGIN {
    t1 = gettimeofday(); curtime();      t2 = gettimeofday()
    print "curtime()", t2-t1
    t1 = gettimeofday(); get_uptime();   t2 = gettimeofday()
    print "get_uptime()", t2-t1
    t1 = gettimeofday(); gettimeofday(); t2 = gettimeofday()
    print "gettimeofday()", t2-t1
}'
which outputs:
curtime() 0.00519109
get_uptime() 7.98702e-05
gettimeofday() 9.53674e-07
While it is evident that curtime() is the slowest as it loads an external binary, it is rather startling to see that awk is blazingly fast in processing an extra external /proc/ file.
If you are on Linux, you could use /proc/uptime:
$ cat /proc/uptime
123970.49 354146.84
to get some centiseconds (the first value is the uptime) and compute the time difference between the beginning and whenever something happens:
$ while true ; do echo ping ; sleep 0.989 ; done | # yes | awk got confusing
awk '
function get_uptime( a, line) {
if((getline line < "/proc/uptime") > 0)
split(line,a," ")
close("/proc/uptime")
return a[1]
}
BEGIN {
basetime=get_uptime()
}
{
if(!wut) # define here the cause
print get_uptime()-basetime # calculate time difference
}'
Output:
0
0.99
1.98
2.97
3.97
I am using IBM LSF and trying to get usage statistics during a certain period. I found that bhist does the job, but the short form bhist output does not show all of the fields I need.
What I want to know is:
Are bhist's output fields customizable? The fields I need are:
<jobid>
<user>
<queue>
<job_name>
<project_name>
<job_description>
<submission_time>
<pending_time>
<run_time>
If 1 is not possible, the long form (bhist -l) output shows everything I need, but the format is hard to manipulate. I've pasted an example of the format below.
For example, the number of lines between records is not fixed, and the word wrap in each event may break a line in the middle of a word I'm trying to scan for. How do I parse this format with sed and awk?
JobId <1531>, User <user1>, Project <default>, Command <example200>
Fri Dec 27 13:04:14: Submitted from host <hostA> to Queue <priority>, CWD <$H
OME>, Specified Hosts <hostD>;
Fri Dec 27 13:04:19: Dispatched to <hostD>;
Fri Dec 27 13:04:19: Starting (Pid 8920);
Fri Dec 27 13:04:20: Running with execution home </home/user1>, Execution CWD
</home/user1>, Execution Pid <8920>;
Fri Dec 27 13:05:49: Suspended by the user or administrator;
Fri Dec 27 13:05:56: Suspended: Waiting for re-scheduling after being resumed
by user;
Fri Dec 27 13:05:57: Running;
Fri Dec 27 13:07:52: Done successfully. The CPU time used is 28.3 seconds.
Summary of time in seconds spent in various states by Sat Dec 27 13:07:52 1997
PEND PSUSP RUN USUSP SSUSP UNKWN TOTAL
5 0 205 7 1 0 218
------------------------------------------------------------
.... repeat
I'm adding a second answer because it might help you with your problem without actually having to write your own solution (depending on the usage statistics you're after).
LSF already has a utility called bacct that computes and prints out various usage statistics about historical LSF jobs filtered by various criteria.
For example, to get summary usage statistics about jobs that were dispatched/completed/submitted between time0 and time1, you can use (respectively):
bacct -D time0,time1
bacct -C time0,time1
bacct -S time0,time1
Statistics about jobs submitted by a particular user:
bacct -u <username>
Statistics about jobs submitted to a particular queue:
bacct -q <queuename>
These options can be combined as well, so for example if you wanted statistics about jobs that were submitted and completed within a particular time window for a particular project, you can use:
bacct -S time0,time1 -C time0,time1 -P <projectname>
The output provides some summary information about all jobs that match the provided criteria like so:
$ bacct -u bobbafett -q normal
Accounting information about jobs that are:
- submitted by users bobbafett,
- accounted on all projects.
- completed normally or exited
- executed on all hosts.
- submitted to queues normal,
- accounted on all service classes.
------------------------------------------------------------------------------
SUMMARY: ( time unit: second )
Total number of done jobs: 0 Total number of exited jobs: 32
Total CPU time consumed: 46.8 Average CPU time consumed: 1.5
Maximum CPU time of a job: 9.0 Minimum CPU time of a job: 0.0
Total wait time in queues: 18680.0
Average wait time in queue: 583.8
Maximum wait time in queue: 5507.0 Minimum wait time in queue: 0.0
Average turnaround time: 11568 (seconds/job)
Maximum turnaround time: 43294 Minimum turnaround time: 40
Average hog factor of a job: 0.00 ( cpu time / turnaround time )
Maximum hog factor of a job: 0.02 Minimum hog factor of a job: 0.00
Total Run time consumed: 351504 Average Run time consumed: 10984
Maximum Run time of a job: 1844674 Minimum Run time of a job: 0
Total throughput: 0.24 (jobs/hour) during 160.32 hours
Beginning time: Nov 11 17:55 Ending time: Nov 18 10:14
This command also has a long form output that provides some bhist -l-like information about each job that might be a bit easier to parse (although still not all that easy):
$ bacct -l -u bobbafett -q normal
Accounting information about jobs that are:
- submitted by users bobbafett,
- accounted on all projects.
- completed normally or exited
- executed on all hosts.
- submitted to queues normal,
- accounted on all service classes.
------------------------------------------------------------------------------
Job <101>, User <bobbafett>, Project <default>, Status <EXIT>, Queue <normal>,
Command <sleep 100000000>
Wed Nov 11 17:37:45: Submitted from host <endor>, CWD <$HOME>;
Wed Nov 11 17:55:05: Completed <exit>; TERM_OWNER: job killed by owner.
Accounting information about this job:
CPU_T WAIT TURNAROUND STATUS HOG_FACTOR MEM SWAP
0.00 1040 1040 exit 0.0000 0M 0M
------------------------------------------------------------------------------
...
Long form output is pretty hard to parse. I know bjobs has an option for unformatted output (-UF) in older LSF versions which makes it a bit easier, and the most recent version of LSF allows you to customize which columns get printed in short form output with -o.
Unfortunately, neither of these options is available with bhist. The only real possibilities for historical information are:
Figure out some way to parse bhist -l -- impractical and maybe not even possible due to inconsistent formatting as you've discovered.
Write a C program to do what you want using the LSF API, which exposes the functions that bhist itself uses to parse the lsb.events file. This is the file that stores all the historical information about the LSF cluster, and is what bhist reads to generate its output.
If C is not an option for you, then you could try writing a script to parse the lsb.events file directly -- the format is documented in the configuration reference. This is hard, but not impossible. Here is the relevant document for LSF 9.1.3.
My personal recommendation would be #2 -- the function you're looking for is lsb_geteventrec(). You'd basically read each line in lsb.events one at a time and pull out the information you need.
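That said, if you do decide to attack the bhist -l output with awk anyway, here is a rough starting point, based only on the sample in the question (so the JobId/User field names, the dashed record separator and the indented line-wrapping are all assumptions): unwrap the continuation lines first, then pull fields out of each record.
bhist -l | awk '
    /^-----/        { emit_rec(); next }                               # dashed line ends a record
    /^[[:space:]]/  { sub(/^[[:space:]]+/, ""); rec = rec $0; next }   # re-join a wrapped line
                    { rec = rec "\n" $0 }                              # accumulate the record
    END             { emit_rec() }

    function emit_rec(    jobid, user) {
        if (rec == "") return
        if (match(rec, /JobId <[^>]+>/)) jobid = substr(rec, RSTART+7, RLENGTH-8)
        if (match(rec, /User <[^>]+>/))  user  = substr(rec, RSTART+6, RLENGTH-7)
        if (jobid != "") print jobid, user          # extend with the other fields as needed
        rec = ""
    }'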
Here is a simple Perl script that fetches data from SQL.
It reads the data, writes it to a file OUTFILE, and prints the data to the screen for every 10,000th line.
One thing I am curious about is that printing the data to the screen finishes very quickly (in 30 seconds), but fetching the data and writing it to the file finishes very slowly (30 minutes later).
The amount of data is not large. The output file's size is less than 100 MB.
while ( my ($a,$b) = $curSqlEid->fetchrow_array() )
{
printf OUTFILE ("%s,%d\n", $a,$b);
$counter ++;
if($counter % 10000 == 0){
printf ("%s,%d\n", $a,$b);
}
}
$curSqlEid->finish();
$dbh->disconnect();
close(OUTFILE);
You are suffering from buffering.
Handles other than STDERR are buffered by default, and most handles use block buffering. That means Perl will wait until there is 8KB* of data to write before sending anything to the system.
STDOUT is special. When it is attached to a terminal (and only then), it uses a different kind of buffering: line buffering. With line buffering, the data is flushed every time a newline is encountered in the data to write.
You can see this by running
$ perl -e'print "abc"; print "def"; sleep 5; print "\n"; sleep 5;'
[ 5 seconds pass ]
abcdef
[ 5 seconds pass ]
$ perl -e'print "abc"; print "def"; sleep 5; print "\n"; sleep 5;' | cat
[ 10 seconds pass ]
abcdef
The solution is to turn off buffering.
use IO::Handle qw( ); # Not needed on Perl 5.14 or later
OUTFILE->autoflush(1);
* — 8KB is the default. It can be configured when Perl is compiled. It used to be a non-configurable 4KB until 5.14.
I think you are seeing the output file size as 0 while the script is running and displaying on the console. Do not go by that. The file size will show up only once the script has finished. This is due to output buffering.
Anyways, the delay cannot be as large as 30 min. Once the script is done, you should see the output file data.
I tried various things, but my final conclusion is that Python and Perl handle data flow from the DB in basically different ways. It looks like in Perl it is possible to handle data row by row while the data is being transferred from the DB. However, in Python it needs to wait until the entire dataset has been downloaded from the server before processing it.
I've written a munin plugin that uses slurm's sacct to monitor job states on a HPC cluster. I've written it in sh + awk (rather than my usual tool of choice, perl).
The script works, but it took me ages to figure out how to pre-populate the associative array of possible states (some/most may not be present in the sacct output, and I want them to default to zero). Google wasn't much help, and the best I could come up with was to use split on a string to produce a temporary array, which I then iterated over.
I came up with this:
BEGIN {
num = split("cancelled completed completing failed nodefail pending running suspended timeout",statenames," ");
for (i=1;i<=num;i++) {
states[statenames[i]] = 0
}
}
This works, but seems clumsy compared to how I'd do it in Perl, like this:
foreach (qw(cancelled completed completing failed nodefail pending running suspended timeout)) {
$states{$_} = 0;
}
or this
%states = map {$_ => 0} qw(cancelled completed completing failed nodefail pending running suspended timeout);
My question is: is there a way of doing this in awk that is similar to either of the Perl versions?
[ edited ]
To clarify, here's a sample of the sacct output I'm piping into awk. Note that the only states in this output are RUNNING, COMPLETED, and CANCELLED - the others don't exist (because they haven't occurred today), but I want them in my script's output anyway (in a form usable by munin as "statename.value 0").
# sacct -X -P -o 'state' -n
RUNNING
RUNNING
RUNNING
RUNNING
COMPLETED
RUNNING
COMPLETED
RUNNING
COMPLETED
COMPLETED
CANCELLED by 1000
COMPLETED
[ edited again ]
and here's sample output from my munin plugin:
# ./slurm-sacct
suspended.value 0
pending.value 0
nodefail.value 0
failed.value 0
running.value 6
completing.value 0
completed.value 5
timeout.value 0
cancelled.value 1
The script runs and does what I want, I just wanted to know if there was a better way to initialise the associative array.
You probably don't need to do it at all. Variables in awk are dynamic, which means they're automatically initialized when they are first used (either assigned to or accessed), and this applies to array elements as well.
A variable will be initialized to 0 if it's accessed in a numeric context, or to the empty string otherwise. (At least gawk does this, though I'm not sure if it's implementation-dependent) So if you're doing something like counting the number of jobs that are in each state, the entire program is as simple as something like
{ states[$1]++ }
END {
for (state in states) print state, states[state]
}
Each time the expression states[$1]++ is executed, it will check for the existence of states[$1] and initialize it to 0 if it doesn't already exist.
EDIT: From your comment I'm guessing you want to print out a line for each possible state, regardless of whether there are any jobs in that state or not. In that case, you need to include all the possible state names, and there is no shortcut notation for doing so as there is in Perl. As far as I know, what you've already found is about as clean as it gets. (Awk is not really designed with that usage in mind)
I'd suggest the following:
{ states[$1]++ }
END {
split("cancelled completed completing failed nodefail pending running suspended timeout",statenames," ");
for (i in statenames) print statenames[i], states[statenames[i]]+0
}
Perhaps Craig can use, instead of:
print "Timeout states ", states["timeout"], "."
this:
print "Timeout states ", int(states["timeout"]), "."
In my case if there is no timeout state in awk input, the first print will give:
Timeout states .
While the second will give:
Timeout states 0.
I think a more natural approach in awk would be to have a separate file of keys. Consider a file keys.txt with one key per line. You could then do something like this:
printf "key1\nkey2\nkey2\nkey5" |
awk '
FILENAME == "keys.txt" {
counts[$0] = 0
next
}
{
counts[$0]++
}
END {
for (key in counts) {
print key, counts[key]
}
}' keys.txt -
With five keys in keys.txt, this produces:
key1 1
key2 2
key3 0
key4 0
key5 1
Although the keys are shown in order here, that's just incidental and shouldn't be relied upon.
For the specific example, you could also skip the associative array altogether. Instead, you could minimally process the lines with awk and use sort | uniq -c to tabulate the counts. The presence of all keys could be ensured using join against a file of keys.
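A rough sketch of that pipeline (the keys.txt file name, the lower-casing and the final munin-style formatting are illustrative assumptions, and bash is assumed for the process substitution):
sacct -X -P -o 'state' -n |
    awk '{ print tolower($1) }' |                  # keep only the state word, lower-cased
    sort | uniq -c |                               # tabulate: "count state"
    awk '{ print $2, $1 }' | sort -k1,1 |          # reorder to "state count", sorted for join
    join -a2 -e 0 -o 0,1.2 - <(sort keys.txt) |    # add the missing keys with a count of 0
    awk '{ print $1 ".value", $2 }'                # munin format: "statename.value N"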
awk is somewhat clumsier (I would say "less terse") than Perl.
You could write this (similar to #Michael's answer):
pipeline of data |
awk '
NR == FNR {statenames[$1]=0; next}
{ usual processing }
END { usual output }
' <(printf "%s\n" cancelled completed completing failed nodefail pending running suspended timeout) -
One tweak to #DavidZaslavsky's answer might be to print the states in the order you specified them on the split() line. That would be:
{ states[tolower($1)]++ }
END {
n = split("cancelled completed completing failed nodefail pending running suspended timeout",statenames)
for (i=1; i<=n; i++) {
state = statenames[i]
print state, states[state]+0
}
}
I also converted the input to lower case so it matches your hard-coded values, got rid of the unnecessary 3rd arg to split() and the subsequent null statement (trailing semi-colon).
In case you want to account for finding state names in your input that weren't in your hard-coded set, you could tweak it to:
{ states[tolower($1)]++ }
END {
n = split("cancelled completed completing failed nodefail pending running suspended timeout",statenames)
for (i=1; i<=n; i++) {
state = statenames[i]
print state, states[state]+0
delete states[state]
}
for (state in states) {
print "WARNING: found new state name %s\n",state | "cat>&2"
print state, states[state]+0
}
}