Printf formatting a variable without forking?

For my powerlevel10k custom prompt, I currently have this function to display the seconds since the epoch, comma separated. I display it under the current time so I always have a cue to remember roughly what the current epoch time is.
function prompt_epoch() {
MYEPOCH=$(/bin/date +%s | sed ':a;s/\B[0-9]\{3\}\>/,&/;ta')
p10k segment -f 66 -t ${MYEPOCH}
}
My prompt looks like this: https://imgur.com/0IT5zXi
I've been told I can do this without the forked processes using these commands:
$ zmodload -F zsh/datetime p:EPOCHSECONDS
$ printf "%'d" $EPOCHSECONDS
1,648,943,504
But I'm not sure how to do that without the forking. I know to add the zmodload line in my ~/.zshrc before my powerlevel10k is sourced, but formatting ${EPOCHSECONDS} isn't something I know how to do without a fork.
If I were doing it the way I know, this is what I'd do:
function prompt_epoch() {
MYEPOCH=$(printf "%'d" ${EPOCHSECONDS})
p10k segment -f 66 -t ${MYEPOCH}
}
But as far as I understand it, that still forks a process every time the prompt is drawn, correct? Am I misunderstanding the advice? I don't see a way to get the latest epoch seconds without running some sort of process, which requires a fork.

The printf zsh builtin can assign the value to a variable using the -v flag. Therefore my function can be rewritten as:
function prompt_epoch() {
printf -v MYEPOCH "%'d" ${EPOCHSECONDS}
p10k segment -f 66 -t ${MYEPOCH}
}
Thanks to this answer on Unix & Linux Stack Exchange: https://unix.stackexchange.com/a/697807/101884
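One caveat worth keeping in mind: the `'` flag in `%'d` takes its separator from the current locale, so in a locale without a thousands separator (such as `C`) it may print no commas at all. As a hedged fallback, here is a minimal sketch that groups the digits using parameter expansion only, so it also runs without a fork. The function name `group_thousands` and the use of `REPLY` for the result are assumptions made for illustration; the input is assumed to be a non-negative integer.

```shell
# Hypothetical helper: insert commas every three digits without forking.
# The result is returned in the variable REPLY, not printed.
group_thousands() {
  local n="$1" out="" rest
  while [ "${#n}" -gt 3 ]; do
    rest=${n%???}                # everything except the last three digits
    out=",${n#"$rest"}$out"      # prepend the last three digits to the output
    n=$rest
  done
  REPLY=$n$out
}

group_thousands 1648943504
printf '%s\n' "$REPLY"
```

Inside prompt_epoch this could be used as `group_thousands $EPOCHSECONDS` followed by `p10k segment -f 66 -t $REPLY`.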


issue with a modification of youtube-dl in .zshrc

the code I have in my .zshrc is:
ytdcd () { #youtube-dl that automatically puts stuff in a specific folder and returns to the former working directory after.
cd ~/youtube/new/ && {
youtube-dl "$@"
cd - > /dev/null
}
}
ytd() { #so far, this function can only take one page. so, i can only send one youtube video code per line. will modify it to accept multiple lines..
for i in $*;
do
params=" $params https://youtu.be/$i"
done
ytdcd -f 18 $params
}
So, on the command line (terminal), when I enter ytd DFreHo3UCD0, I would like the video at https://youtu.be/DFreHo3UCD0 to be downloaded. The problem is that when I enter the command several times in succession, the system just tries to download the video from the previous command and rightly claims the download is complete.
For example, entering:
> ytd DFreHo3UCD0
> ytd L3my9luehfU
would not attempt to download the video for L3my9luehfU but only the video for DFreHo3UCD0 twice.
First -- there's no point in returning to the old directory for ytdcd: you can change to the new directory only inside a subshell, and then exec youtube-dl to replace that subshell with the application process:
ytdcd () {
(cd ~/youtube/new/ && exec youtube-dl "$@")
}
This has fewer things to go wrong: aborting the function's execution can't leave things in the wrong directory, because the parent shell (the one you're interactively using) never changed directories in the first place.
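To see that the subshell really does leave the caller's directory alone, here is a small sketch of the same pattern with a generic wrapper. The name `in_dir` is hypothetical, and `pwd` merely stands in for youtube-dl; because of exec, the command must be an external program rather than a shell function.

```shell
# Hypothetical generic form of ytdcd: run a command inside a directory
# without changing the caller's working directory.
in_dir() {
  local dir="$1"
  shift
  # The parentheses start a subshell; cd and exec affect only that subshell.
  (cd "$dir" && exec "$@")
}

before=$PWD
inside=$(in_dir / pwd)
printf 'command ran in: %s\n' "$inside"
printf 'caller is still in: %s\n' "$PWD"
```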
Second -- use an array when building argument lists, not a string.
If you use set -x to log its execution, you'll see that your original command runs something like:
ytdcd -f 18 'https://youtu.be/one https://youtu.be/two https://youtu.be/three'
See those quotes? That's because $params is a string, passed as a single argument, not an array. (In bash -- or another shell following POSIX rules -- an unquoted string expansion would be string-split and glob-expanded, but zsh doesn't follow POSIX rules).
The following builds up an array of separate arguments and passes them individually:
ytd() {
local -a params=( )
local i
for i; do
params+=( "https://youtu.be/$i" )
done
ytdcd -f 18 "${params[@]}"
}
Finally, it's come up that you don't actually intend to pass all the URLs to just one youtube-dl instance. To run a separate instance per URL, use:
ytd() {
local i retval=0
for i; do
ytdcd -f 18 "$i" || retval=$?
done
return "$retval"
}
Note here that we're capturing non-success exit status, so as not to hide an error in any ytdcd instance other than the last (which would otherwise occur).
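The exit-status handling can be tried out without downloading anything. In this sketch, `fake_download` is a hypothetical stand-in for ytdcd that fails only for the ID "bad", and `run_each` mirrors the loop above.

```shell
# Hypothetical stand-in for ytdcd: succeeds unless the ID is "bad".
fake_download() {
  [ "$1" != "bad" ]
}

# Same structure as the per-URL ytd loop: keep going after a failure,
# but remember the most recent non-zero exit status.
run_each() {
  local i retval=0
  for i; do
    fake_download "$i" || retval=$?
  done
  return "$retval"
}

if run_each first bad last; then
  echo "all succeeded"
else
  echo "at least one failed"
fi
```

Without the retval bookkeeping, the function would return the status of the last call only, so the failure of "bad" in the middle would go unnoticed.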
I would declare params as local, so that you are not appending new URLs to the ones left over from previous calls.
You can try to add this awesome function to your .zshrc:
funfun() {
local _fun1="$_fun1 fun1!"
_fun2="$_fun2 fun2!"
echo "1 says: $_fun1"
echo "2 says: $_fun2"
}
To observe the thing ;)
EDIT (Explanation):
When you source a shell script, its definitions are added to your current environment, which is why you can run the functions it defines. Variables used by those functions are global by default and accessible from anywhere in your session. In this case, params is therefore defined globally for the whole shell session. Since you want to allow downloading several videos at once, you keep appending values to this global variable, which grows with every call.
Enforcing local tells zsh to limit the scope of params to the function only.
Another solution is to reset the variable when you call the function.

How can I view all comments posted by users in a Bitbucket repository

On the repository home page, I can see comments posted under recent activity at the bottom, but it only shows 10 comments.
I want to see all the comments posted since the beginning.
Is there any way?
Comments of pull requests, issues and commits can be retrieved using bitbucket’s REST API.
However it seems that there is no way to list all of them at one place, so the only way to get them would be to query the API for each PR, issue or commit of the repository.
Note that this takes a long time, since bitbucket has seemingly set a limit to the number of accesses via API to repository data: I got Rate limit for this resource has been exceeded errors after retrieving around a thousand results, then I could retrieve about only one entry per second elapsed from the time of the last rate limit error.
Finding the API URL to the repository
The first step is to find the URL to the repo. For private repositories, it is necessary to get authenticated by providing username and password (using curl’s -u switch). The URL is of the form:
https://api.bitbucket.org/2.0/repositories/{repoOwnerName}/{repoName}
Running git remote -v from the local git repository should provide the missing values. Check the constructed URL (below referred to as $url) by verifying that repository information is correctly retrieved as JSON data from it: curl -u username $url.
Fetching comments of commits
Comments of a commit can be accessed at $url/commit/{commitHash}/comments.
The resulting JSON data can be processed by a script. Beware that the results are paginated.
Below I simply extract the number of comments per commit. It is indicated by the value of the member size of the retrieved JSON object; I also request a partial response by adding the GET parameter fields=size.
My script getNComments.sh:
#!/bin/sh
pw=$1
id=$2
json=$(curl -s -u username:"$pw" \
https://api.bitbucket.org/2.0/repositories/{repoOwnerName}/{repoName}/commit/$id/comments'?fields=size')
printf '%s' "$json" | grep -q '"type": "error"' \
&& printf "ERROR $id\n" && exit 0
nComments=$(printf '%s' "$json" | grep -o '"size": [0-9]*' | cut -d' ' -f2)
: ${nComments:=EMPTY}
checkNumeric=$(printf '%s' "$nComments" | tr -dc 0-9)
[ "$nComments" != "$checkNumeric" ] \
&& printf >&2 "!ERROR! $id:\n%s\n" "$json" && exit 1
printf "$nComments $id\n"
To use it, taking into account the possibility for the error mentioned above:
A) Prepare input data. From the local repository, generate the list of commits as wanted (run git fetch -a prior to update the local git repo if needed); check out git help rev-list for how it can be customised.
git rev-list --all | sort > sorted-all.id
cp sorted-all.id remaining.id
B) Run the script. Note that the password is passed here as a parameter – so first assign it to a variable safely using stty -echo; IFS= read -r passwd; stty echo, in one line; also see security considerations below. The processing is parallelised onto 15 processes here, using the option -P.
< remaining.id xargs -P 15 -L 1 ./getNComments.sh "$passwd" > commits.temp
C) When the rate limit is reached, that is, when getNComments.sh prints !ERROR!, kill the above command (Ctrl-C) and execute the commands below to update the input and output files. Wait a while for the request limit to recover, then re-run the command from step B and repeat until all the data is processed (that is, when wc -l remaining.id returns 0).
cat commits.temp >> commits.result
cut -d' ' -f2 commits.result | sort | comm -13 - sorted-all.id > remaining.id
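The bookkeeping in step C can be checked on toy data. In the sketch below, the four commit IDs and the comment counts are made up, and the files live in a temporary directory so nothing real is overwritten; the pipeline itself is the one from step C.

```shell
tmp=$(mktemp -d)

# Pretend the repository has four commits (already sorted).
printf '%s\n' aaa bbb ccc ddd > "$tmp/sorted-all.id"

# Pretend two of them were processed before the rate limit hit
# (format: "<number of comments> <commit id>").
printf '%s\n' '0 bbb' '2 ddd' > "$tmp/commits.result"

# comm -13 keeps only lines unique to the second input, that is,
# the IDs that are in sorted-all.id but not yet in commits.result.
cut -d' ' -f2 "$tmp/commits.result" | sort \
  | comm -13 - "$tmp/sorted-all.id" > "$tmp/remaining.id"

cat "$tmp/remaining.id"
```

Both inputs to comm must be sorted the same way, which is why sorted-all.id is produced with sort in step A.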
D) Finally, you can get the commits which received comments with:
grep '^[1-9]' commits.result
Fetching comments of pull requests and issues
The procedure is the same as for fetching commits’ comments, but for the following two adjustments:
Edit the script to replace in the URL commit by pullrequests or by issues, as appropriate;
Let $n be the number of issues/PRs to search. The git rev-list command above becomes: seq 1 $n > sorted-all.id
The total number of PRs in the repository can be obtained with:
curl -su username $url/pullrequests'?state=&fields=size'
and, if the issue tracker is set up, the number of issues with:
curl -su username $url/issues'?fields=size'
Hopefully, the repository has few enough PRs and issues so that all data can be fetched in one go.
Viewing comments
They can be viewed normally via the web interface on their commit/PR/issue page at:
https://bitbucket.org/{repoOwnerName}/{repoName}/commits/{commitHash}
https://bitbucket.org/{repoOwnerName}/{repoName}/pull-requests/{prId}
https://bitbucket.org/{repoOwnerName}/{repoName}/issues/{issueId}
For example, to open all PRs with comments in firefox:
awk '/^[1-9]/{print "https://bitbucket.org/{repoOwnerName}/{repoName}/pull-requests/"$2}' PRs.result | xargs firefox
Security considerations
Arguments passed on the command line are visible to all users of the system, via ps ax (or /proc/$PID/cmdline). Therefore the bitbucket password will be exposed, which could be a concern if the system is shared by multiple users.
There are three commands getting the password from the command line: xargs, the script, and curl.
It appears that curl tries to hide the password by overwriting its memory, but it is not guaranteed to work, and even if it does, it leaves it visible for a (very short) time after the process starts. On my system, the parameters to curl are not hidden.
A better option could be to pass the sensitive information through environment variables. They should be visible only to the current user and root via ps axe (or /proc/$PID/environ); although it seems that there are systems that let all users access this information (do a ls -l /proc/*/environ to check the environment files’ permissions).
In the script simply replace the lines pw=$1 id=$2 with id=$1, then pass pw="$passwd" before xargs in the command line invocation. It will make the environment variable pw visible to xargs and all of its descendent processes, that is the script and its children (curl, grep, cut, etc), which may or may not read the variable. curl does not read the password from the environment, but if its password hiding trick mentioned above works then it might be good enough.
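As a harmless illustration of the environment-variable approach, the sketch below replaces the real script with a one-line `sh -c` child that only reports whether it can see pw. The value 's3cret' and the IDs are placeholders; the point is that the variable reaches the child through the environment and never appears among its command-line arguments.

```shell
# pw=... before xargs places the variable in the environment of xargs
# and of every process it starts; the child's arguments contain only the ID.
out=$(printf '%s\n' id1 id2 \
  | pw='s3cret' xargs -L 1 sh -c 'printf "%s:%s\n" "$1" "${pw:+has-pw}"' sh)

printf '%s\n' "$out"
```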
There are ways to avoid passing the password to curl via the command line, notably via standard input using the option -K -. In the script, replace curl -s -u username:"$pw" with printf -- '-s\n-u "%s"\n' "$authinfo" | curl -K - and define the variable authinfo to contain the data in the format username:password. Note that this method needs printf to be a shell built-in to be safe (check with type printf), otherwise the password will show up in its process arguments. If it is not a built-in, try with print or echo instead.
A simple alternative to an environment variable that will not appear in ps output in any case is via a file. Create a file with read/write permissions restricted to the current user (chmod 600), and edit it so that it contains username:password as its first line. In the script, replace pw=$1 with IFS= read -r authinfo < "$1", and edit it to use curl’s -K option as in the paragraph above. In the command line invocation replace $passwd with the filename.
The file approach has the drawback that the password will be written to disk (note that files in /proc are not on the disk). If this too is undesirable, it is possible to pass a named pipe instead of a regular file:
mkfifo pipe
chmod 600 pipe
# make sure printf is a builtin, or use an equivalent instead
(while :; do printf -- '%s\n' "username:$passwd"; done) > pipe&
pid=$!
exec 3<pipe
Then invoke the script passing pipe instead of the file. Finally, to clean up do:
kill $pid
exec 3<&-
This will ensure the authentication info is passed directly from the shell to the script (through the kernel), is not written to disk and is not exposed to other users via ps.
You can go to Commits and see the top line for each commit, you will need to click on each one to see further information.
If I find a way to see all without drilling into each commit, I will update this answer.

Expect script does not work under crontab

I have an expect script which I need to run every 3 minutes on my management node to collect tx/rx values for each port attached to a DCX Brocade SAN switch using the command portperfshow.
Each time I try to use crontab to execute the script every 3 mins, the script does not work!
My expect script starts with #!/usr/bin/expect -f and I am calling the script using the following syntax under cron:
3 * * * * /usr/bin/expect -f /root/portsperfDCX1/collect-all.exp sanswitchhostname
However, when I execute the script (not under cron) it works as expected:
root# ./collect-all.exp sanswitchhostname
works just fine.
Please Please can someone help! Thanks.
The script collect-all.exp is:
#!/usr/bin/expect -f
#Time and Date
set day [timestamp -format %d%m%y]
set time [timestamp -format %H%M]
#logging
set LogDir1 "/FPerf/PortsLogs"
set timeout 5
set ipaddr [lrange $argv 0 0]
set passw "XXXXXXX"
if { $ipaddr == "" } {
puts "Usage: <script.exp> <ip address>\n"
exit 1
}
spawn ssh admin@$ipaddr
expect -re "password"
send "$passw\r"
expect -re "admin"
log_file "$LogDir1/$day-portsperfshow-$time"
send "portperfshow -tx -rx -t 10\r"
expect timeout "\n"
send \003
log_file
send -- "exit\r"
close
I had the same issue, except that my script was ending with
interact
Finally I got it working by replacing it with these two lines:
expect eof
exit
Changing interact to expect eof worked for me!
Needed to remove the exit part, because I had more statements in the bash script after the expect line (calling expect inside a bash script).
There are two key differences between a program that is run normally from a shell and a program that is run from cron:
Cron does not populate (many) environment variables. Notably absent are TERM, SHELL and HOME, but that's just a small proportion of the long list that will be not defined.
Cron does not set up a current terminal, so /dev/tty doesn't resolve to anything. (Note, programs spawned by Expect will have a current terminal.)
With high probability, any difficulties will come from these, especially the first. To fix, you need to save all your environment variables in an interactive session and use these in your expect script to repopulate the environment. The easiest way is to use this little expect script:
unset -nocomplain ::env(SSH_AUTH_SOCK) ;# This one is session-bound anyway
puts [list array set ::env [array get ::env]]
That will write out a single very long line which you want to put near the top of your script (or at least before the first spawn). Then see if that works.
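When debugging this kind of problem, it can help to reproduce a stripped-down environment from an interactive shell instead of waiting for cron to fire. The following is only a rough approximation: the helper name `run_like_cron` and the PATH value are assumptions, and the exact set of variables cron provides differs between implementations.

```shell
# Hypothetical helper: run a command string with an almost empty
# environment and no terminal on stdin, roughly as a cron job would see it.
run_like_cron() {
  env -i PATH=/usr/bin:/bin /bin/sh -c "$1" </dev/null
}

# A variable exported in the interactive session...
DEMO_SETTING=interactive
export DEMO_SETTING

# ...is not visible inside the stripped environment.
run_like_cron 'echo "DEMO_SETTING is ${DEMO_SETTING:-unset}"'
```

Running the expect script through such a wrapper (for example `run_like_cron '/usr/bin/expect -f /root/portsperfDCX1/collect-all.exp sanswitchhostname'`) may reveal the failure directly on the terminal.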
Jobs run by cron are not considered login shells, and thus don't source your .bashrc, .bash_profile, etc.
If you want that behavior, you need to add it explicitly to the crontab entry like so:
$ crontab -l
0 13 * * * bash -c '. .bash_profile; etc ...'
$

alternative to tail -F

I am monitoring a log file by running "tail -n 0 -F filename", but this is taking up a lot of CPU because many messages are being written to the log file. Is there a way to open the file, read the new entries, close it, and repeat this at 5-second intervals, so that I don't need to keep following the file? How can I remember the last line read so that the next run starts from the following one? I am trying to do this in nawk by spawning the tail shell command.
You won't be able to magically use less resources to tail a file by writing your own implementation. If tail -f is using resources because the file is growing fast, a custom version won't help any if you still want to view all lines as they are being written. You are simply limited by your hardware I/O and/or CPU.
Try using --sleep-interval=S where "S" is a number of seconds (the default is 1.0 - you can specify decimals).
tail -n 0 --sleep-interval=.5 -F filename
If you have so many log entries that tail is bogging down the CPU, how are you able to monitor them?
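For completeness, the polling scheme asked about in the question can be sketched as follows. It remembers a byte offset rather than a line number, and, as noted above, it does not reduce the total amount of data read. The function name `read_new` and the state file are hypothetical, and `head -c` is assumed to be available (it is in GNU and BSD userlands but is not required by POSIX).

```shell
# Hypothetical helper: print whatever was appended to a file since the
# previous call. $1 = log file, $2 = file storing the byte offset consumed.
read_new() {
  local file="$1" state="$2" old=0 size
  if [ -f "$state" ]; then
    old=$(cat "$state")
  fi
  size=$(($(wc -c < "$file")))
  if [ "$size" -lt "$old" ]; then
    old=0                        # file was truncated or rotated: start over
  fi
  # Skip the first $old bytes and emit exactly the bytes up to $size,
  # so data appended during this call is left for the next one.
  tail -c "+$((old + 1))" "$file" | head -c "$((size - old))"
  printf '%s\n' "$size" > "$state"
}
```

A monitoring loop would then be along the lines of `while sleep 5; do read_new filename filename.pos; done`.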

Processing apache logs quickly

I'm currently running an awk script to process a large (8.1GB) access-log file, and it's taking forever to finish. In 20 minutes, it wrote 14MB of the (1000 +- 500)MB I expect it to write, and I wonder if I can process it much faster somehow.
Here is the awk script:
#!/bin/bash
awk '{t=$4" "$5; gsub("[\[\]\/]"," ",t); sub(":"," ",t);printf("%s,",$1);system("date -d \""t"\" +%s");}' $1
EDIT:
For non-awkers, the script reads each line, gets the date information, modifies it to a format the utility date recognizes and calls it to represent the date as the number of seconds since 1970, finally returning it as a line of a .csv file, along with the IP.
Example input: 189.5.56.113 - - [22/Jan/2010:05:54:55 +0100] "GET (...)"
Returned output: 189.5.56.113,124237889
@OP, your script is slow mainly because it calls the external date command for every line in the file, and it's a big file as well (in the GB range). If you have gawk, use its built-in mktime() function to do the date to epoch seconds conversion:
awk 'BEGIN{
m=split("Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec",d,"|")
for(o=1;o<=m;o++){
date[d[o]]=sprintf("%02d",o)
}
}
{
gsub(/\[/,"",$4); gsub(":","/",$4); gsub(/\]/,"",$5)
n=split($4, DATE,"/")
day=DATE[1]
mth=DATE[2]
year=DATE[3]
hr=DATE[4]
min=DATE[5]
sec=DATE[6]
MKTIME= mktime(year" "date[mth]" "day" "hr" "min" "sec)
print $1,MKTIME
}' file
output
$ more file
189.5.56.113 - - [22/Jan/2010:05:54:55 +0100] "GET (...)"
$ ./shell.sh
189.5.56.113 1264110895
If you really really need it to be faster, you can do what I did. I rewrote an Apache log file analyzer using Ragel. Ragel allows you to mix regular expressions with C code. The regular expressions get transformed into very efficient C code and then compiled. Unfortunately, this requires that you are very comfortable writing code in C. I no longer have this analyzer. It processed 1 GB of Apache access logs in 1 or 2 seconds.
You may have limited success removing unnecessary printfs from your awk statement and replacing them with something simpler.
If you are using gawk, you can massage your date and time into a format that mktime (a gawk function) understands. It will give you the same timestamp you're using now and save you the overhead of repeated system() calls.
This little Python script handles a ~400MB worth of copies of your example line in about 3 minutes on my machine producing ~200MB of output (keep in mind your sample line was quite short, so that's a handicap):
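import time

src = open('x.log', 'r')
dest = open('x.csv', 'w')
for line in src:
    # The IP address is everything up to the first space.
    ip = line[:line.index(' ')]
    # Text between '[' and ']', minus the trailing " +0100" offset.
    date = line[line.index('[') + 1:line.index(']') - 6]
    # Parsed as local time; the log's own UTC offset is ignored.
    t = time.mktime(time.strptime(date, '%d/%b/%Y:%X'))
    dest.write(ip)
    dest.write(',')
    dest.write(str(int(t)))
    dest.write('\n')
src.close()
dest.close()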
A minor problem is that it doesn't handle timezones (strptime() problem), but you could either hardcode that or add a little extra to take care of it.
But to be honest, something as simple as that should be just as easy to rewrite in C.
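On the time zone point: mktime() interprets the fields in the machine's local time zone and ignores the offset recorded in the log line. A hedged alternative, sketched below, swaps mktime() for plain calendar arithmetic (a days-since-1970 formula) and then subtracts the logged offset, which should work in any POSIX awk and give the same result on every machine. The function names `to_epoch_csv` and `days` are made up for this example, and the input is assumed to follow the format of the sample line.

```shell
# Hypothetical helper: read access-log lines, print "ip,epoch" using the
# UTC offset found in the log instead of the local time zone.
to_epoch_csv() {
  awk '
  BEGIN {
    n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
    for (i = 1; i <= n; i++) month[names[i]] = i
  }
  # Days from 1970-01-01 to y-m-d in the Gregorian calendar.
  function days(y, m, d,    era, yoe, doy, doe) {
    if (m <= 2) y--
    era = int(y / 400)
    yoe = y - era * 400
    doy = int((153 * (m > 2 ? m - 3 : m + 9) + 2) / 5) + d - 1
    doe = yoe * 365 + int(yoe / 4) - int(yoe / 100) + doy
    return era * 146097 + doe - 719468
  }
  {
    split(substr($4, 2), t, "[/:]")     # day, month name, year, hh, mm, ss
    off = substr($5, 1, 5)              # for example +0100
    sign = (substr(off, 1, 1) == "-") ? -1 : 1
    offsec = sign * (substr(off, 2, 2) * 3600 + substr(off, 4, 2) * 60)
    epoch = days(t[3], month[t[2]], t[1]) * 86400 \
            + t[4] * 3600 + t[5] * 60 + t[6] - offsec
    printf "%s,%d\n", $1, epoch
  }' "$@"
}
```

It can be called as `to_epoch_csv access.log > out.csv` or fed from a pipe.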
gawk '{
dt=substr($4,2,11);
gsub(/\//," ",dt);
cmd="date -d \""dt"\" +%s";
cmd|getline ts;
close(cmd);  # close each pipe so open file descriptors do not accumulate
print $1, ts
}' yourfile