Asynchronous reading of stdout - raku

I've written this simple script; it generates one output line per second (generator.sh):
for i in {0..5}; do echo $i; sleep 1; done
The Raku program launches this script and prints the lines as soon as they appear:
my $proc = Proc::Async.new("sh", "generator.sh");
$proc.stdout.tap({ .print });
my $promise = $proc.start;
await $promise;
Everything works as expected: every second we see a new line. But let's rewrite the generator in Raku (generator.raku):
for 0..5 { .say; sleep 1 }
and change the first line of the program to this:
my $proc = Proc::Async.new("raku", "generator.raku");
Now something is wrong: first we see the first line of output ("0"), then there is a long pause, and finally we see all the remaining lines at once.
I tried to capture the output of both generators via the script command:
script -c 'sh generator.sh' script-sh
script -c 'raku generator.raku' script-raku
Then I analyzed them in a hex editor, and they look identical: each digit is followed by the bytes 0d 0a.
Why is there such a difference between seemingly identical generators? I need to understand this because I am going to launch an external program and process its output as it appears.

Why is there such a difference between seemingly identical generators?
First, with regard to the title, the issue is not about the reading side, but rather the writing side.
Raku's I/O implementation looks at whether STDOUT is attached to a TTY. If it is a TTY, any output is immediately written to the output handle. However, if it's not a TTY, then it will apply buffering, which results in a significant performance improvement but at the cost of the output being chunked by the buffer size.
If you change generator.raku to disable output buffering:
$*OUT.out-buffer = False; for 0..5 { .say; sleep 1 }
Then the output will be seen immediately.
I need to understand this because I am going to launch an external program and process its output as it appears.
It'll only be an issue if the external program you launch also has such a buffering policy.
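A quick way to check whether a program you plan to launch has such a policy is to compare its behaviour at a terminal with its behaviour through a pipe (purely illustrative, reusing the generator from the question):
# At an interactive terminal stdout is a TTY, so a line appears every second.
raku generator.raku
# Through a pipe stdout is not a TTY, so the output arrives in one chunk at the end.
raku generator.raku | cat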

In addition to Jonathan Worthington's answer: although buffering is an issue on the writing side, it is possible to cope with it from the reading side. stdbuf, unbuffer, or script can be used on Linux (see this discussion). On Windows, only winpty helped me, which I found here.
So, if the winpty.exe, winpty-agent.exe, winpty.dll, and msys-2.0.dll files are in the working directory, this code can be used to run a program without buffering:
my $proc = Proc::Async.new(<winpty.exe -Xallow-non-tty -Xplain raku generator.raku>);
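On Linux, an analogous wrapper approach could be used; for example (an untested sketch, assuming the unbuffer utility from the expect package is installed; like winpty, it runs the child under a pseudo-terminal so the child's TTY check succeeds):
my $proc = Proc::Async.new(<unbuffer raku generator.raku>);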

Related

Is it possible to enable exit on error behavior in an interactive Tcl shell?

I need to automate a huge interactive Tcl program using Tcl expect.
As I've realized, this territory is really dangerous: I need to extend the already existing mass of code, but I can't rely on errors actually causing the program to fail with a non-zero exit code, as I could in a regular script.
This means I have to think about every possible thing that could go wrong and "expect" it.
What I currently do in my own code is use a "die" procedure, which exits automatically, instead of raising an error. But this kind of error condition cannot be caught, which makes it hard to detect errors, especially in code not written by me, since ultimately most library routines are error-based.
Since I have access to the program's Tcl shell, is it possible to enable fail-on-error?
EDIT:
I am using Tcl 8.3, which is a severe limitation in terms of available tools.
Examples of errors I'd like to automatically exit on:
% puts $a(2)
can't read "a(2)": no such element in array
while evaluating {puts $a(2)}
%
% blublabla
invalid command name "blublabla"
while evaluating blublabla
%
As well as any other error that makes a normal script terminate.
These can bubble up from 10 levels deep within procedure calls.
I also tried redefining the global error command, but not all errors that can occur in Tcl use it. For instance, the above "command not found" error did not go through my custom error procedure.
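For illustration, the redefinition I tried was roughly along these lines (a sketch, not my exact code):
# Wrap the built-in error command so that any call to it terminates the shell.
rename ::error ::original_error
proc ::error {message args} {
    puts stderr $message
    exit 1
}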
Since I have access to the program's Tcl shell, is it possible to enable fail-on-error?
Let me try to summarize in my words: You want to exit from an interactive Tcl shell upon error, rather than having the prompt offered again?
Update
I am using Tcl 8.3, which is a severe limitation in terms of available tools [...] only source patches to the C code.
As you seem to be deep down in that rabbit hole, why not add another source patch?
--- tclMain.c 2002-03-26 03:26:58.000000000 +0100
+++ tclMain.c.mrcalvin 2019-10-23 22:49:14.000000000 +0200
@@ -328,6 +328,7 @@
             Tcl_WriteObj(errChannel, Tcl_GetObjResult(interp));
             Tcl_WriteChars(errChannel, "\n", 1);
         }
+        Tcl_Exit(1);
     } else if (tsdPtr->tty) {
         resultPtr = Tcl_GetObjResult(interp);
         Tcl_GetStringFromObj(resultPtr, &length);
This is untested; the Tcl 8.3.5 sources don't compile for me. But this section of Tcl's internals is comparable to the current sources, as checked against my Tcl 8.6 source installation.
For the record
With a stock shell (tclsh), this is a little fiddly, I am afraid. The following might work for you (though I can imagine cases where it might fail you). The idea is:
to intercept writes to stderr (this is where an interactive shell sends error messages before returning to the prompt);
to discriminate between arbitrary writes to stderr and actual error cases, using the global variable ::errorInfo as a sentinel.
Step 1: Define a channel interceptor
oo::class create Bouncer {
    method initialize {handle mode} {
        if {$mode ne "write"} {error "can't handle reading"}
        return {finalize initialize write}
    }
    method finalize {handle} {
        # NOOP
    }
    method write {handle bytes} {
        if {[info exists ::errorInfo]} {
            # This is an actual error;
            # 1) Print the message (as usual), but to stdout
            fconfigure stdout -translation binary
            puts stdout $bytes
            # 2) Call on [exit] to quit the Tcl process
            exit 1
        } else {
            # Non-error write to stderr, proceed as usual
            return $bytes
        }
    }
}
Step 2: Register the interceptor for stderr in interactive shells
if {[info exists ::tcl_interactive]} {
    chan push stderr [Bouncer new]
}
Once registered, this will make your interactive shell behave like so:
% puts stderr "Goes, as usual!"
Goes, as usual!
% error "Bye, bye"
Bye, bye
Some remarks
You need to be careful about the Bouncer's write method: the error message has already been massaged for the character encoding (hence the fconfigure call).
You might want to put this into a Tcl package or Tcl module, to load the bouncer using package req.
I could imagine that if your program writes to stderr while the errorInfo variable happens to be set (as a left-over), this will trigger an unintended exit.

Read all lines at the same time individually - Solaris ksh

I need some help with a script. Solaris 10 and ksh.
I have a file called /temp.list with this content:
192.168.0.1
192.168.0.2
192.168.0.3
So, I have a script which reads this list and executes some commands using each line's value:
FILE_TMP="/temp.list"
while IFS= read line
do
    ping $line
done < "$FILE_TMP"
It works, but it executes the command on line 1; when that finishes, it moves on to line 2, and so on until the end. I would like to find a way to execute the ping command on every line of the list at the same time. Is there a way to do it?
Thank you in advance!
Marcus Quintella
As Ari suggested, googling "ksh multithreading" will produce a lot of ideas/solutions.
A simple example:
FILE_TMP="/temp.list"
while IFS= read line
do
    ping $line &
done < "$FILE_TMP"
The trailing '&' kicks the ping command off in the background, allowing the loop to continue while the ping is still running.
'course, this is just the tip of the proverbial iceberg as you now need to consider:
multiple ping commands are going to be dumping output to stdout (ie, you're going to get a mish-mash of ping output in your console), so you'll need to give some thought as to what to do with multiple streams of output (eg, redirect to a common file? redirect to separate files?)
you need to have some idea as to how you want to go about managing and (possibly) terminating commands running in the background [ see jobs, ps, fg, bg, kill ]
if running in a shell script you'll likely find yourself wanting to suspend the main shell script processing until all background jobs have completed [ see wait ]
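Putting those pieces together, a minimal sketch might look like this (untested; the /tmp output paths are just placeholders):
FILE_TMP="/temp.list"
while IFS= read line
do
    # Kick off each ping in the background, sending its output to its own file.
    ping $line > "/tmp/ping.${line}.out" 2>&1 &
done < "$FILE_TMP"
# Suspend the main script until all background pings have completed.
wait
echo "all pings finished"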

while [[ condition ]] stalls on loop exit

I have a problem with ksh in that a while loop is failing to obey the "while" condition. I should add now that this is ksh88 on my client's Solaris box. (That's a separate problem that can't be addressed in this forum. ;) I have seen Lance's question and some similar but none that I have found seem to address this. (Disclaimer: NO I haven't looked at every ksh question in this forum)
Here's a very cut down piece of code that replicates the problem:
#!/usr/bin/ksh
#
go=1
set -x
tail -0f loop-test.txt | while [[ $go -eq 1 ]]
do
    read lbuff
    set $lbuff
    nwords=$#
    printf "Line has %d words <%s>\n" $nwords "${lbuff}"
    if [[ "${lbuff}" = "0" ]]
    then
        printf "Line consists of %s; time to absquatulate\n" $lbuff
        go=0    # Violate the WHILE condition to get out of loop
    fi
done
printf "\nLooks like I've fallen out of the loop\n"
exit 0
The way I test this is:
Run loop-test.sh in background mode
In a different window I run commands like "echo some nonsense >>loop-test.txt" (w/o the quotes, of course)
When I wish to exit, I type "echo 0 >>loop-test.txt"
What happens? It indeed sets go=0 and displays the line:
Line consists of 0; time to absquatulate
but does not exit the loop. To break out I append one more line to the txt file. The loop does NOT process that line and just falls out of the loop, issuing that "fallen out" message before exiting.
What's going on with this? I don't want to use "break" because in the actual script, the loop is monitoring the log of a database engine and the flag is set when it sees messages that the engine is shutting down. The actual script must still process those final lines before exiting.
Open to ideas, anyone?
Thanks much!
-- J.
OK, that flopped pretty quick. After reading a few other posts, I found an answer given by dogbane that sidesteps my entire pipe-to-while scheme. His is the second answer to a question (from 2013) where I see neeraj is using the same scheme I'm using.
What was wrong? The pipe-to-while has always worked for input that will end, like a file or a command with a distinct end to its output. However, from a tail command, there is no distinct EOF. Hence, the while-in-a-subshell doesn't know when to terminate.
Dogbane's solution: Don't use a pipe. Applying his logic to my situation, the basic loop is:
while read line
do
    # put loop body here
done < <(tail -0f ${logfile})
No subshell, no problem.
Caveat about that syntax: there must be a space between the two < operators; otherwise it looks like a here-document with bad syntax.
Er, one more catch: The syntax did not work in ksh, not even in the mksh (under cygwin) which emulates ksh93. But it did work in bash. So my boss is gonna have a good laugh at me, 'cause he knows I dislike bash.
So thanks MUCH, dogbane.
-- J
After articulating the problem and sleeping on it, the reason for the described behavior came to me: after setting go=0, the while condition isn't re-tested until the next iteration, and the read at the top of the loop body blocks until another line of data comes in from STDIN via that pipe.
And now that I have realized the cause of the weirdness, I can speculate on an alternative way of reading from the stream. For the moment I am thinking of the following solution:
Open the input file as STDIN (Need to research the exec syntax for that)
When the condition occurs, close STDIN (Again, need to research the syntax for that)
It should then be safe to use the more intuitive "while read lbuff" at the top of the loop.
I'll test this out today and post the result. I'd hope someone else might benefit from the method (if it works).
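For anyone curious, here is roughly the shape of what I have in mind (untested; one caveat is that, unlike tail -0f, a plain redirection will hit end-of-file instead of waiting for new data):
# Step 1: open the log file as the script's STDIN.
exec 0< ${logfile}
while read lbuff
do
    # ... process ${lbuff} here, as in the original loop body ...
    if [[ "${lbuff}" = "0" ]]
    then
        # Step 2: close STDIN; the next read then fails and the loop ends.
        exec 0<&-
    fi
done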

Perl SQL file write delayed

Here is a simple Perl script fetching data from SQL.
It reads the data and writes it to a file (OUTFILE), and prints the data to the screen for every 10,000th line.
One thing I am curious about is that printing the data to the screen finishes very quickly (in 30 seconds); however, fetching the data and writing it to the file finishes very slowly (30 minutes later).
The amount of data is not large. The output file's size is less than 100 MB.
while ( my ($a,$b) = $curSqlEid->fetchrow_array() )
{
    printf OUTFILE ("%s,%d\n", $a,$b);
    $counter ++;
    if($counter % 10000 == 0){
        printf ("%s,%d\n", $a,$b);
    }
}
$curSqlEid->finish();
$dbh->disconnect();
close(OUTFILE);
You are suffering from buffering.
Handles other than STDERR are buffered by default, and most handles use block buffering. That means Perl will wait until there is 8KB* of data to write before sending anything to the system.
STDOUT is special. When it is attached to a terminal (and only then), it uses a different kind of buffering: line buffering. With line buffering, the data is flushed every time a newline is encountered in the data to write.
You can see this by running
$ perl -e'print "abc"; print "def"; sleep 5; print "\n"; sleep 5;'
[ 5 seconds pass ]
abcdef
[ 5 seconds pass ]
$ perl -e'print "abc"; print "def"; sleep 5; print "\n"; sleep 5;' | cat
[ 10 seconds pass ]
abcdef
The solution is to turn off buffering.
use IO::Handle qw( ); # Not needed on Perl 5.14 or later
OUTFILE->autoflush(1);
* — 8KB is the default. It can be configured when Perl is compiled. It used to be a non-configurable 4KB until 5.14.
I think you are seeing the output file size as 0 while the script is running and displaying on the console. Do not go by that. The file size will show up only once the script has finished. This is due to output buffering.
Anyways, the delay cannot be as large as 30 min. Once the script is done, you should see the output file data.
I tried various things, but my final conclusion is that Python and Perl handle data flow from the DB in fundamentally different ways. It looks like in Perl it is possible to handle the data line by line while it is being transferred from the DB, whereas in Python it has to wait until the entire data set has been downloaded from the server before processing it.

Nano hacks: most useful tiny programs you've coded or come across

It's the first great virtue of programmers. All of us have, at one time or another, automated a task with a bit of throw-away code. Sometimes it takes a couple of seconds tapping out a one-liner; sometimes we spend an exorbitant amount of time automating away a two-second task and then never use it again.
What tiny hack have you found useful enough to reuse? To go so far as to make an alias for?
Note: before answering, please check to make sure it's not already covered in the "favourite command-line tricks using BASH" or "perl/ruby one-liner" questions.
I found this on dotfiles.org just today. It's very simple, but clever. I felt stupid for not having thought of it myself.
###
### Handy Extract Program
###
extract () {
    if [ -f $1 ] ; then
        case $1 in
            *.tar.bz2)  tar xvjf $1    ;;
            *.tar.gz)   tar xvzf $1    ;;
            *.bz2)      bunzip2 $1     ;;
            *.rar)      unrar x $1     ;;
            *.gz)       gunzip $1      ;;
            *.tar)      tar xvf $1     ;;
            *.tbz2)     tar xvjf $1    ;;
            *.tgz)      tar xvzf $1    ;;
            *.zip)      unzip $1       ;;
            *.Z)        uncompress $1  ;;
            *.7z)       7z x $1        ;;
            *)          echo "'$1' cannot be extracted via >extract<" ;;
        esac
    else
        echo "'$1' is not a valid file"
    fi
}
Here's a filter that puts commas in the middle of any large numbers in standard input.
$ cat ~/bin/comma
#!/usr/bin/perl -p
s/(\d{4,})/commify($1)/ge;
sub commify {
    local $_ = shift;
    1 while s/^([ -+]?\d+)(\d{3})/$1,$2/;
    return $_;
}
I usually wind up using it for long output lists of big numbers, and I tire of counting decimal places. Now instead of seeing
-rw-r--r-- 1 alester alester 2244487404 Oct 6 15:38 listdetail.sql
I can run that as ls -l | comma and see
-rw-r--r-- 1 alester alester 2,244,487,404 Oct 6 15:38 listdetail.sql
This script saved my career!
Quite a few years ago, I was working remotely on a client database. I updated a shipment to change its status. But I forgot the where clause.
I'll never forget the feeling in the pit of my stomach when I saw (6834 rows affected). I basically spent the entire night going through event logs and figuring out the proper status on all those shipments. Crap!
So I wrote a script (originally in awk) that would start a transaction for any updates, and check the rows affected before committing. This prevented any surprises.
So now I never do updates from command line without going through a script like this. Here it is (now in Python):
import sys
import subprocess as sp

pgm = "isql"

if len(sys.argv) == 1:
    print "Usage: \nsql sql-string [rows-affected]"
    sys.exit()

sql_str = sys.argv[1].upper()
max_rows_affected = 3
if len(sys.argv) > 2:
    max_rows_affected = int(sys.argv[2])

if sql_str.startswith("UPDATE"):
    sql_str = "BEGIN TRANSACTION\\n" + sql_str
    p1 = sp.Popen([pgm, sql_str], stdout=sp.PIPE, shell=True)
    (stdout, stderr) = p1.communicate()
    print stdout
    # example -> (33 rows affected)
    affected = stdout.splitlines()[-1]
    affected = affected.split()[0].lstrip('(')
    num_affected = int(affected)
    if num_affected > max_rows_affected:
        print "WARNING! ", num_affected, "rows were affected, rolling back..."
        sql_str = "ROLLBACK TRANSACTION"
        ret_code = sp.call([pgm, sql_str], shell=True)
    else:
        sql_str = "COMMIT TRANSACTION"
        ret_code = sp.call([pgm, sql_str], shell=True)
else:
    ret_code = sp.call([pgm, sql_str], shell=True)
I use this script under assorted linuxes to check whether a directory copy between machines (or to CD/DVD) worked or whether copying (e.g. ext3 utf8 filenames -> fuseblk) has mangled special characters in the filenames.
#!/bin/bash
## dsum Do checksums recursively over a directory.
## Typical usage: dsum <directory> > outfile
export LC_ALL=C # Optional - use sort order across different locales
if [ $# != 1 ]; then echo "Usage: ${0/*\//} <directory>" 1>&2; exit; fi
cd $1 1>&2 || exit
#findargs=-follow # Uncomment to follow symbolic links
find . $findargs -type f | sort | xargs -d'\n' cksum
Sorry, don't have the exact code handy, but I coded a regular expression for searching source code in VS.Net that allowed me to search anything not in comments. It came in very useful in a particular project I was working on, where people insisted that commenting out code was good practice, in case you wanted to go back and see what the code used to do.
I have two Ruby scripts that I modify regularly to download all of various webcomics. Extremely handy! Note: they require wget, so probably Linux. Note 2: read these before you try them; they need a little bit of modification for each site.
Date based downloader:
#!/usr/bin/ruby -w
Day = 60 * 60 * 24
Fromat = "hjlsdahjsd/comics/st%Y%m%d.gif"
t = Time.local(2005, 2, 5)
MWF = [1,3,5]
until t == Time.local(2007, 7, 9)
  if MWF.include? t.wday
    `wget #{t.strftime(Fromat)}`
    sleep 3
  end
  t += Day
end
Or you can use the number based one:
#!/usr/bin/ruby -w
Fromat = "http://fdsafdsa/comics/%08d.gif"
1.upto(986) do |i|
  `wget #{sprintf(Fromat, i)}`
  sleep 1
end
Instead of having to repeatedly open files in SQL Query Analyser and run them, I found the syntax needed to make a batch file, and could then run 100 at once. Oh the sweet sweet joy! I've used this ever since.
isqlw -S servername -d dbname -E -i F:\blah\whatever.sql -o F:\results.txt
This goes back to my COBOL days but I had two generic COBOL programs, one batch and one online (mainframe folks will know what these are). They were shells of a program that could take any set of parameters and/or files and be run, batch or executed in an IMS test region. I had them set up so that depending on the parameters I could access files, databases(DB2 or IMS DB) and or just manipulate working storage or whatever.
It was great because I could test that date function without guessing, or test why there was truncation, or why there was a database ABEND. The programs grew in size as time went on, coming to include all sorts of tests, and became a staple of the development group. Everyone knew where the code resided and included them in their unit testing as well. Those programs got so large (most of the code was commented-out tests), and it was all contributed by people through the years. They saved so much time and settled so many disagreements!
I coded a Perl script to map dependencies, without going into an endless loop, for a legacy C program I inherited ... that also had a diamond dependency problem.
I wrote a small program that e-mailed me when I received e-mails from friends on a rarely used e-mail account.
I wrote another small program that sent me text messages if my home IP changed.
To name a few.
Years ago I built a suite of applications on a custom web application platform in PERL.
One cool feature was to convert SQL query strings into human readable sentences that described what the results were.
The code was relatively short but the end effect was nice.
I've got a little app that you run and it dumps a GUID into the clipboard. You can run it with /noui or not. With UI, it's a single button that drops a new GUID every time you click it. Without, it drops a new one and then exits.
I mostly use it from within VS. I have it as an external app and mapped to a shortcut. I'm writing an app that relies heavily on xaml and guids, so I always find I need to paste a new guid into xaml...
Any time I write a clever list comprehension or use of map/reduce in python. There was one like this:
if reduce(lambda x, c: locks[x] and c, locknames, True):
    print "Sub-threads terminated!"
The reason I remember that is that I came up with it myself, then saw the exact same code on somebody else's website. Nowadays it'd probably be done like:
if all(map(lambda z: locks[z], locknames)):
    print "ya trik"
I've got 20 or 30 of these things lying around, because once I coded up the framework for my standard console app in Windows, I can pretty much drop in any logic I want, so I've ended up with a lot of these little things that solve specific problems.
I guess the one I'm using a lot right now is a console app that takes stdin and colorizes the output based on XML profiles that match regular expressions to colors. I use it for watching my log files from builds. The other one is a command-line launcher, so I don't pollute my PATH env var, which would exceed the limit on some systems anyway, namely Win2k.
I'm constantly connecting to various linux servers from my own desktop throughout my workday, so I created a few aliases that will launch an xterm on those machines and set the title, background color, and other tweaks:
alias x="xterm" # local
alias xd="ssh -Xf me@development_host xterm -bg aliceblue -ls -sb -bc -geometry 100x30 -title Development"
alias xp="ssh -Xf me@production_host xterm -bg thistle1 ..."
I have a bunch of servers I frequently connect to, as well, but they're all on my local network. This Ruby script prints out the command to create aliases for any machine with ssh open:
#!/usr/bin/env ruby
require 'rubygems'
require 'dnssd'
handle = DNSSD.browse('_ssh._tcp') do |reply|
  print "alias #{reply.name}='ssh #{reply.name}.#{reply.domain}';"
end
sleep 1
handle.stop
Use it like this in your .bash_profile:
eval `ruby ~/.alias_shares`