Raku-native disk space usage - raku

Purpose:
Save a program that writes data to disk from vain attempts of writing to a full filesystem;
Save bandwidth (don't download if nowhere to store);
Save user's and programmer's time and nerves (notify them of the problem instead of having them tearing out their hair with reading misleading error messages and "why the heck this software is not working!").
The question comes in 2 parts:
Reporting storage space statistics (available, used, total etc.), either of all filesystems or of the filesystem that path in question belongs to.
Reporting a filesystem error on running out of space.
Part 1
Share please NATIVE Raku alternative(s) (TIMTOWTDIBSCINABTE "Tim Toady Bicarbonate") to:
raku -e 'qqx{ df -P $*CWD }.print'
Here, raku -executes df (disk free) external program via shell quoting with interpolation qqx{}, feeding -Portable-format argument and $*CWD Current Working Directory, then .prints the df's output.
The snippet initially had been written as raku -e 'qqx{ df -hP $*CWD }.print' — with both -human-readable and -Portable — but it turned out that it is not a ubiquitously valid command. In OpenBSD 7.0, it exits with an error: df: -h and -i are incompatible with -P.
For adding human-readability, you may consider Number::Bytes::Human module

raku -e 'run <<df -hP $*CWD>>'
If you're just outputting what df gives you on STDOUT, you don't need to do anything.
The << >> are double quoting words, so that the $*CWD will be interpolated.

Part 1 — Reporting storage space statistics
There's no built in function for reporting storage space statistics. Options include:
Write Raku code (a few lines) that uses NativeCall to invoke a platform / filesystem specific system call (such as statvfs()) and uses the information returned by that call.
Use a suitable Raku library. FileSystem::Capacity picks and runs an external program for you, and then makes its resulting data available in a portable form.
Use run (or similar1) to invoke a specific external program such as df.
Use an Inline::* foreign language adaptor to enable invoking of a foreign PL's solution for reporting storage space statistics, and use the info it provides.2
Part 2 — Reporting running out of space
Raku seems to neatly report No space left on device:
> spurt '/tmp/failwrite', 'filesystem is full!'
Failed to write bytes to filehandle: No space left on device
in block <unit> at <unknown file> line 1
> mkdir '/tmp/failmkdir'
Failed to create directory '/tmp/failmkdir' with mode '0o777': Failed to mkdir: No space left on device
in block <unit> at <unknown file> line 1
(Programmers will need to avoid throwing away these exceptions.)
Footnotes
1 run runs an external command without involving a shell. This guarantees that the risks attendant with involving a shell are eliminated. That said, Raku also supports use of a shell (because that can be convenient and appropriate in some scenarios). See the exchange of comments under the question (eg this one) for some brief discussion of this, and the shell doc for a summary of the risk:
All shell metacharacters are interpreted by the shell, including pipes, redirects, environment variable substitutions and so on. Shell escapes are a severe security concern and can cause confusion with unusual file names. Use run if you want to be safe.
2 Foreign language adaptors for Raku (Raku modules in the Inline:: namespace) allow Raku code to use code written in other languages. These adaptors are not part of the Raku language standard, and most are barely experimental status, if that, but, conversely, the best are in great shape and allow Raku code to use foreign libraries as if they were written for Raku. (As of 2021 Inline::Perl5 is the most polished.)

Related

Paramiko, channel.recv(9999) causing confusion [duplicate]

I am using Python's Paramiko library to SSH a remote machine and fetch some output from command-line. I see a lot of junk printing along with the actual output. How to get rid of this?
chan1.send("ls\n")
output = chan1.recv(1024).decode("utf-8")
print(output)
[u'Last login: Wed Oct 21 18:08:53 2015 from 172.16.200.77\r', u'\x1b[2J\x1b[1;1H[local]cli#BENU>enable', u'[local]cli#BENU#Configure',
I want to eliminate, [2J\x1b[1;1H and u from the output. They are junk.
It's not a junk. These are ANSI escape codes that are normally interpreted by a terminal client to pretty print the output.
If the server is correctly configured, you get these only, when you use an interactive terminal, in other words, if you requested a pseudo terminal for the session (what you should not, if you are automating the session).
The Paramiko automatically requests the pseudo terminal, if you used the SSHClient.invoke_shell, as that is supposed to be used for implementing an interactive terminal. See also How do I start a shell without terminal emulation in Python Paramiko?
If you automate an execution of remote commands, you better use the SSHClient.exec_command, which does not allocate the pseudo terminal by default (unless you override by the get_pty=True argument).
stdin, stdout, stderr = client.exec_command('ls')
See also What is the difference between exec_command and send with invoke_shell() on Paramiko?
Or as a workaround, see How can I remove the ANSI escape sequences from a string in python.
Though that's rather a hack and might not be sufficient. You might have other problems with the interactive terminal, not only the escape sequences.
You particularly are probably not interested in the "Last login" message and command-prompt (cli#BENU>) either. You do not get these with the exec_command.
If you need to use the "shell" channel due to some specific requirements or limitations of the server, note that it is technically possible to use the "shell" channel without the pseudo terminal. But Paramiko SSHClient.invoke_shell does not allow that. Instead, you can create the "shell" channel manually. See Can I call Channel.invoke_shell() without calling Channel.get_pty() beforehand, when NOT using Channel.exec_command().
And finally the u is not a part of the actual string value (note that it's outside the quotes). It's an indication that the string value is in the Unicode encoding. You want that!
This is actually not junk. The u before the string indicates that this is a unicode string. The \x1b[2J\x1b[1;1H is an escape sequence. I don't know exactly what it is supposed to do, but it appears to clear the screen when I print it out.
To see what I mean, try this code:
for string in output:
print string

Can man pass an option to the roff formatter?

SYNOPSIS
From man(1):
-l
Format and display local manual files instead of
searching through the system's manual collection.
-t
Use groff -mandoc to format the manual page to stdout.
From groff_tmac(5):
papersize
This macro file is already loaded at start-up by troff so it
isn't necessary to call it explicitly. It provides an interface
to set the paper size on the command line with the option
-dpaper=size. Possible values for size are the same as
the predefined papersize values in the DESC file (only
lowercase; see groff_font(5) for more) except a7–d7.
An appended l (ell) character denotes landscape orientation.
Examples: a4, c3l, letterl.
Most output drivers need additional command-line switches -p
and -l to override the default paper length and orientation
as set in the driver-specific DESC file. For example, use the
following for PS output on A4 paper in landscape orientation:
sh# groff -Tps -dpaper=a4l -P-pa4 -P-l -ms foo.ms > foo.ps
THE PROBLEM
I would like to use these to format local and system man pages to print out, but want to switch the paper size from letter to A4. Unfortunately I couldn't find anything in man(1) about passing options to the underlying roff formatter.
Right now I can use
zcat `man -w man` | groff -tman -dpaper=a4 -P-pa4
to format man(1) on stdout, but that's kind of long and I'd rather have man build the pipeline for me if I can. In addition the above pipeline might need changing for more complicated man pages, and while I could use grog, even it doesn't detect things like accented characters (for groff's -k option), while man does (perhaps using locale settings).
The man command is typically intended only for searching for and displaying manual pages on a TTY device, not for producing typeset and paper printed output.
Depending on the host system, and/or the programs of interest, the a fully typeset printable form of a manual page can sometimes be generated when a program (or the whole system) is compiled. This is more common for system documents and less common for manual pages though.
Note that depending on which manual pages you are trying to print there may be additional steps required. Traditionally the following pipeline would be used to cover all the bases:
grap $MANFILE | pic | tbl | eqn /usr/pub/eqnchar | troff -tman -Tps | lpr -Pps
Your best solution for simplifying your command line would probably be to write a little tiny script which encapsulates what you're doing. Note that man -w might find several different filenames, so you would probably want to print each separately (or maybe only print the first one).

How to do an incremental read of binary files

TL;DR: can I do an incremental read of binary files with Red or Rebol?
I would like to use Red to process some large (13MB to 2GB) structured binary files (Kurzweil synthesizer files). I've used other languages (C, Go, Tcl, Ruby, Dart) to walk through these files, and now I'd like to do the same with Red or Rebol.
Is there a way to incrementally read binary files, byte by byte? All I see is read/binary which seems to slurp the entire file at once (or a part of a file).
I'll need to jump around a little bit, too (either peek at the next byte, or skip to the end of a section, or skip past variable length strings to the start of data).
(Yes, I could make some helpers that tracked the position and used read/part/seek.)
I would like to make a call to the low level OS read/seek if that is possible - something new to learn.
This is on macos, but a portable solution would be great.
Thanks!
PS: "open/read %abc" gives an error "*** Script Error: open does not allow file! for its port argument", even though the help message say the port argument is "port [port! file! url! block!]"
Rebol has ports for that, which are planned for 0.7.0 release in Red. So, current I/O is very basic and buffer-only, and open is a preliminary stub.
I would like to make a call to the low level OS read/seek if that is possible - something new to learn.
You can leverage Rebol or Red/System FFI as a learning excercise.
Here is how you would do it in Rebol:
>> file: open/direct/binary %file.dat
>> until [none? probe copy/part file 20]
>> close file
#{732F7072696E74657253657474696E6773312E62}
#{696E504B01022D00140006000800000021006149}
#{0910890100001103000010000000000000000000}
...
#{000000006A290000646F6350726F70732F617070}
#{2E786D6C504B0506000000000D000D0068030000}
#{292C00000000}
none
first file or pick file 1 will return the next byte value (integer!)
This even works with text files: open/lines/direct, in that case copy/part file 20 will return 20 lines, or you can use pick file 1 or first file to get the next line.
Soon this will be available on Red too.

program to reproduce itself and be useful -- not a quine

I have a program which performs a useful task. Now I want to produce the plain-text source code when the compiled executable runs, in addition to performing the original task. This is not a quine, but is probably related.
This capability would be useful in general, but my specific program is written in Fortran 90 and uses Mako Templates. When compiled it has access to the original source code files, but I want to be able to ensure that the source exists when a user runs the executable.
Is this possible to accomplish?
Here is an example of a simple Fortran 90 which does a simple task.
program exampl
implicit none
write(*,*) 'this is my useful output'
end program exampl
Can this program be modified such that it performs the same task (outputs a string when compiled) and outputs a Fortran 90 text file containing the source?
Thanks in advance
It's been so long since I have touched Fortran (and I've never dealt with Fortran 90) that I'm not certain but I see a basic approach that should work so long as the language supports string literals in the code.
Include your entire program inside itself in a block of literals. Obviously you can't include the literals within this, instead you need some sort of token that tells your program to include the block of literals.
Obviously this means you have two copies of the source, one inside the other. As this is ugly I wouldn't do it that way, but rather store your source with the include_me token in it and run it through a program that builds the nested files before you compile it. Note that this program will share a decent amount of code with the routine that recreates the code from the block of literals. If you're going to go this route I would also make the program spit out the source for this program so whoever is trying to modify the files doesn't need to deal with the two copies.
My original program (see question) is edited: add an include statement
Call this file "exampl.f90"
program exampl
implicit none
write(*,*) "this is my useful output"
open(unit=2,file="exampl_out.f90")
include "exampl_source.f90"
close(2)
end program exampl
Then another program (written in Python in this case) reads that source
import os
f=open('exampl.f90') # read in exampl.f90
g=open('exampl_source.f90','w') # and replace each line with write(*,*) 'line'
for line in f:
#print 'write(2,*) \''+line.rstrip()+'\'\n',
g.write('write(2,*) \''+line.rstrip()+'\'\n')
f.close
g.close
# then complie exampl.f90 (which includes exampl_source.f90)
os.system('gfortran exampl.f90')
os.system('/bin/rm exampl_source.f90')
Running this python script produces an executable. When the executable is run, it performs the original task AND prints the source code.

limitations of #! in scripts

It seems as if a script with #! prefix can have the interpreter name and ONLY one argument. Thus:
#!/bin/ls -l
works, but
#!/usr/bin/env ls -l
doesn't
Do you agree? Any thoughts?
Francesc
Different Unixes interpret #! differently. Here's a comprehensive-looking writeup: http://www.in-ulm.de/~mascheck/various/shebang/
It seems that the lowest common denominator across platforms is "the interpreter (which must not itself be a script) and no more than one argument".
Originally, we only had one shell on Unix. When you asked to run a command, the shell would attempt to invoke one of the exec() system calls on it. It the command was an executable, the exec would succeed and the command would run. If the exec() failed, the shell would not give up, instead it would try to interpret the command file as if it were a shell script.
Then unix got more shells and the situation became confused. Most folks would write scripts in one shell and type commands in another. And each shell had differing rules for feeding scripts to an interpreter.
This is when the “#! /” trick was invented. The idea was to let the kernel’s exec () system calls succeed with shell scripts. When the kernel tries to exec () a file, it looks at the first 4 bytes which represent an integer called a magic number. This tells the kernel if it should try to run the file or not. So “#! /” was added to magic numbers that the kernel knows and it was extended to actually be able to run shell scripts by itself. But some people could not type “#! /”, they kept leaving the space out. So the kernel was expended a bit again to allow “#!/” to work as a special 3 byte magic number.