How do I print text in Common Lisp so that I could format it with escape sequences (akin to display in Racket)?

How do I print formatted output in Common Lisp?
In Racket I do it with display, like so:
(display "\33[3min italics\33[m\n")
I've tried with (format t "~ain italics~a" "\33[3m" "\33[m") but it does not work. Neither does this: (format t "~cin italics~c" #\33[3m #\33[m).

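The main issue here is how to get the proper sequence of characters. \33 is octal for ASCII character 27, which is #\Esc in Common Lisp. Common Lisp strings have no octal escapes (a backslash in a string only quotes the next character), so "\33[3m" is read as the plain text 33[3m.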
(format t "~C[3min italics~C[m~%" #\Esc #\Esc)
would do what you want.
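For comparison, here is a minimal Python sketch of the same idea, where the escape character is built from its code point instead of relying on a string escape. The helper name italic is made up for illustration, and whether the text really shows up in italics depends on the terminal.

```python
ESC = chr(27)  # ASCII 27: the same character as octal \33 or #\Esc


def italic(text: str) -> str:
    """Wrap text in the ANSI 'italic on' (ESC[3m) and 'reset' (ESC[m) sequences."""
    return f"{ESC}[3m{text}{ESC}[m"


if __name__ == "__main__":
    print(italic("in italics"))
```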
But you could do better than that. There is a library called cl-interpol, which demonstrates the flexibility of Common Lisp by modifying the reader so that you can use the already familiar syntax.
For example:
* (ql:quickload 'cl-interpol)
To load "cl-interpol":
Load 1 ASDF system:
cl-interpol
; Loading "cl-interpol"
...
(CL-INTERPOL)
* (named-readtables:in-readtable :interpol-syntax)
#<NAMED-READTABLE :INTERPOL-SYNTAX {1002E6D6B3}>
* (format t #?"\33[3min italics\33[m\n")
in italics
NIL

Related

Fractions for variables names in Julia

In Julia you can write subscripts by \_ for variable names. I was wondering if there is anything similar for writing fractions in variable names. Something like \frac{}{} in LaTeX. I understand this may be harder as it takes two arguments. If there is none, I will use /. But in this case I would like to use some enclosures to make clear what is being differentiated. I assume () is not usable? [] or {} would be ok?
The subscripts or other non-Latin names you see in Julia code are just normal Unicode characters, the same as "regular" names. The LaTeX commands are only a feature of the Julia REPL that helps you remember and input them.
As for Unicode, in principle you can represent some simple fractions like ⁽²⁺ⁱ⁾⁄₍ₛ₊ₜ₎, using the ⁄ (U+2044 Fraction slash) symbol and subscripts and superscripts. The rendering depends on your font, but do not expect a vertical layout in any current fonts.
However, Julia treats ⁄ (U+2044 Fraction slash, not the / on your keyboard) as an "invalid character" during parsing. The same applies to \not, which can only be used in conjunction with some operators, so it's not an option either.
As for the brackets and the normal /, they are operators and are parsed differently. However, there is an (ugly) way to circumvent this: you can use macros to bypass the parsing and use strings as variable names. For example:
julia> macro n_str(name)
           esc(Symbol(name))
       end
@n_str (macro with 1 method)
julia> n"∂(2x + 3)/∂x" = 2
2
julia> 2n"∂(2x + 3)/∂x"
4

How to display UTF8 string in OS X Terminal

I can't believe I couldn't find a solution to this very simple issue. I have a command line tool in Objective C, and need to display UTF8 strings (with non-English characters) in the console. I can't use NSLog as it also displays process information, PID, timestamp etc. printf doesn't handle non-English characters well.
How can I print non-English characters in the Terminal, without any timestamps? Am I missing something really obvious here, or is such an extremely simple task really non-trivial in OS X?
I've tried:
printf: Doesn't display non-English characters.
NSLog: Displays PID/timestamp, which I don't want.
DLog (from https://stackoverflow.com/a/17311835/811405): Doesn't display non-English characters.
This works just fine:
printf("%s\n", [@"Can Poyrazoğlu" UTF8String]);
The macro you've tried to use depends on CFShow which doesn't print Unicode characters but only their escape codes. More information regarding this behaviour here.
So you could either use something other than CFShow in your macro to print to the console without any timestamps, or you could use an NSLog replacement library I wrote, Xcode Logger, and use its XLog_NH logger, which prints only the output without any other information.
Using stdio:
puts([@"Can Poyrazoğlu" UTF8String]);
Using write:
const char* example = [@"Can Poyrazoğlu" UTF8String];
write(STDOUT_FILENO, example, strlen(example));
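As a rough illustration of what the write variant does, here is a small Python sketch. It encodes the string to UTF-8 and hands the raw bytes to file descriptor 1, so nothing adds a prefix or re-escapes the characters. The names utf8_line and write_raw are hypothetical helpers, not part of any library mentioned above.

```python
import os
import sys


def utf8_line(text: str) -> bytes:
    """Encode text plus a trailing newline as UTF-8 bytes."""
    return (text + "\n").encode("utf-8")


def write_raw(text: str) -> int:
    """Write the UTF-8 bytes directly to file descriptor 1 (stdout).

    Returns the number of bytes written, which can exceed the number of
    characters because non-ASCII characters take more than one byte.
    """
    sys.stdout.flush()  # keep ordering with anything already buffered
    return os.write(1, utf8_line(text))
```

Calling write_raw("Can Poyrazoğlu") sends 16 bytes for 14 characters plus the newline, since ğ occupies two bytes in UTF-8. That is also why the C version passes strlen(example), a byte count, to write.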

How does %NNN$hhn work in a format string?

I am trying out a classic format string vulnerability. I want to know how exactly the following format string works:
"%NNN$hhn" where 'N' is any number.
E.g: printf("%144$hhn",....);
How does it work and how do I use this to overwrite any address I want with arbitrary value?
Thanks and Regards,
Hrishikesh Murali
It's a POSIX extension (not found in C99) which will simply allow you to select which argument from the argument list to use for the source of the data.
With regular printf, each % format specifier grabs the current argument from the list and advances the "pointer" to the next one. That means if you want to print a single value in two different ways, you need something like:
printf ("%c %d\n", chVal, chVal);
By using positional specifiers, you can do this as:
printf ("%1$c %1$d\n", chVal);
because both format specifiers will use the first argument as their source.
Another example on the wikipedia page is:
printf ("%2$d %2$#x; %1$d %1$#x",16,17);
which will give you the output:
17 0x11; 16 0x10
It basically allows you to disconnect the order of the format specifiers from the provided values, letting you bounce around the argument list in any way you want, using the values over and over again, in any arbitrary order.
Now whether you can use this as a user attack vector, I'm doubtful, since it only adds a means for the programmer to change the source of the data, not where the data is sent to.
It's no less secure than the regular style printf and I can see no real vulnerabilities unless you have the power to change the format string somehow. But, if you could do that, the regular printf would also be wide open to abuse.
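The argument-selection idea is not specific to C. As a loose analogy only (Python has no %n$ syntax and nothing comparable to the n conversion), str.format lets each field name the argument it reads, which reproduces the Wikipedia example above:

```python
# Each field names the argument it reads by index, so one value can be
# formatted several times and in any order. Indices are 0-based here,
# whereas printf's %n$ positions are 1-based.
line = "{1:d} {1:#x}; {0:d} {0:#x}".format(16, 17)
print(line)  # 17 0x11; 16 0x10
```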

How to identify binary and text files using Smalltalk

I want to verify that a given file in a path is of type text file, i.e. not binary, i.e. readable by a human. I guess reading first characters and check each character with :
isAlphaNumeric
isSpecial
isSeparator
isOctetCharacter ???
but joining all those testing methods with and: [ ... and: [ ... and: [ ] ] ] seems not to be very smalltalkish. Any suggestion for a more elegant way?
(There is a Python version here How to identify binary and text files using Python? which could be useful but syntax and implementation looks like C.)
Only heuristics are possible; you can never be really certain.
For ascii, the following may do:
| isPlausibleAscii numChecked isPossiblyText |
isPlausibleAscii :=
    [:char |
        (char codePoint between: 32 and: 127)
            or: [ char isSeparator ] ].
numChecked := text size min: 1024.
isPossiblyText := text from: 1 to: numChecked conform: isPlausibleAscii.
For unicode (UTF8 ?) things become more difficult; you could then try to convert. If there is a conversion error, assume binary.
PS: if you don't have from:to:conform:, replace by (copyFrom:to:) conform:
PPS: if you don't have conform: , try allSatisfy:
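For readers more at home in Python, the same heuristic combined with the "try to convert" idea for UTF-8 could be sketched roughly as follows. The function name looks_like_text and the 1024-byte limit are illustrative assumptions, and like any such check it can be fooled.

```python
import codecs


def looks_like_text(data: bytes, limit: int = 1024) -> bool:
    """Guess whether data is human-readable text by inspecting a prefix.

    The prefix is decoded as UTF-8; a decoding error is taken as a sign of
    binary content. Every decoded character must then be printable or
    whitespace.
    """
    sample = data[:limit]
    # An incremental decoder with final=False tolerates a multi-byte
    # sequence that was cut off at the end of the sample.
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        text = decoder.decode(sample, final=False)
    except UnicodeDecodeError:
        return False
    return all(ch.isprintable() or ch.isspace() for ch in text)
```

A caller would typically read the first kilobyte of the file in binary mode and pass it in, treating a True result as "probably text" and nothing stronger.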
All text contains more space than you'd expect to see in a binary file, and some encodings (UTF16/32) will contain lots of 0's for common languages.
A smalltalky solution would be to hide the gory details in a method on Standard/MultiByte-FileStream; #isProbablyText would probably be a good choice.
It would essentially do the following:
- Store the current state if you intend to use it later, then reset to the start (set a Latin1 converter if you use a MultiByteStream).
- Iterate over the next N characters (where N is an appropriate number).
- If you encounter a non-printable ASCII character, the file is probably binary, so return false. (There is no special selector for this; use a map, implement a new method on Character, or something similar.)
- Increase two counters as appropriate: one for space characters and another for zero characters.
- If the loop finishes, return whether either counter has reached a statistically significant count.
TLDR; Use a method to hide the gory details, otherwise it's pretty much the same.

Regular expression for extracting a number

I would like to be able to extract a number from within a string formatted as follows:
"<[1085674730]> hello foo1, how are you doing?"
I'm a novice with regular expressions, I only want to be able to extract a number that is enclosed in the greater/less-than and bracket symbols, but I'm not sure how to go about it. I have to match numeric digits only, but I'm not sure what syntax is used for only searching within these symbols.
UPDATE:
Thank you all for your input, and sorry for not being more specific. As I explained to kiamlaluno, I'm using VB.Net as the language for my application. I was wondering why some of the implementations were not working. In fact, the only one that did work was the one described by Matthew Flaschen. But that captures the symbols around the number as well as the number itself. I would like to only capture the number that is encased in the symbols and filter out the symbols themselves.
Use:
<\[(\d+)\]>
This is tested with ECMAScript regex.
It means:
\[ - literal [
( - open capturing group
\d - digit
+ - one or more
) - close capturing group
\] - literal ]
The overall functionality is to capture one or more digits surrounded by the given characters.
Combine Matthew's post with lookarounds (http://www.regular-expressions.info/lookaround.html). This will exclude the prefix and suffix from the match.
(?<=<\[)\d+(?=\]>)
I didn't test this regex but it should be very close to what you need. Double check at the link provided.
Hope this helps!
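To make the difference between the two approaches concrete, here is a small Python sketch that runs both patterns against the sample string from the question. It was only checked against Python's re module; other engines may differ in details such as lookbehind support.

```python
import re

subject = "<[1085674730]> hello foo1, how are you doing?"

# Capturing group: the overall match includes the delimiters,
# while group(1) holds only the digits.
with_group = re.search(r"<\[(\d+)\]>", subject)

# Lookarounds: the delimiters must be present but are not part of the match.
with_lookaround = re.search(r"(?<=<\[)\d+(?=\]>)", subject)

print(with_group.group(0))       # <[1085674730]>
print(with_group.group(1))       # 1085674730
print(with_lookaround.group(0))  # 1085674730
```

Either way the digits end up isolated: with the capturing group you read group 1 instead of the whole match, and with lookarounds the whole match is already just the number.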
$subject = "<[1085674730]> hello foo1, how are you doing?";
preg_match('/<\[(\d+)\]>/', $subject, $matches);
$matches[1] will contain the number you are looking for.
Use:
/<\[([[:digit:]]+)\]>/
If your implementation doesn't support the handy [:digit:] syntax, then use this:
/<\[([\d]+)\]>/
And if your implementation doesn't support the handy \d syntax, then use this:
/<\[([0-9]+)\]>/