Code Golf: Automata - puzzle

I made the ultimate laugh generator using these rules. Can you implement it in your favorite language in a clever manner?
Rules:
On every iteration, the following transformations occur.
H -> AH
A -> HA
AA -> HA
HH -> AH
AAH -> HA
HAA -> AH
n = 0 | H
n = 1 | AH
n = 2 | HAAH
n = 3 | AHAH
n = 4 | HAAHHAAH
n = 5 | AHAHHA
n = 6 | HAAHHAAHHA
n = 7 | AHAHHAAHHA
n = 8 | HAAHHAAHHAAHHA
n = 9 | AHAHHAAHAHHA
n = ...
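One reading of the rules that reproduces the table above is: scan the string left to right and, at each position, consume the longest pattern that matches. The following Python sketch follows that reading; the names `RULES`, `step` and `laugh` are illustrative and not part of the puzzle statement.

```python
# Rules from the question, keyed by the pattern they consume.
RULES = {
    "HAA": "AH", "HH": "AH", "H": "AH",
    "AAH": "HA", "AA": "HA", "A": "HA",
}

def step(s):
    """Apply one iteration, scanning left to right and preferring the longest pattern."""
    out = []
    i = 0
    while i < len(s):
        for width in (3, 2, 1):
            chunk = s[i:i + width]  # may be shorter than width near the end
            if chunk in RULES:
                out.append(RULES[chunk])
                i += len(chunk)
                break
        else:
            raise ValueError("unexpected character at index %d" % i)
    return "".join(out)

def laugh(n):
    """Return the string after n iterations, starting from 'H'."""
    s = "H"
    for _ in range(n):
        s = step(s)
    return s
```

Under this interpretation `laugh(5)` gives `AHAHHA`, matching the n = 5 row.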

Lex/Flex
69 characters. In the text here, I changed tabs to 8 spaces so it would look right, but all those consecutive spaces should be tabs, and the tabs are important, so it comes out to 69 characters.
#include <stdio.h>
%%
HAA|HH|H printf("AH");
AAH|AA|A printf("HA");
For what it's worth, the generated lex.yy.c is 42736 characters, but I don't think that really counts. I can (and soon will) write a pure-C version that will be much shorter and do the same thing, but I feel that should probably be a separate entry.
EDIT:
Here's a more legit Lex/Flex entry (302 characters):
char*c,*t;
#define s(a) t=c?realloc(c,strlen(c)+3):calloc(3,1);if(t)c=t,strcat(c,#a);
%%
free(c);c=NULL;
HAA|HH|H s(AH)
AAH|AA|A s(HA)
%%
int main(void){c=calloc(2,1);if(!c)return 1;*c='H';for(int n=0;n<10;n++)printf("n = %d | %s\n",n,c),yy_scan_string(c),yylex();return 0;}int yywrap(){return 1;}
This does multiple iterations (unlike the last one, which only did one iteration, and had to be manually seeded each time, but produced the correct results) and has the advantage of being extremely horrific-looking code. I use a function macro, the stringizing operator, and two global variables. If you want an even messier version that doesn't even check for malloc() failure, it looks like this (282 characters):
char*c,*t;
#define s(a) t=c?realloc(c,strlen(c)+3):calloc(3,1);c=t;strcat(c,#a);
%%
free(c);c=NULL;
HAA|HH|H s(AH)
AAH|AA|A s(HA)
%%
int main(void){c=calloc(2,1);*c='H';for(int n=0;n<10;n++)printf("n = %d | %s\n",n,c),yy_scan_string(c),yylex();return 0;}int yywrap(){return 1;}
An even worse version could be concocted where c is an array on the stack, and we just give it a MAX_BUFFER_SIZE of some sort, but I feel that's taking this too far.
...Just kidding. 207 characters if we take the "99 characters will always be enough" mindset:
char c[99]="H";
%%
c[0]=0;
HAA|HH|H strcat(c, "AH");
AAH|AA|A strcat(c, "HA");
%%
int main(void){for(int n=0;n<10;n++)printf("n = %d | %s\n",n,c),yy_scan_string(c),yylex();return 0;}int yywrap(){return 1;}
My preference is for the one that works best (i.e. the first one that can iterate until memory runs out and checks its errors), but this is code golf.
To compile the first one, type:
flex golf.l
gcc lex.yy.c -ll
(If you have lex instead of flex, just change flex to lex. They should be compatible.)
To compile the others, type:
flex golf.l
gcc -std=c99 lex.yy.c
Or else GCC will whine about ‘for’ loop initial declaration used outside C99 mode and other crap.
Pure C answer coming up.

MATLAB (v7.8.0):
73 characters (not including formatting characters used to make it look readable)
This script ("haha.m") assumes you have already defined the variable n:
s = 'H';
for i = 1:n,
s = regexprep(s,'(H)(H|AA)?|(A)(AH)?','${[137-$1 $1]}');
end
...and here's the one-line version:
s='H';for i=1:n,s = regexprep(s,'(H)(H|AA)?|(A)(AH)?','${[137-$1 $1]}');end
Test:
>> for n=0:10, haha; disp([num2str(n) ': ' s]); end
0: H
1: AH
2: HAAH
3: AHAH
4: HAAHHAAH
5: AHAHHA
6: HAAHHAAHHA
7: AHAHHAAHHA
8: HAAHHAAHHAAHHA
9: AHAHHAAHAHHA
10: HAAHHAAHHAHAAHHA

A simple translation to Haskell:
grammar = iterate step
where
step ('H':'A':'A':xs) = 'A':'H':step xs
step ('A':'A':'H':xs) = 'H':'A':step xs
step ('A':'A':xs) = 'H':'A':step xs
step ('H':'H':xs) = 'A':'H':step xs
step ('H':xs) = 'A':'H':step xs
step ('A':xs) = 'H':'A':step xs
step [] = []
And a shorter version (122 chars, optimized down to three derivation rules + base case):
grammar=iterate s where{i 'H'='A';i 'A'='H';s(n:'A':m:x)|n/=m=m:n:s x;s(n:m:x)|n==m=(i n):n:s x;s(n:x)=(i n):n:s x;s[]=[]}
And a translation to C++ (182 chars, only does one iteration, invoke with initial state on the command line):
#include<cstdio>
#define o putchar
int main(int,char**v){char*p=v[1];while(*p){p[1]==65&&~*p&p[2]?o(p[2]),o(*p),p+=3:*p==p[1]?o(137-*p++),o(*p++),p:(o(137-*p),o(*p++),p);}return 0;}

Javascript:
120 stripping whitespace and I'm leaving it alone now!
function f(n,s){s='H';while(n--){s=s.replace(/HAA|AAH|HH?|AA?/g,function(a){return a.match(/^H/)?'AH':'HA'});};return s}
Expanded:
function f(n,s)
{
s = 'H';
while (n--)
{
s = s.replace(/HAA|AAH|HH?|AA?/g, function(a) { return a.match(/^H/) ? 'AH' : 'HA' } );
};
return s
}
that replacer is expensive!

Here's a C# example, coming in at 321 bytes if I reduce whitespace to one space between each item.
Edit: In response to @Johannes Rössel's comment, I removed generics from the solution to eke out a few more bytes.
Edit: Another change, got rid of all temporary variables.
public static String E(String i)
{
return new Regex("HAA|AAH|HH|AA|A|H").Replace(i,
m => (String)new Hashtable {
{ "H", "AH" },
{ "A", "HA" },
{ "AA", "HA" },
{ "HH", "AH" },
{ "AAH", "HA" },
{ "HAA", "AH" }
}[m.Value]);
}
The rewritten solution with less whitespace, that still compiles, is 158 characters:
return new Regex("HAA|AAH|HH|AA|A|H").Replace(i,m =>(String)new Hashtable{{"H","AH"},{"A","HA"},{"AA","HA"},{"HH","AH"},{"AAH","HA"},{"HAA","AH"}}[m.Value]);
For a complete source code solution for Visual Studio 2008, a subversion repository with the necessary code, including unit tests, is available below.
Repository is here, username and password are both 'guest', without the quotes.

Ruby
This code golf is not very well specified -- I assumed that a function returning the n-th iteration string is the best way to solve it. It has 80 characters.
def f n
a='h'
n.times{a.gsub!(/(h(h|aa)?)|(a(ah?)?)/){$1.nil?? "ha":"ah"}}
a
end
Code printing out the first n strings (71 characters):
a='h';n.times{puts a.gsub!(/(h(h|aa)?)|(a(ah?)?)/){$1.nil?? "ha":"ah"}}

Erlang
241 bytes and ready to run:
> erl -noshell -s g i -s init stop
AHAHHAAHAHHA
-module(g).
-export([i/0]).
c("HAA"++T)->"AH"++c(T);
c("AAH"++T)->"HA"++c(T);
c("HH"++T)->"AH"++c(T);
c("AA"++T)->"HA"++c(T);
c("A"++T)->"HA"++c(T);
c("H"++T)->"AH"++c(T);
c([])->[].
i(0,L)->L;
i(N,L)->i(N-1,c(L)).
i()->io:format(i(9,"H")).
Could probably be improved.

Perl 168 characters.
(not counting unnecessary newlines)
perl -E'
($s,%m)=qw[H H AH A HA AA HA HH AH AAH HA HAA AH];
sub p{say qq[n = $_[0] | $_[1]]};p(0,$s);
for(1..9){$s=~s/(H(AA|H)?|A(AH?)?)/$m{$1}/g;p($_,$s)}
say q[n = ...]'
De-obfuscated:
use strict;
use warnings;
use 5.010;
my $str = 'H';
my %map = (
H => 'AH',
A => 'HA',
AA => 'HA',
HH => 'AH',
AAH => 'HA',
HAA => 'AH'
);
sub prn{
my( $n, $str ) = @_;
say "n = $n | $str"
}
prn( 0, $str );
for my $i ( 1..9 ){
$str =~ s(
(
H(?:AA|H)? # HAA | HH | H
|
A(?:AH?)? # AAH | AA | A
)
){
$map{$1}
}xge;
prn( $i, $str );
}
say 'n = ...';
Perl 150 characters.
(not counting unnecessary newlines)
perl -E'
$s="H";
sub p{say qq[n = $_[0] | $_[1]]};p(0,$s);
for(1..9){$s=~s/(?|(H)(?:AA|H)?|(A)(?:AH?)?)/("H"eq$1?"A":"H").$1/eg;p($_,$s)}
say q[n = ...]'
De-obfuscated
#! /usr/bin/env perl
use strict;
use warnings;
use 5.010;
my $str = 'H';
sub prn{
my( $n, $str ) = @_;
say "n = $n | $str"
}
prn( 0, $str );
for my $i ( 1..9 ){
$str =~ s{(?|
(H)(?:AA|H)? # HAA | HH | H
|
(A)(?:AH?)? # AAH | AA | A
)}{
( 'H' eq $1 ?'A' :'H' ).$1
}egx;
prn( $i, $str );
}
say 'n = ...';

Python (150 bytes)
import re
N = 10
s = "H"
for n in range(N):
print "n = %d |"% n, s
s = re.sub("(HAA|HH|H)|AAH|AA|A", lambda m: m.group(1) and "AH" or "HA",s)
Output
n = 0 | H
n = 1 | AH
n = 2 | HAAH
n = 3 | AHAH
n = 4 | HAAHHAAH
n = 5 | AHAHHA
n = 6 | HAAHHAAHHA
n = 7 | AHAHHAAHHA
n = 8 | HAAHHAAHHAAHHA
n = 9 | AHAHHAAHAHHA

Here is a very simple C++ version:
#include <iostream>
#include <sstream>
using namespace std;
#define LINES 10
#define put(t) s << t; cout << t
#define r1(o,a,c0) \
if(c[0]==c0) {put(o); s.unget(); s.unget(); a; continue;}
#define r2(o,a,c0,c1) \
if(c[0]==c0 && c[1]==c1) {put(o); s.unget(); a; continue;}
#define r3(o,a,c0,c1,c2) \
if(c[0]==c0 && c[1]==c1 && c[2]==c2) {put(o); a; continue;}
int main() {
char c[3];
stringstream s;
put("H\n\n");
for(int i=2;i<LINES*2;) {
s.read(c,3);
r3("AH",,'H','A','A');
r3("HA",,'A','A','H');
r2("AH",,'H','H');
r2("HA",,'A','A');
r1("HA",,'A');
r1("AH",,'H');
r1("\n",i++,'\n');
}
}
It's not exactly code-golf (it could be made a lot shorter), but it works. Change LINES to however many lines you want printed (note: it will not work for 0). It will print output like this:
H
AH
HAAH
AHAH
HAAHHAAH
AHAHHA
HAAHHAAHHA
AHAHHAAHHA
HAAHHAAHHAAHHA
AHAHHAAHAHHA

ANSI C99
Coming in at a brutal 306 characters:
#include <stdio.h>
#include <string.h>
char s[99]="H",t[99]={0};int main(){for(int n=0;n<10;n++){int i=0,j=strlen(s);printf("n = %u | %s\n",n,s);strcpy(t,s);s[0]=0;for(;i<j;){if(t[i++]=='H'){t[i]=='H'?i++:t[i+1]=='A'?i+=2:1;strcat(s,"AH");}else{t[i]=='A'?i+=1+(t[i+1]=='H'):1;strcat(s,"HA");}}}return 0;}
There are too many nested ifs and conditional operators for me to effectively reduce this with macros. Believe me, I tried. Readable version:
#include <stdio.h>
#include <string.h>
char s[99] = "H", t[99] = {0};
int main()
{
for(int n = 0; n < 10; n++)
{
int i = 0, j = strlen(s);
printf("n = %u | %s\n", n, s);
strcpy(t, s);
s[0] = 0;
/*
* This was originally just a while() loop.
* I tried to make it shorter by making it a for() loop.
* I failed.
* I kept the for() loop because it looked uglier than a while() loop.
* This is code golf.
*/
for(;i<j;)
{
if(t[i++] == 'H' )
{
// t[i] == 'H' ? i++ : t[i+1] == 'A' ? i+=2 : 1;
// Oh, ternary ?:, how do I love thee?
if(t[i] == 'H')
i++;
else if(t[i+1] == 'A')
i+= 2;
strcat(s, "AH");
}
else
{
// t[i] == 'A' ? i += 1 + (t[i + 1] == 'H') : 1;
if(t[i] == 'A')
if(t[++i] == 'H')
i++;
strcat(s, "HA");
}
}
}
return 0;
}
I may be able to make a shorter version with strncmp() in the future, but who knows? We'll see what happens.

In python:
def l(s):
H=['HAA','HH','H','AAH','AA','A']
L=['AH']*3+['HA']*3
for i in [3,2,1]:
if s[:i] in H: return L[H.index(s[:i])]+l(s[i:])
return s
def a(n,s='H'):
return s*(n<1)or a(n-1,l(s))
for i in xrange(0,10):
print '%d: %s'%(i,a(i))
First attempt: 198 char of code, I'm sure it can get smaller :D

REBOL, 150 characters. Unfortunately REBOL is not a language conducive to code golf, but 150 characters ain't too shabby, as Adam Sandler says.
This assumes the loop variable m has already been defined.
s: "H" r: "" z:[some[["HAA"|"HH"|"H"](append r "AH")|["AAH"|"AA"|"A"](append r "HA")]to end]repeat n m[clear r parse s z print["n =" n "|" s: copy r]]
And here it is with better layout:
s: "H"
r: ""
z: [
some [
[ "HAA" | "HH" | "H" ] (append r "AH")
| [ "AAH" | "AA" | "A" ] (append r "HA")
]
to end
]
repeat n m [
clear r
parse s z
print ["n =" n "|" s: copy r]
]

F#: 184 chars
Seems to map pretty cleanly to F#:
type grammar = H | A
let rec laugh = function
| 0,l -> l
| n,l ->
let rec loop = function
|H::A::A::x|H::H::x|H::x->A::H::loop x
|A::A::H::x|A::A::x|A::x->H::A::loop x
|x->x
laugh(n-1,loop l)
Here's a run in fsi:
> [for a in 0 .. 9 -> a, laugh(a, [H])] |> Seq.iter (fun (a, b) -> printfn "n = %i: %A" a b);;
n = 0: [H]
n = 1: [A; H]
n = 2: [H; A; A; H]
n = 3: [A; H; A; H]
n = 4: [H; A; A; H; H; A; A; H]
n = 5: [A; H; A; H; H; A]
n = 6: [H; A; A; H; H; A; A; H; H; A]
n = 7: [A; H; A; H; H; A; A; H; H; A]
n = 8: [H; A; A; H; H; A; A; H; H; A; A; H; H; A]
n = 9: [A; H; A; H; H; A; A; H; A; H; H; A]


decoding base64 encoded text with POSIX awk

In a bash script that I'm writing for Linux/Solaris I need to decode more than a hundred thousand base64-encoded text strings, and, because I don't wanna massively fork a non-portable base64 binary from awk, I wrote a function that does the decoding.
Here's the code of my base64_decode function:
function base64_decode(str, out,i,n,v) {
out = ""
if ( ! ("A" in _BASE64_DECODE_c2i) )
for (i = 1; i <= 64; i++)
_BASE64_DECODE_c2i[substr("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/",i,1)] = i-1
i = 0
n = length(str)
while (i <= n) {
v = _BASE64_DECODE_c2i[substr(str,++i,1)] * 262144 + \
_BASE64_DECODE_c2i[substr(str,++i,1)] * 4096 + \
_BASE64_DECODE_c2i[substr(str,++i,1)] * 64 + \
_BASE64_DECODE_c2i[substr(str,++i,1)]
out = out sprintf("%c%c%c", int(v/65536), int(v/256), v)
}
return out
}
Which works fine:
printf '%s\n' SmFuZQ== amRvZQ== |
LANG=C command -p awk '
{ print base64_decode($0) }
function base64_decode(...) {...}
'
Jane
jdoe
SIMPLIFIED REAL-LIFE EXAMPLE THAT DOESN'T WORK AS EXPECTED
I want to get the givenName of the users that are members of GroupCode = 025496 from the output of ldapsearch -LLL -o ldif-wrap=no ... '(|(uid=*)(GroupCode=*))' uid givenName sn GroupCode memberUid:
dn: uid=jsmith,ou=users,dc=example,dc=com
givenName: John
sn: SMITH
uid: jsmith
dn: uid=jdoe,ou=users,dc=example,dc=com
uid: jdoe
givenName:: SmFuZQ==
sn:: RE9F
dn: cn=group1,ou=groups,dc=example,dc=com
GroupCode: 025496
memberUid:: amRvZQ==
memberUid: jsmith
Here would be an awk for doing so:
LANG=C command -p awk -F '\n' -v RS='' -v GroupCode=025496 '
{
delete attrs
for (i = 2; i <= NF; i++) {
match($i,/::? /)
key = substr($i,1,RSTART-1)
val = substr($i,RSTART+RLENGTH)
if (RLENGTH == 3)
val = base64_decode(val)
attrs[key] = ((key in attrs) ? attrs[key] SUBSEP val : val)
}
if ( /\nuid:/ )
givenName[ attrs["uid"] ] = attrs["givenName"]
else
memberUid[ attrs["GroupCode"] ] = attrs["memberUid"]
}
END {
n = split(memberUid[GroupCode],uid,SUBSEP)
for ( i = 1; i <= n; i++ )
print givenName[ uid[i] ]
}
function base64_decode(...) { ... }
'
On BSD and Solaris the result is:
Jane
John
While on Linux it is:
John
I don't know where the issue might be; is there something wrong with the base64_decode function and/or the code that uses it?
Your function generates NUL bytes when its argument (encoded string) ends with padding characters (=s). Below is a corrected version of your while loop:
while (i < n) {
v = _BASE64_DECODE_c2i[substr(str,1+i,1)] * 262144 + \
_BASE64_DECODE_c2i[substr(str,2+i,1)] * 4096 + \
_BASE64_DECODE_c2i[substr(str,3+i,1)] * 64 + \
_BASE64_DECODE_c2i[substr(str,4+i,1)]
i += 4
if (v%256 != 0)
out = out sprintf("%c%c%c", int(v/65536), int(v/256), v)
else if (int(v/256)%256 != 0)
out = out sprintf("%c%c", int(v/65536), int(v/256))
else
out = out sprintf("%c", int(v/65536))
}
Note that if the decoded bytes contain an embedded NUL, this approach may not work properly.
The problem is within the base64_decode function, which outputs some junk characters on GNU awk.
You can use this awk code that uses system provided base64 utility as an alternative:
{
delete attrs
for (i = 2; i <= NF; i++) {
match($i,/::? /)
key = substr($i,1,RSTART-1)
val = substr($i,RSTART+RLENGTH)
if (RLENGTH == 3) {
cmd = "echo " val " | base64 -di"
cmd | getline val # should also check exit code here
}
attrs[key] = ((key in attrs) ? attrs[key] SUBSEP val : val)
}
if ( /\nuid:/ )
givenName[ attrs["uid"] ] = attrs["givenName"]
else
memberUid[ attrs["GroupCode"] ] = attrs["memberUid"]
}
END {
n = split(memberUid[GroupCode],uid,SUBSEP)
for ( i = 1; i <= n; i++ )
print givenName[ uid[i] ]
}
I have tested this on gnu and BSD awk versions and I am getting expected output in all the cases.
If you cannot use external base64 utility then I suggest you take a look here for awk version of base64 decode.
This answer is for reference
Here's a working base64_decode function (thanks @MNejatAydin for pointing out the issue(s) in the original one):
function base64_decode(str, out,bits,n,i,c1,c2,c3,c4) {
out = ""
# One-time initialization during the first execution
if ( ! ("A" in _BASE64) )
for (i = 1; i <= 64; i++)
# The "_BASE64" array associates a character to its base64 index
_BASE64[substr("ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/",i,1)] = i-1
# Decoding the input string
n = length(str)
i = 0
while ( i < n ) {
c1 = substr(str, ++i, 1)
c2 = substr(str, ++i, 1)
c3 = substr(str, ++i, 1)
c4 = substr(str, ++i, 1)
bits = _BASE64[c1] * 262144 + _BASE64[c2] * 4096 + _BASE64[c3] * 64 + _BASE64[c4]
if ( c4 != "=" )
out = out sprintf("%c%c%c", bits/65536, bits/256, bits)
else if ( c3 != "=" )
out = out sprintf("%c%c", bits/65536, bits/256)
else
out = out sprintf("%c", bits/65536)
}
return out
}
WARNING: the function requires LANG=C
It also doesn't check that the input is a valid base64 string; for that you can add a simple condition like:
match( str, "^([a-zA-Z/-9+]{4})*([a-zA-Z/-9+]{2}[a-zA-Z/-9+=]{2})?$" )
Interestingly, the code is 2x faster than base64decode.awk, but it's only 3x faster than forking the base64 binary from inside awk.
notes:
In a base64-encoded string, 4 characters represent 3 bytes of data; the input has to be processed in groups of 4 characters.
Multiplying and dividing an integer by a power of two is equivalent to doing bitwise left and right shift operations.
262144 is 2^18, so N * 262144 is equivalent to N << 18
4096 is 2^12, so N * 4096 is equivalent to N << 12
64 is 2^6, so N * 64 is equivalent to N << 6
65536 is 2^16, so N / 65536 (integer division) is equivalent to N >> 16
256 is 2^8, so N / 256 (integer division) is equivalent to N >> 8
What happens in printf "%c", N:
N is first converted to an integer (if need be) and then, WITH LANG=C, the 8 least significant bits are taken in for the %c formatting.
How the possible padding of one or two trailing = characters at the end of the encoded string is handled:
If the 4th char isn't = (i.e. there's no padding) then the result should be 3 bytes of data.
If the 4th char is = and the 3rd char isn't = then there are 2 bytes of data to decode.
If the fourth char is = and the third char is = then there's only one byte of data.
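The arithmetic and padding rules in the notes above can be sketched in Python. This is only an illustration of the same technique, not a translation of the awk function: the names `ALPHABET`, `INDEX` and `base64_decode` are chosen here, and the sketch assumes the input is valid, padded base64 whose length is a multiple of 4.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}

def base64_decode(text):
    """Decode padded base64 using multiplication/division instead of shifts."""
    out = bytearray()
    for pos in range(0, len(text), 4):
        # Assumption: len(text) is a multiple of 4, otherwise this unpacking fails.
        c1, c2, c3, c4 = text[pos:pos + 4]
        # "=" is not in the table; count it as 0, like an unset awk array element.
        bits = (INDEX.get(c1, 0) * 262144 + INDEX.get(c2, 0) * 4096
                + INDEX.get(c3, 0) * 64 + INDEX.get(c4, 0))
        out.append(bits // 65536 % 256)      # first byte is always present
        if c3 != "=":
            out.append(bits // 256 % 256)    # second byte unless "xx=="
        if c4 != "=":
            out.append(bits % 256)           # third byte only without padding
    return bytes(out)
```

With the sample values from the question, `base64_decode("SmFuZQ==")` yields `b"Jane"` and `base64_decode("RE9F")` yields `b"DOE"`; the padded groups produce no trailing NUL bytes because the padded positions are never emitted.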

AWK Convert Decimal to Binary

I want to use AWK to convert a list of decimal numbers in a file to binary but there seems to be no built-in method. Sample file is as below:
134218506
134218250
134217984
1610612736
16384
33554432
Here is an awk way, functionized for your pleasure:
awk '
function d2b(d, b) {
while(d) {
b=d%2 b
d=int(d/2)
}
return(b)
}
{
print d2b($0)
}' file
Output of the first three records:
1000000000000000001100001010
1000000000000000001000001010
1000000000000000000100000000
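For comparison, the same repeated-division idea as a Python sketch. The function name `d2b` mirrors the awk one; returning "0" for an input of 0 is a choice made here, whereas the awk version returns an empty string in that case.

```python
def d2b(d):
    """Binary digits of a non-negative integer by repeated division by 2."""
    b = ""
    while d:
        b = str(d % 2) + b   # prepend the current least significant bit
        d //= 2              # integer division drops that bit
    return b or "0"
```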
You can try Perl one-liner
$ cat hamdani.txt
134218506
134218250
134217984
134217984
1610612736
16384
33554432
$ perl -nle ' printf("%b\n",$_) ' hamdani.txt
1000000000000000001100001010
1000000000000000001000001010
1000000000000000000100000000
1000000000000000000100000000
1100000000000000000000000000000
100000000000000
10000000000000000000000000
$
You can try with dc :
# -f infile : Use infile for data
# after -e come the dc commands
dc -f infile -e '
z # number of values
sa # keep in register a
2
o # set the output radix to 2 : binary
[
Sb # keep all the value of infile in the register b
# ( b is use here as a stack)
z
0 <M # until there is no more value
] sM # define macro M in [ and ]
lMx # execute macro M to populate stack b
[
Lb # get all values one at a time from stack b
p # print this value in binary
la # get the number of value
1
- # decrement it
d # duplicate
sa # keep one in register a
0<N # the other is use here
]sN # define macro N
lNx' # execute macro N to print each values in binary
Here's an approach that works by first converting the decimal to hex and then converting each hex character to its binary equivalent:
$ cat dec2bin.awk
BEGIN {
h2b["0"] = "0000"; h2b["8"] = "1000"
h2b["1"] = "0001"; h2b["9"] = "1001"
h2b["2"] = "0010"; h2b["a"] = "1010"
h2b["3"] = "0011"; h2b["b"] = "1011"
h2b["4"] = "0100"; h2b["c"] = "1100"
h2b["5"] = "0101"; h2b["d"] = "1101"
h2b["6"] = "0110"; h2b["e"] = "1110"
h2b["7"] = "0111"; h2b["f"] = "1111"
}
{ print dec2bin($0) }
function hex2bin(hex, n,i,bin) {
n = length(hex)
for (i=1; i<=n; i++) {
bin = bin h2b[substr(hex,i,1)]
}
sub(/^0+/,"",bin)
return bin
}
function dec2bin(dec, hex, bin) {
hex = sprintf("%x", dec)
bin = hex2bin(hex)
return bin
}
$ awk -f dec2bin.awk file
1000000000000000001100001010
1000000000000000001000001010
1000000000000000000100000000
1100000000000000000000000000000
100000000000000
10000000000000000000000000
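The decimal-to-hex-to-binary idea can also be sketched in Python. The names `H2B` and `dec2bin` are illustrative; the lookup table is generated rather than written out, and an input of 0 is mapped to "0" as an assumption of this sketch.

```python
# Map each hex digit to its 4-bit binary string, e.g. "a" -> "1010".
H2B = {format(v, "x"): format(v, "04b") for v in range(16)}

def dec2bin(dec):
    """Convert a non-negative integer to binary via its hex representation."""
    hex_digits = format(dec, "x")
    bits = "".join(H2B[h] for h in hex_digits)
    # Strip the leading zeros contributed by the first nibble.
    return bits.lstrip("0") or "0"
```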
# gawk binary number functions
# RPC 09OCT2022
# convert an 8 bit binary number to an integer
function bin_to_n(i)
{
n = 0;
#printf(">> %s:", i);
for (k = 1; k < 9; k++) {
n = n * 2;
b = substr(i, k, 1);
if (b == "1") {
n = n + 1;
}
}
return (n);
}
# convert a number to a binary number
function dectobin(n)
{
printf("dectobin: n in %d ",n);
binstring = "0b"; # some c compilers allow 0bXXXXXXXX format numbers
bn = 128;
for(k=0;k<8;k++) {
if (n >= bn) {
binstring = binstring "1";
n = n - bn;
} else {
binstring = binstring "0"
}
printf(" bn %d",bn);
bn = bn / 2;
}
return binstring;
}
BEGIN {
FS = " ";
# gawk (I think) has no atoi() function or equiv. So a table of all
# chars (well 256 ascii) can be used with the index function to get
# round this
for (i = 0; i < 255; i++) {
table = sprintf("%s%c", table, i);
}
}
{
# assume on stdin a buffer of 8 bit binary numbers "01000001 01000010" is AB etc
for (i = 1; i <= NF; i++)
printf("bin-num#%d: %x --> %c\n", i, bin_to_n($i), bin_to_n($i));
s = "ABC123string to test";
for (i = 0; i < length(s); i++) {
nn = index(table, substr(s,i+1,1))-1;
printf("substr :%s:%x:", substr(s,i+1,1), nn);
printf(" :%d: %s\n", i, dectobin(nn));
}
}
On top of what others have already mentioned, this function has a fast shortcut for non-negative integer powers of 2 (since they always have a binary pattern of /^[1][0]*$/).
version 1 : processing in 3-bit chunks instead of bit-by-bit :
{m,g}awk '
BEGIN {
1 CONVFMT="%.250g"
1 _^=OFMT="%.25g"
}
($++NF=________v1($_))^!_
function ________v1(__,___,_,____,_____)
{
6 if (+__==(_+=_^=____="")^(___=log(__)/log(_))) { # 2
2 return \
___<=_^_^_ \
? (_+_*_*_)^___ \
: sprintf("%.f%0*.f",--_,___,--_)
}
4 ___=(!_!_!_!!_) (_^((_____=_*_*_)+_)-_^_^_+(++_))
4 gsub("..", "&0&1", ___)
41 while(__) {
41 ____ = substr(___,
__%_____*_+(__=int(__/_____))^!_,_)____
}
4 return substr(__=____, index(__, _^(! _)))
}'
version 2 : first use sprintf() to convert to octals, before mapping to binary
function ________v2(__,___,_,____,_____)
{
6 if (+__==(_+=_^=____="")^(___=log(__)/log(_))) { # 2
2 return \
___<=_^_^_ \
? (_+_*_*_)^___ \
: sprintf("%.f%0*.f",--_,___,--_)
}
4 ___=(!_!_!_!!_) (_^((_____=_*_*_)+_)-_^_^_+(++_))
4 gsub("..", "&0&1", ___)
4 _____=___
4 __=sprintf("%o%.*o", int(__/(___=++_^(_*--_+_))),
_*_+!!_, __%___)
4 sub("^[0]+", "", __)
41 for (___=length(__); ___; ___--) {
41 ____ = substr(_____, substr(__,
___,!!_)*_ + !!_,_)____
}
4 return substr(____, index(____,!!_))
}
|
134218506 1000000000000000001100001010
134218250 1000000000000000001000001010
134217984 1000000000000000000100000000
1610612736 1100000000000000000000000000000
16384 100000000000000
33554432 10000000000000000000000000
version 3 : reasonably zippy (29.5 MB/s throughput on mawk2) version by using a caching array and processing 8-bits each round
outputs are zero-padded to a minimum of 8 binary digits wide
.
{m,g,n}awk '
1 function ________(_______,_, __,____,______)
{
1 split(_=__=____=______="", _______, _)
2 for (_^=_<_; -_<=+_; _--) {
4 for (__^=_<_; -__<=+__; __--) {
8 for (____^=_<_; -____<=+____; ____--) {
16 for (______^=_<_; -______<=+______; ______--) {
16 _______[_+_+_+_+_+_+_+_+__+__+\
__+__+____+____+______]=\
(_)__ (____)______
}
}
}
}
1 return _^(_<_)
}
BEGIN {
1 CONVFMT = "%." ((_+=(_^=_<_)+(_+=_))*_)(!_)"g"
1 OFMT = "%." (_*_) "g"
1 _ = ________(_____)
}
($++NF=___($_))^!_
function ___(__,____,_,______)
{
6 if ((__=int(__))<(______=\
(_*=_+=_+=_^=____="")*_)) {
return _____[int(__/_)]_____[__%_]
}
16 do { ____=_____[int(__/_)%_]_____[__%_]____
} while (______<=(__=int(__/______)))
6 return int(_____[int(__/_)%_]\
_____[ (__) %_])____
}
You should not use awk for this but bc:
$ bc <<EOF
ibase=10
obase=2
$(cat file)
EOF
or
bc <<< "$(awk 'BEGIN{ print "ibase=10; obase=2"}1' file)"

Why do Perl 6 state variables behave differently for different files?

I have 2 test files. In one file, I want to extract the middle section using a state variable as a switch, and in the other file, I want to use a state variable to hold the sum of numbers seen.
File one:
section 0; state 0; not needed
= start section 1 =
state 1; needed
= end section 1 =
section 2; state 2; not needed
File two:
1
2
3
4
5
Code to process file one:
cat file1 | perl6 -ne 'state $x = 0; say " x is ", $x; if $_ ~~ m/ start / { $x = 1; }; .say if $x == 1; if $_ ~~ m/ end / { $x = 2; }'
and the result is with errors:
x is (Any)
Use of uninitialized value of type Any in numeric context
in block at -e line 1
x is (Any)
= start section 1 =
x is 1
state 1; needed
x is 1
= end section 1 =
x is 2
x is 2
And the code to process file two is
cat file2 | perl6 -ne 'state $x=0; if $_ ~~ m/ \d+ / { $x += $/.Str; } ; say $x; '
and the results are as expected:
1
3
6
10
15
What makes the state variable fail to initialize in the first code, but work in the second?
I found that in the first code, if I make the state variable do something, such as addition, then it works. Why so?
cat file1 | perl6 -ne 'state $x += 0; say " x is ", $x; if $_ ~~ m/ start / { $x = 1; }; .say if $x == 1; if $_ ~~ m/ end / { $x = 2; }'
# here, $x += 0 instead of $x = 0; and the results have no errors:
x is 0
x is 0
= start section 1 =
x is 1
state 1; needed
x is 1
= end section 1 =
x is 2
x is 2
Thanks for any help.
This was answered in smls's comment:
Looks like a Rakudo bug. Simpler test-case:
echo Hello | perl6 -ne 'state $x = 42; dd $x'.
It seems that top-level state variables are
not initialized when the -n or -p switch is used. As a work-around, you can manually initialize the variable in a separate statement, using the //= (assign if undefined) operator:
state $x; $x //= 42;

SPIN: interpret the error trace

I am trying to solve the farmer, wolf, goat and cabbage puzzle with Spin.
I found the following Promela description:
#define fin (all_right_side == true)
#define wg (g_and_w == false)
#define gc (g_and_c == false)
ltl ltl_0 { <> fin && [] ( wg && gc ) }
bool all_right_side, g_and_w, g_and_c;
active proctype river()
{
bit f = 0,
w = 0,
g = 0,
c = 0;
all_right_side = false;
g_and_w = false;
g_and_c = false;
printf("MSC: f %c w %c g %c c %c \n", f, w, g, c);
do
:: (f==1) && (f == w) && (f ==g) && (f == c) ->
all_right_side = true;
break;
:: else ->
if
:: (f == w) ->
f = 1 - f;
w = 1 - w;
:: (f == c) ->
f = 1 - f;
w = 1 - c;
:: (f == g) ->
f = 1 - f;
w = 1 - g;
:: (true) ->
f = 1 - f;
fi;
printf("M f %c w %c g %c c %c \n", f, w, g, c);
if
:: (f != g && g == c) ->
g_and_c = true;
:: (f != g && g == w) ->
g_and_w = true;
::else ->
skip
fi
od;
printf ("MSC: OK!\n")
}
I added an LTL formula there: ltl ltl_0 { <> fin && [] ( wg && gc ) }
to verify that the wolf won't eat the goat and the goat won't eat the cabbage. I want to get an example of how the farmer can transport all his goods (w-g-c) without loss.
When I run verification, I get the following result:
State-vector 20 byte, depth reached 59, errors: 1
64 states, stored
23 states, matched
87 transitions (= stored+matched)
0 atomic steps
hash conflicts: 0 (resolved)
This means that the program has generated an example for me. But I cannot interpret it.
The content of the *.pml.trail file is: (posted as an image, not reproduced here)
Please, help me to interpret.
There are a few ways you can go about interpreting the trace.
Use iSpin:
go to Simulate/Play
in Mode, select Guided and enter the name of your trail file
Run
This will show, step by step, the actions taken by each of the processes, including info such as process number, proctype name, line number of instruction executed, code of instruction executed.
Do the same with spin:
Use the command
spin -t -p xyz.pml
Understand the trail file syntax:
each line on the file is one step taken by the simulator.
the first column is just serial numbers.
The second column is process numbers (pids). (eg init will be 0, the first process it starts/runs will be 1 and so on.)
The third column is transition number. If you want to get just an idea of what is happening, you can look at the pids and go over the instructions
In order to "interpret" it, you could modify your source code so that each time an action is taken something intelligible is printed on stdout.
e.g.:
:: (f == w) ->
if
:: f == 0 -> printf("LEFT ---[farmer, wolf]--> RIGHT\n");
:: f == 1 -> printf("LEFT <--[farmer, wolf]--- RIGHT\n");
:: else -> skip;
fi;
f = 1 - f;
w = 1 - w;
+ something similar for the cases (f == c), (f == g) and (true).
Note: your source code already provides printf("M f %c w %c g %c c %c \n", f, w, g, c);, which can be used to interpret the counter-example if you keep in mind that 0 means left and 1 means right. I would prefer a more verbose tracing, though.
After you have done this for each possible transition, you can see what happens within your counter-example by running spin in the following way
~$ spin -t file_name.pml
The option -t replays the latest trail found by spin upon the violation of some assertion/property.

What is the best way to add two numbers without using the + operator?

A friend and I are going back and forth with brain-teasers and I have no idea how to solve this one. My assumption is that it's possible with some bitwise operators, but not sure.
In C, with bitwise operators:
#include<stdio.h>
int add(int x, int y) {
int a, b;
do {
a = x & y;
b = x ^ y;
x = a << 1;
y = b;
} while (a);
return b;
}
int main( void ){
printf( "2 + 3 = %d", add(2,3));
return 0;
}
XOR (x ^ y) is addition without carry. (x & y) is the carry-out from each bit. (x & y) << 1 is the carry-in to each bit.
The loop keeps adding the carries until the carry is zero for all bits.
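To make the carry propagation concrete, here is a small Python sketch that records the partial sum and the carry after every round. The name `add_trace` and the 32-bit `MASK` are assumptions of this sketch (Python integers are unbounded, so the mask stands in for a fixed-width register).

```python
MASK = 0xFFFFFFFF  # assumption: emulate 32-bit unsigned registers

def add_trace(x, y):
    """Return the (partial_sum, carry) pair after each XOR / AND-shift round."""
    x &= MASK
    y &= MASK
    rounds = []
    while True:
        carry = ((x & y) << 1) & MASK   # carry-in for the next round
        partial = x ^ y                 # addition without carry
        rounds.append((partial, carry))
        if carry == 0:
            return rounds
        x, y = partial, carry
```

For 2 and 3 the trace is `[(1, 4), (5, 0)]`: the first round leaves a partial sum of 1 with a carry of 4, and the second round folds the carry in to give 5.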
int add(int a, int b) {
const char *c=0;
return &(&c[a])[b];
}
No + right?
int add(int a, int b)
{
return -(-a) - (-b);
}
CMS's add() function is beautiful. It should not be sullied by unary negation (a non-bitwise operation, tantamount to using addition: -y==(~y)+1). So here's a subtraction function using the same bitwise-only design:
int sub(int x, int y) {
unsigned a, b;
do {
a = ~x & y;
b = x ^ y;
x = b;
y = a << 1;
} while (a);
return b;
}
Define "best". Here's a python version:
len(range(x)+range(y))
The + performs list concatenation, not addition.
Java solution with bitwise operators:
// Recursive solution
public static int addR(int x, int y) {
if (y == 0) return x;
int sum = x ^ y; //SUM of two integer is X XOR Y
int carry = (x & y) << 1; //CARRY of two integer is X AND Y
return addR(sum, carry);
}
//Iterative solution
public static int addI(int x, int y) {
while (y != 0) {
int carry = (x & y); //CARRY is AND of two bits
x = x ^ y; //SUM of two bits is X XOR Y
y = carry << 1; //shifts carry to 1 bit to calculate sum
}
return x;
}
Cheat. You could negate the number and subtract it from the first :)
Failing that, look up how a binary adder works. :)
EDIT: Ah, saw your comment after I posted.
Details of binary addition are here.
Note, this would be for an adder known as a ripple-carry adder, which works, but does not perform optimally. Most binary adders built into hardware are a form of fast adder such as a carry-look-ahead adder.
My ripple-carry adder works for both unsigned and 2's complement integers if you set carry_in to 0, and 1's complement integers if carry_in is set to 1. I also added flags to show underflow or overflow on the addition.
#define BIT_LEN 32
#define ADD_OK 0
#define ADD_UNDERFLOW 1
#define ADD_OVERFLOW 2
int ripple_add(int a, int b, char carry_in, char* flags) {
int result = 0;
int current_bit_position = 0;
char a_bit = 0, b_bit = 0, result_bit = 0;
while ((a || b) && current_bit_position < BIT_LEN) {
a_bit = a & 1;
b_bit = b & 1;
result_bit = (a_bit ^ b_bit ^ carry_in);
result |= result_bit << current_bit_position++;
carry_in = (a_bit & b_bit) | (a_bit & carry_in) | (b_bit & carry_in);
a >>= 1;
b >>= 1;
}
if (current_bit_position < BIT_LEN) {
*flags = ADD_OK;
}
else if (a_bit & b_bit & ~result_bit) {
*flags = ADD_UNDERFLOW;
}
else if (~a_bit & ~b_bit & result_bit) {
*flags = ADD_OVERFLOW;
}
else {
*flags = ADD_OK;
}
return result;
}
Go based solution
func add(a int, b int) int {
for {
carry := (a & b) << 1
a = a ^ b
b = carry
if b == 0 {
break
}
}
return a
}
The same solution can be implemented in Python as follows, with one caveat about how Python represents numbers: Python integers are not limited to 32 bits, so we use a mask to keep only the low 32 bits.
For example, without the mask the loop never terminates for the inputs (-1, 1).
def add(a, b):
    mask = 0xffffffff
    while b & mask:
        carry = a & b
        a = a ^ b
        b = carry << 1
    return (a & mask)
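The masked version above returns the low 32 bits as an unsigned value, so a negative sum comes back as a large positive number. A minimal sketch of one way to map the result back to a signed integer, assuming 32-bit two's complement and using the hypothetical names `SIGN` and `add_signed32`:

```python
MASK = 0xFFFFFFFF   # assumption: 32-bit word
SIGN = 0x80000000   # sign bit of that word


def add_signed32(a, b):
    """Add two integers with bitwise operators, wrapping like a 32-bit int."""
    a &= MASK
    b &= MASK
    while b:
        carry = (a & b) << 1
        a ^= b
        b = carry & MASK
    # If the sign bit is set, flip the low 32 bits and invert: ~(a ^ MASK)
    # yields the negative Python integer with the same 32-bit pattern.
    return ~(a ^ MASK) if a & SIGN else a


print(add_signed32(-1, 1))   # prints 0
print(add_signed32(-3, -4))  # prints -7
```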
Why not just increment the first number as many times as the value of the second number?
The reason ADD is implemented in assembler as a single instruction, rather than as some combination of bitwise operations, is that it is hard to do. You have to worry about the carries from a given low-order bit to the next higher-order bit. This is something machines do fast in hardware, but that you cannot do fast in software, even in C.
Here's a portable one-line ternary and recursive solution.
int add(int x, int y) {
return y == 0 ? x : add(x ^ y, (x & y) << 1);
}
I saw this as problem 18.1 in the coding interview.
My python solution:
def foo(a, b):
    """iterate through a and b, count iteration via a list, check len"""
    x = []
    for i in range(a):
        x.append(a)
    for i in range(b):
        x.append(b)
    print(len(x))
This method uses iteration, so the time complexity isn't optimal.
I believe the best way is to work at a lower level with bitwise operations.
In python using bitwise operators:
def sum_no_arithmetic_operators(x, y):
    while True:
        carry = x & y
        x = x ^ y
        y = carry << 1
        if y == 0:
            break
    return x
Adding two integers is not that difficult; there are many examples of binary addition online.
A more challenging problem is floating point numbers! There's an example at http://pages.cs.wisc.edu/~smoler/x86text/lect.notes/arith.flpt.html
Was working on this problem myself in C# and couldn't get all test cases to pass. I then ran across this.
Here is an implementation in C# 6:
public int Sum(int a, int b) => b != 0 ? Sum(a ^ b, (a & b) << 1) : a;
Implemented in same way as we might do binary addition on paper.
int add(int x, int y)
{
int t1_set, t2_set;
int carry = 0;
int result = 0;
int mask = 0x1;
while (mask != 0) {
t1_set = x & mask;
t2_set = y & mask;
if (carry) {
if (!t1_set && !t2_set) {
carry = 0;
result |= mask;
} else if (t1_set && t2_set) {
result |= mask;
}
} else {
if ((t1_set && !t2_set) || (!t1_set && t2_set)) {
result |= mask;
} else if (t1_set && t2_set) {
carry = 1;
}
}
mask <<= 1;
}
return (result);
}
A version improved for speed:
int add_better (int x, int y)
{
int b1_set, b2_set;
int mask = 0x1;
int result = 0;
int carry = 0;
while (mask != 0) {
b1_set = x & mask ? 1 : 0;
b2_set = y & mask ? 1 : 0;
if ( (b1_set ^ b2_set) ^ carry)
result |= mask;
carry = (b1_set & b2_set) | (b1_set & carry) | (b2_set & carry);
mask <<= 1;
}
return (result);
}
Here is my implementation in Python. It works well when the number of bytes (or bits) is known in advance.
def summ(a, b):
    # for 4 bytes (or 4*8 bits)
    max_num = 0xFFFFFFFF
    while a != 0:
        a, b = ((a & b) << 1), (a ^ b)
        if a > max_num:
            b = (b & max_num)
            break
    return b
You can do it using bit-shifting and the AND operation.
#include <stdio.h>
int main()
{
unsigned int x = 3, y = 1, sum, carry;
sum = x ^ y; // Ex - OR x and y
carry = x & y; // AND x and y
while (carry != 0) {
carry = carry << 1; // left shift the carry
x = sum; // initialize x as sum
y = carry; // initialize y as carry
sum = x ^ y; // sum is calculated
carry = x & y; /* carry is calculated, the loop condition is
evaluated and the process is repeated until
carry is equal to 0.
*/
}
printf("%d\n", sum); // the program will print 4
return 0;
}
The most voted answer will not work if the inputs are of opposite sign. The following however will. I have cheated at one place, but only to keep the code a bit clean. Any suggestions for improvement welcome
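def add(x, y):
    # Same-sign inputs go straight to the carry loop;
    # mixed signs are handled as a subtraction of magnitudes.
    if (x >= 0 and y >= 0) or (x < 0 and y < 0):
        return _add(x, y)
    else:
        return __add(x, y)

def _add(x, y):
    if y == 0:
        return x
    else:
        return _add((x ^ y), ((x & y) << 1))

def __add(x, y):
    if x < 0 <= y:              # <= so that add(-5, 0) is handled too
        x = _add(~x, 1)         # magnitude of x via two's complement negation
        if x > y:
            diff = -sub(x, y)   # the one place where unary minus is used
        else:
            diff = sub(y, x)
        return diff
    elif y < 0 <= x:            # <= so that add(0, -5) is handled too
        y = _add(~y, 1)         # magnitude of y
        if y > x:
            diff = -sub(y, x)
        else:
            diff = sub(x, y)    # x >= y here, so subtract y from x
        return diff
    else:
        raise ValueError("Invalid Input")

def sub(x, y):
    if y > x:
        raise ValueError('y must be less than x')
    while y > 0:
        b = ~x & y              # borrow bits
        x ^= y
        y = b << 1
    return x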
Here is the solution in C++, you can find it on my github here: https://github.com/CrispenGari/Add-Without-Integers-without-operators/blob/master/main.cpp
int add(int a, int b){
while(b!=0){
int sum = a^b; // add without carrying
int carry = (a&b)<<1; // carrying without adding
a= sum;
b= carry;
}
return a;
}
// The function can also be written recursively:
int add(int a, int b){
if(b==0){
return a; // any number plus 0 = that number simple!
}
int sum = a ^ b;// adding without carrying;
int carry = (a & b)<<1; // carry, without adding
return add(sum, carry);
}
This can be done using a half adder.
A half adder computes the sum of two single-bit numbers.
A  B  SUM  CARRY  A & B  A ^ B
0  0   0     0      0      0
0  1   1     0      0      1
1  0   1     0      0      1
1  1   0     1      1      0
We can observe here that SUM = A ^ B and CARRY = A & B.
A carry is always added one position to the left of where it was generated, so we add (CARRY << 1) to SUM and repeat this process until the carry is 0.
int Addition(int A, int B)
{
    if (B == 0)
        return A;
    return Addition(A ^ B, (A & B) << 1);
}
Let's add 7 (0111) and 3 (0011); the answer should be 10 (1010).
Step 1: A = 0100 and B = 0110
Step 2: A = 0010 and B = 1000
Step 3: A = 1010 and B = 0000
B is now 0, so the final answer is A.
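The 7 + 3 walk-through above can be reproduced with a short Python sketch that records A and B after every step. The function name `addition_trace` and the 4-bit display width are assumptions for illustration; the sketch is meant for non-negative inputs only, since Python's unbounded integers would keep the loop running for negative ones.

```python
def addition_trace(a, b, width=4):
    """Repeat SUM = A ^ B, CARRY = (A & B) << 1 until the carry is 0.

    Returns the final sum and the (A, B) bit strings after each step.
    """
    steps = []
    while b != 0:
        a, b = a ^ b, (a & b) << 1
        fmt = "0{}b".format(width)
        steps.append((format(a, fmt), format(b, fmt)))
    return a, steps


total, steps = addition_trace(7, 3)
for a_bits, b_bits in steps:
    print("A =", a_bits, "and B =", b_bits)
print(total)  # prints 10
```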
I implemented this in Swift; I am sure someone will benefit from it.
var a = 3
var b = 5
var sum = 0
var carry = 0
while (b != 0) {
sum = a ^ b
carry = a & b
a = sum
b = carry << 1
}
print (sum)
You can do it iteratively or recursively. Recursive:-
public int getSum(int a, int b) {
return (b==0) ? a : getSum(a^b, (a&b)<<1);
}
Iterative:-
public int getSum(int a, int b) {
int c=0;
while(b!=0) {
c=a&b;
a=a^b;
b=c<<1;
}
return a;
}
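Time complexity: the loop runs once per carry propagation step, so at most 32 iterations for a Java int, which is O(1) for a fixed word size.
Space complexity: O(1) for the iterative version.
For further clarification, refer to the LeetCode or GeeksforGeeks explanations.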
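To get a feel for how many iterations the loop needs, here is a small Python sketch that counts them. It assumes a 32-bit word (Python integers are unbounded, so values are masked), and `add_count` is a hypothetical helper name. The count follows the longest carry chain rather than the size of b: adding 1 to 0x7FFFFFFF takes far more steps than adding 16 to 8.

```python
MASK = 0xFFFFFFFF  # assumption: emulate a 32-bit word


def add_count(a, b):
    """Return (sum modulo 2**32, number of loop iterations)."""
    a &= MASK
    b &= MASK
    steps = 0
    while b:
        a, b = a ^ b, ((a & b) << 1) & MASK
        steps += 1   # counting only, not part of the addition itself
    return a, steps


print(add_count(8, 16))          # prints (24, 1)
print(add_count(0x7FFFFFFF, 1))  # prints (2147483648, 32)
```

Because the carry moves at least one position left on every pass, it is shifted out of a 32-bit word after at most 32 passes.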
I'll interpret this question as forbidding the +,-,* operators but not ++ or -- since the question specified operator and not character (and also because that's more interesting).
A reasonable solution using the increment operator is as follows:
int add(int a, int b) {
if (b == 0)
return a;
if (b > 0)
return add(++a, --b);
else
return add(--a, ++b);
}
This function recursively nudges b towards 0, while giving a the same amount to keep the sum the same.
As an additional challenge, let's get rid of the second if block to avoid a conditional jump. This time we'll need to use some bitwise operators:
int add(int a, int b) {
if(!b)
return a;
int gt = (b > 0);
int m = -1 << (gt << 4) << (gt << 4);
return (++a & --b & 0)
| add( (~m & a--) | (m & --a),
(~m & b++) | (m & ++b)
);
}
The function trace is identical; a and b are nudged between each add call just like before.
However, some bitwise magic is employed to drop the if statement while continuing to not use +,-,*:
A mask m is set to 0x00000000 if b is positive, or 0xFFFFFFFF (-1 in signed decimal) if b is negative.
The mask is shifted left by 16 twice instead of a single shift left by 32 because shifting by an amount >= the width of the value is undefined behavior.
The final return takes a bit of thought to fully appreciate:
Consider this technique to avoid a branch when deciding between two values. Of the values, one is multiplied by the boolean while the other is multiplied by the inverse, and the results are summed like so:
double naiveFoodPrice(int ownPetBool) {
if(ownPetBool)
return 23.75;
else
return 10.50;
}
double conditionlessFoodPrice(int ownPetBool) {
return ownPetBool*23.75 + (!ownPetBool)*10.50;
}
This technique works great in most cases. For us, the addition operator can easily be substituted for the bitwise or | operator without changing the behavior.
The multiplication operator is also not allowed for this problem. This is the reason for our earlier mask value - a bitwise and & with the mask will achieve the same effect as multiplying by the original boolean.
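The mask-based selection described above can be sketched in Python, with `|` standing in for the addition and `&` for the multiplication. The sketch assumes 32-bit unsigned values; `mask_from_flag` and `select` are hypothetical names. The mask has the same polarity as `m` in the C code: all zeros when the flag is 1, all ones when it is 0.

```python
MASK = 0xFFFFFFFF  # assumption: values fit in 32 unsigned bits


def mask_from_flag(flag):
    """Return 0 when flag is 1, and 32 one-bits when flag is 0."""
    # Shifting by 32 pushes every bit out of the word; shifting by 0 keeps them.
    return (MASK << (flag << 5)) & MASK


def select(flag, when_set, when_clear):
    """Pick one of two values without an if statement."""
    m = mask_from_flag(flag)
    return ((m ^ MASK) & when_set) | (m & when_clear)


print(select(1, 23, 10))  # prints 23
print(select(0, 23, 10))  # prints 10
```

Exactly one of the two operands of `|` is non-zero at any time, which is why `|` can replace `+` here.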
The nature of the unary increment and decrement operators halts our progress.
Normally, we would easily be able to choose between an a which was incremented by 1 and an a which was decremented by 1.
However, because the increment and decrement operators modify their operand, our conditionless code will end up always performing both operations - meaning that the values of a and b will be tainted before we finish using them.
One way around this is to simply create new variables which each contain the original values of a and b, allowing a clean slate for each operation. I consider this boring, so instead we will adjust a and b in a way that does not affect the rest of the code (++a & --b & 0) in order to make full use of the differences between x++ and ++x.
We can now get both possible values for a and b, as the unary operators modifying the operands' values now works in our favor. Our techniques from earlier help us choose the correct versions of each, and we now have a working add function. :)
Python codes:
(1)
add = lambda a,b : -(-a)-(-b)
Uses a lambda function with only the '-' operator.
(2)
add= lambda a,b : len(list(map(lambda x:x,(i for i in range(-a,b)))))