What is the problem in this yacc grammar? - yacc

I have a yacc parser in which commands like "abc=123" (VAR=VAR),
abc=[1 2 3] (VAR=value_series/string) and abc=[[123]] can be parsed.
I need to parse abc=[xyz=[1 2 3] jkl=[3 4 5]].This grammar is failing due to ambiguity between rule 2 (I guess, it couldn't differentiate between value_series and the new rule.
I have tried a case:
VAR_NAME EQUAL quote_or_brace model EQUAL quote_or_brace value_series quote_or_brace net EQUAL quote_or_brace value_series quote_or_brace quote_or_brace
It didn't work.
series:
| PLUS series
{
}
| series VAR_NAME EQUAL VAR_NAME
{
delete [] $2;
delete [] $4;
}
| series VAR_NAME EQUAL quote_or_brace value_series quote_or_brace
{
delete [] $2;
}
| series VAR_NAME EQUAL quote_or_brace quote_or_brace value_series quote_or_brace quote_or_brace
{
delete [] $2;
}
| error
{
setErrorMsg(string("error"));
YYABORT;
};

If it was me, I would probably write a set of rules similar to this:
assignment_list
: assignment
| assignment assignment_list
;
assignment
: VAR_NAME '=' assignment_rhs
;
assignment_rhs
: expression
| '[' opt_expression_list ']' /* allows var1 = [ ... ] */
;
opt_expression_list
: /* empty */
| expression_list
;
expression_list
: expression
| expression expression_list
;
expression
: VAR_NAME /* a variable name, as in var1 = var2 */
| NUMBER /* some kind of number, as in var1 = 123 */
| STRING /* quoted string, as in var1 = "foo" */
| assignment /* allows nesting of assignments, as in var1 = [var2 = 4] */
/* also allows things like var1 = var2 = 123 */
;
Note that this isn't tested in anyway, and that I'm not fully sure about the recursion from expression to assigment.

Related

karate.exec() breaks the argument by space when passed via table

When I pass a text or string as a variable from table to feature, for some reason karate.exec is breaking the argument based on space.
I have main feature where the code is
#Example 1
* def calcModel = '":: decimal calcModel = get_calc_model();"'
#Example 2
* text calcModel =
"""
:: decimal calcModel = get_calc_model();
return calcModel;
"""
* table calcDetails
| field | code | desc |
| 31 | '":: return get_name();" | '"this is name"' |
| 32 | calcModel | '"this is the calc model"' |
* call read('classpath:scripts/SetCalcModel.feature') calcDetails
Inside SetCalcModel.feature the code is
* def setCalcModel = karate.exec('/opt/local/SetCalcModel.sh --timeout 100 -field ' + field + ' -code ' + code + ' -description '+desc)
For row 1 of the table it works fine and executes following command:
command: [/opt/local/SetCalcModel.sh, --timeout, 100, -field, 31, -code, :: decimal calcModel = get_calc_model();, -description, this is the calc model], working dir: null
For row 2 it breaks with following command:
command: [/opt/local/SetCalcModel.sh, --timeout, 100, -field, 32, -code, ::, decimal, calcModel, =, get_calc_model();, -description, this is the calc model], working dir: null
I have tried this with example 1 and 2 and it keeps doing the same.
I have also tried passing line json as argument to karate.exec(), that also has same issue.
Is there a workaround here??
There is a way to pass arguments as an array of strings, use that approach instead.
For example:
* karate.exec({ args: [ 'curl', 'https://httpbin.org/anything' ] })
Refer: https://stackoverflow.com/a/73230200/143475

Using forall() predicate in minizinc as assignment statement without 'constraint'

I have a Minizinc program for generating the optimal charge/discharge schedule for a grid-connected battery, given a set of prices by time-interval.
My program works (sort of; subject to some caveats), but my question is about two 'constraint' statements which are really just assignment statements:
constraint forall(t in 2..T)(MW_SETPOINT[t-1] - SALE[t] = MW_SETPOINT[t]);
constraint forall(t in 1..T)(PROFIT[t] = SALE[t] * PRICE[t]);
These just mean Energy SALES is the delta in MW_SETPOINT from t-1 to 1, and PROFIT is SALE * PRICE for each interval. So it seems counterintuitive to me to declare them as 'constraints'. But I've been unable to formulate them as assignment statements without throwing syntax errors.
Question:
Is there a more idiomatic way to declare such assignment statements for an array which is a function of other params/variables? Or is making assignments for arrays in constraints the recommended/idiomatic way to do it in Minizinc?
Full program for context:
% PARAMS
int: MW_CAPACITY = 10;
array[int] of float: PRICE;
% DERIVED PARAMS
int: STARTING_MW = MW_CAPACITY div 2; % integer division
int: T = length(PRICE);
% DECISION VARIABLE - MW SETPOINT EACH INTERVAL
array[1..T] of var 0..MW_CAPACITY: MW_SETPOINT;
% DERIVED/INTERMEDIATE VARIABLES
array[1..T] of var -1*MW_CAPACITY..MW_CAPACITY: SALE;
array[1..T] of var float: PROFIT;
var float: NET_PROFIT = sum(PROFIT);
% CONSTRAINTS
%% If start at 5MW, and sell 5 first interval, setpoint for first interval is 0
constraint MW_SETPOINT[1] = STARTING_MW - SALE[1];
%% End where you started; opt schedule from arbitrage means no net MW over time
constraint MW_SETPOINT[T] = STARTING_MW;
%% these are really justassignment statements for SALE & PROFIT
constraint forall(t in 2..T)(MW_SETPOINT[t-1] - SALE[t] = MW_SETPOINT[t]);
constraint forall(t in 1..T)(PROFIT[t] = SALE[t] * PRICE[t]);
% OBJECTIVE: MAXIMIZE REVENUE
solve maximize NET_PROFIT;
output["DAILY_PROFIT: " ++ show(NET_PROFIT) ++
"\nMW SETPOINTS: " ++ show(MW_SETPOINT) ++
"\nMW SALES: " ++ show(SALE) ++
"\n$/MW PRICES: " ++ show(PRICE)++
"\nPROFITS: " ++ show(PROFIT)
];
It can be run with
minizinc opt_sched_hindsight.mzn --solver org.minizinc.mip.coin-bc -D "PRICE = [29.835, 29.310470000000002, 28.575059999999997, 28.02416, 28.800690000000003, 32.41052, 34.38542, 29.512390000000003, 25.66587, 25.0499, 26.555529999999997, 28.149440000000002, 30.216509999999996, 32.32415, 31.406609999999997, 36.77642, 41.94735, 51.235209999999995, 50.68137, 64.54481, 48.235170000000004, 40.27663, 34.93675, 31.10404];"```
You can play with Array Comprehensions: (quote from the docs)
Array comprehensions have this syntax:
<array-comp> ::= "[" <expr> "|" <comp-tail> "]"
For example (with the literal equivalents on the right):
[2*i | i in 1..5] % [2, 4, 6, 8, 10]
Array comprehensions have more flexible type and inst requirements than set comprehensions (see Set Comprehensions).
Array comprehensions are allowed over a variable set with finite type,
the result is an array of optional type, with length equal to the
cardinality of the upper bound of the variable set. For example:
var set of 1..5: x;
array[int] of var opt int: y = [ i * i | i in x ];
The length of array will be 5.
Array comprehensions are allowed where the where-expression
is a var bool. Again the resulting array is of optional
type, and of length equal to that given by the generator expressions. For example:
var int x;
array[int] of var opt int: y = [ i | i in 1..10 where i != x ];
The length of the array will be 10.
The indices of an evaluated simple array comprehension are
implicitly 1..n, where n is the length of the evaluated
comprehension.
Example:
int: MW_CAPACITY = 10;
int: STARTING_MW = MW_CAPACITY div 2;
array [int] of float: PRICE = [1.0, 2.0, 3.0, 4.0];
int: T = length(PRICE);
array [1..T] of var -1*MW_CAPACITY..MW_CAPACITY: SALE;
array [1..T] of var 0..MW_CAPACITY: MW_SETPOINT = let {
int: min_i = min(index_set(PRICE));
} in
[STARTING_MW - sum([SALE[j] | j in min_i..i])
| i in index_set(PRICE)];
array [1..T] of var float: PROFIT =
[SALE[i] * PRICE[i]
| i in index_set(PRICE)];
solve satisfy;
Output:
~$ minizinc test.mzn
SALE = array1d(1..4, [-10, -5, 0, 0]);
----------
Notice that index_set(PRICE) is nothing else but 1..T and that min(index_set(PRICE)) is nothing else but 1, so one could write the above array comprehensions also as
array [1..T] of var 0..MW_CAPACITY: MW_SETPOINT =
[STARTING_MW - sum([SALE[j] | j in 1..i])
| i in 1..T];
array [1..T] of var float: PROFIT =
[SALE[i] * PRICE[i]
| i in 1..T];

Why do Perl 6 state variable behave differently for different files?

I have 2 test files. In one file, I want to extract the middle section using a state variable as a switch, and in the other file, I want to use a state variable to hold the sum of numbers seen.
File one:
section 0; state 0; not needed
= start section 1 =
state 1; needed
= end section 1 =
section 2; state 2; not needed
File two:
1
2
3
4
5
Code to process file one:
cat file1 | perl6 -ne 'state $x = 0; say " x is ", $x; if $_ ~~ m/ start / { $x = 1; }; .say if $x == 1; if $_ ~~ m/ end / { $x = 2; }'
and the result is with errors:
x is (Any)
Use of uninitialized value of type Any in numeric context
in block at -e line 1
x is (Any)
= start section 1 =
x is 1
state 1; needed
x is 1
= end section 1 =
x is 2
x is 2
And the code to process file two is
cat file2 | perl6 -ne 'state $x=0; if $_ ~~ m/ \d+ / { $x += $/.Str; } ; say $x; '
and the results are as expected:
1
3
6
10
15
What make the state variable fail to initialize in the first code, but okay in the second code?
I found that in the first code, if I make the state variable do something, such as addition, then it works. Why so?
cat file1 | perl6 -ne 'state $x += 0; say " x is ", $x; if $_ ~~ m/ start / { $x = 1; }; .say if $x == 1; if $_ ~~ m/ end / { $x = 2; }'
# here, $x += 0 instead of $x = 0; and the results have no errors:
x is 0
x is 0
= start section 1 =
x is 1
state 1; needed
x is 1
= end section 1 =
x is 2
x is 2
Thanks for any help.
This was answered in smls's comment:
Looks like a Rakudo bug. Simpler test-case:
echo Hello | perl6 -ne 'state $x = 42; dd $x'.
It seems that top-level state variables are
not initialized when the -n or -p switch is used. As a work-around, you can manually initialize the variable in a separate statement, using the //= (assign if undefined) operator:
state $x; $x //= 42;

What does "fragment" mean in ANTLR?

What does fragment mean in ANTLR?
I've seen both rules:
fragment DIGIT : '0'..'9';
and
DIGIT : '0'..'9';
What is the difference?
A fragment is somewhat akin to an inline function: It makes the grammar more readable and easier to maintain.
A fragment will never be counted as a token, it only serves to simplify a grammar.
Consider:
NUMBER: DIGITS | OCTAL_DIGITS | HEX_DIGITS;
fragment DIGITS: '1'..'9' '0'..'9'*;
fragment OCTAL_DIGITS: '0' '0'..'7'+;
fragment HEX_DIGITS: '0x' ('0'..'9' | 'a'..'f' | 'A'..'F')+;
In this example, matching a NUMBER will always return a NUMBER to the lexer, regardless of if it matched "1234", "0xab12", or "0777".
See item 3
According to the Definitive Antlr4 references book :
Rules prefixed with fragment can be called only from other lexer rules; they are not tokens in their own right.
actually they'll improve readability of your grammars.
look at this example :
STRING : '"' (ESC | ~["\\])* '"' ;
fragment ESC : '\\' (["\\/bfnrt] | UNICODE) ;
fragment UNICODE : 'u' HEX HEX HEX HEX ;
fragment HEX : [0-9a-fA-F] ;
STRING is a lexer using fragment rule like ESC .Unicode is used in Esc rule and Hex is used in Unicode fragment rule.
ESC and UNICODE and HEX rules can't be used explicitly.
The Definitive ANTLR 4 Reference (Page 106):
Rules prefixed with fragment can
be called only from other lexer rules; they are not tokens in their own right.
Abstract Concepts:
Case1: ( if I need the RULE1, RULE2, RULE3 entities or group info )
rule0 : RULE1 | RULE2 | RULE3 ;
RULE1 : [A-C]+ ;
RULE2 : [DEF]+ ;
RULE3 : ('G'|'H'|'I')+ ;
Case2: ( if I don't care RULE1, RULE2, RULE3, I just focus on RULE0 )
RULE0 : [A-C]+ | [DEF]+ | ('G'|'H'|'I')+ ;
// RULE0 is a terminal node.
// You can't name it 'rule0', or you will get syntax errors:
// 'A-C' came as a complete surprise to me while matching alternative
// 'DEF' came as a complete surprise to me while matching alternative
Case3: ( is equivalent to Case2, making it more readable than Case2)
RULE0 : RULE1 | RULE2 | RULE3 ;
fragment RULE1 : [A-C]+ ;
fragment RULE2 : [DEF]+ ;
fragment RULE3 : ('G'|'H'|'I')+ ;
// You can't name it 'rule0', or you will get warnings:
// warning(125): implicit definition of token RULE1 in parser
// warning(125): implicit definition of token RULE2 in parser
// warning(125): implicit definition of token RULE3 in parser
// and failed to capture rule0 content (?)
Differences between Case1 and Case2/3 ?
The lexer rules are equivalent
Each of RULE1/2/3 in Case1 is a capturing group, similar to Regex:(X)
Each of RULE1/2/3 in Case3 is a non-capturing group, similar to Regex:(?:X)
Let's see a concrete example.
Goal: identify [ABC]+, [DEF]+, [GHI]+ tokens
input.txt
ABBCCCDDDDEEEEE ABCDE
FFGGHHIIJJKK FGHIJK
ABCDEFGHIJKL
Main.py
import sys
from antlr4 import *
from AlphabetLexer import AlphabetLexer
from AlphabetParser import AlphabetParser
from AlphabetListener import AlphabetListener
class MyListener(AlphabetListener):
# Exit a parse tree produced by AlphabetParser#content.
def exitContent(self, ctx:AlphabetParser.ContentContext):
pass
# (For Case1 Only) enable it when testing Case1
# Exit a parse tree produced by AlphabetParser#rule0.
def exitRule0(self, ctx:AlphabetParser.Rule0Context):
print(ctx.getText())
# end-of-class
def main():
file_name = sys.argv[1]
input = FileStream(file_name)
lexer = AlphabetLexer(input)
stream = CommonTokenStream(lexer)
parser = AlphabetParser(stream)
tree = parser.content()
print(tree.toStringTree(recog=parser))
listener = MyListener()
walker = ParseTreeWalker()
walker.walk(listener, tree)
# end-of-def
main()
Case1 and results:
Alphabet.g4 (Case1)
grammar Alphabet;
content : (rule0|ANYCHAR)* EOF;
rule0 : RULE1 | RULE2 | RULE3 ;
RULE1 : [A-C]+ ;
RULE2 : [DEF]+ ;
RULE3 : ('G'|'H'|'I')+ ;
ANYCHAR : . -> skip;
Result:
# Input data (for reference)
# ABBCCCDDDDEEEEE ABCDE
# FFGGHHIIJJKK FGHIJK
# ABCDEFGHIJKL
$ python3 Main.py input.txt
(content (rule0 ABBCCC) (rule0 DDDDEEEEE) (rule0 ABC) (rule0 DE) (rule0 FF) (rule0 GGHHII) (rule0 F) (rule0 GHI) (rule0 ABC) (rule0 DEF) (rule0 GHI) <EOF>)
ABBCCC
DDDDEEEEE
ABC
DE
FF
GGHHII
F
GHI
ABC
DEF
GHI
Case2/3 and results:
Alphabet.g4 (Case2)
grammar Alphabet;
content : (RULE0|ANYCHAR)* EOF;
RULE0 : [A-C]+ | [DEF]+ | ('G'|'H'|'I')+ ;
ANYCHAR : . -> skip;
Alphabet.g4 (Case3)
grammar Alphabet;
content : (RULE0|ANYCHAR)* EOF;
RULE0 : RULE1 | RULE2 | RULE3 ;
fragment RULE1 : [A-C]+ ;
fragment RULE2 : [DEF]+ ;
fragment RULE3 : ('G'|'H'|'I')+ ;
ANYCHAR : . -> skip;
Result:
# Input data (for reference)
# ABBCCCDDDDEEEEE ABCDE
# FFGGHHIIJJKK FGHIJK
# ABCDEFGHIJKL
$ python3 Main.py input.txt
(content ABBCCC DDDDEEEEE ABC DE FF GGHHII F GHI ABC DEF GHI <EOF>)
Did you see "capturing groups" and "non-capturing groups" parts?
Let's see the concrete example2.
Goal: identify octal / decimal / hexadecimal numbers
input.txt
0
123
1~9999
001~077
0xFF, 0x01, 0xabc123
Number.g4
grammar Number;
content
: (number|ANY_CHAR)* EOF
;
number
: DECIMAL_NUMBER
| OCTAL_NUMBER
| HEXADECIMAL_NUMBER
;
DECIMAL_NUMBER
: [1-9][0-9]*
| '0'
;
OCTAL_NUMBER
: '0' '0'..'9'+
;
HEXADECIMAL_NUMBER
: '0x'[0-9A-Fa-f]+
;
ANY_CHAR
: .
;
Main.py
import sys
from antlr4 import *
from NumberLexer import NumberLexer
from NumberParser import NumberParser
from NumberListener import NumberListener
class Listener(NumberListener):
# Exit a parse tree produced by NumberParser#Number.
def exitNumber(self, ctx:NumberParser.NumberContext):
print('%8s, dec: %-8s, oct: %-8s, hex: %-8s' % (ctx.getText(),
ctx.DECIMAL_NUMBER(), ctx.OCTAL_NUMBER(), ctx.HEXADECIMAL_NUMBER()))
# end-of-def
# end-of-class
def main():
input = FileStream(sys.argv[1])
lexer = NumberLexer(input)
stream = CommonTokenStream(lexer)
parser = NumberParser(stream)
tree = parser.content()
print(tree.toStringTree(recog=parser))
listener = Listener()
walker = ParseTreeWalker()
walker.walk(listener, tree)
# end-of-def
main()
Result:
# Input data (for reference)
# 0
# 123
# 1~9999
# 001~077
# 0xFF, 0x01, 0xabc123
$ python3 Main.py input.txt
(content (number 0) \n (number 123) \n (number 1) ~ (number 9999) \n (number 001) ~ (number 077) \n (number 0xFF) , (number 0x01) , (number 0xabc123) \n <EOF>)
0, dec: 0 , oct: None , hex: None
123, dec: 123 , oct: None , hex: None
1, dec: 1 , oct: None , hex: None
9999, dec: 9999 , oct: None , hex: None
001, dec: None , oct: 001 , hex: None
077, dec: None , oct: 077 , hex: None
0xFF, dec: None , oct: None , hex: 0xFF
0x01, dec: None , oct: None , hex: 0x01
0xabc123, dec: None , oct: None , hex: 0xabc123
If you add the modifier 'fragment' to DECIMAL_NUMBER, OCTAL_NUMBER, HEXADECIMAL_NUMBER, you won't be able to capture the number entities (since they are not tokens anymore). And the result will be:
$ python3 Main.py input.txt
(content 0 \n 1 2 3 \n 1 ~ 9 9 9 9 \n 0 0 1 ~ 0 7 7 \n 0 x F F , 0 x 0 1 , 0 x a b c 1 2 3 \n <EOF>)
This blog post has a very clear example where fragment makes a significant difference:
grammar number;
number: INT;
DIGIT : '0'..'9';
INT : DIGIT+;
The grammar will recognize '42' but not '7'. You can fix it by making digit a fragment (or moving DIGIT after INT).

Code Golf: Automata

Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
I made the ultimate laugh generator using these rules. Can you implement it in your favorite language in a clever manner?
Rules:
On every iteration, the following transformations occur.
H -> AH
A -> HA
AA -> HA
HH -> AH
AAH -> HA
HAA -> AH
n = 0 | H
n = 1 | AH
n = 2 | HAAH
n = 3 | AHAH
n = 4 | HAAHHAAH
n = 5 | AHAHHA
n = 6 | HAAHHAAHHA
n = 7 | AHAHHAAHHA
n = 8 | HAAHHAAHHAAHHA
n = 9 | AHAHHAAHAHHA
n = ...
Lex/Flex
69 characters. In the text here, I changed tabs to 8 spaces so it would look right, but all those consecutive spaces should be tabs, and the tabs are important, so it comes out to 69 characters.
#include <stdio.h>
%%
HAA|HH|H printf("AH");
AAH|AA|A printf("HA");
For what it's worth, the generated lex.yy.c is 42736 characters, but I don't think that really counts. I can (and soon will) write a pure-C version that will be much shorter and do the same thing, but I feel that should probably be a separate entry.
EDIT:
Here's a more legit Lex/Flex entry (302 characters):
char*c,*t;
#define s(a) t=c?realloc(c,strlen(c)+3):calloc(3,1);if(t)c=t,strcat(c,#a);
%%
free(c);c=NULL;
HAA|HH|H s(AH)
AAH|AA|A s(HA)
%%
int main(void){c=calloc(2,1);if(!c)return 1;*c='H';for(int n=0;n<10;n++)printf("n = %d | %s\n",n,c),yy_scan_string(c),yylex();return 0;}int yywrap(){return 1;}
This does multiple iterations (unlike the last one, which only did one iteration, and had to be manually seeded each time, but produced the correct results) and has the advantage of being extremely horrific-looking code. I use a function macro, the stringizing operator, and two global variables. If you want an even messier version that doesn't even check for malloc() failure, it looks like this (282 characters):
char*c,*t;
#define s(a) t=c?realloc(c,strlen(c)+3):calloc(3,1);c=t;strcat(c,#a);
%%
free(c);c=NULL;
HAA|HH|H s(AH)
AAH|AA|A s(HA)
%%
int main(void){c=calloc(2,1);*c='H';for(int n=0;n<10;n++)printf("n = %d | %s\n",n,c),yy_scan_string(c),yylex();return 0;}int yywrap(){return 1;}
An even worse version could be concocted where c is an array on the stack, and we just give it a MAX_BUFFER_SIZE of some sort, but I feel that's taking this too far.
...Just kidding. 207 characters if we take the "99 characters will always be enough" mindset:
char c[99]="H";
%%
c[0]=0;
HAA|HH|H strcat(c, "AH");
AAH|AA|A strcat(c, "HA");
%%
int main(void){for(int n=0;n<10;n++)printf("n = %d | %s\n",n,c),yy_scan_string(c),yylex();return 0;}int yywrap(){return 1;}
My preference is for the one that works best (i.e. the first one that can iterate until memory runs out and checks its errors), but this is code golf.
To compile the first one, type:
flex golf.l
gcc -ll lex.yy.c
(If you have lex instead of flex, just change flex to lex. They should be compatible.)
To compile the others, type:
flex golf.l
gcc -std=c99 lex.yy.c
Or else GCC will whine about ‘for’ loop initial declaration used outside C99 mode and other crap.
Pure C answer coming up.
MATLAB (v7.8.0):
73 characters (not including formatting characters used to make it look readable)
This script ("haha.m") assumes you have already defined the variable n:
s = 'H';
for i = 1:n,
s = regexprep(s,'(H)(H|AA)?|(A)(AH)?','${[137-$1 $1]}');
end
...and here's the one-line version:
s='H';for i=1:n,s = regexprep(s,'(H)(H|AA)?|(A)(AH)?','${[137-$1 $1]}');end
Test:
>> for n=0:10, haha; disp([num2str(n) ': ' s]); end
0: H
1: AH
2: HAAH
3: AHAH
4: HAAHHAAH
5: AHAHHA
6: HAAHHAAHHA
7: AHAHHAAHHA
8: HAAHHAAHHAAHHA
9: AHAHHAAHAHHA
10: HAAHHAAHHAHAAHHA
A simple translation to Haskell:
grammar = iterate step
where
step ('H':'A':'A':xs) = 'A':'H':step xs
step ('A':'A':'H':xs) = 'H':'A':step xs
step ('A':'A':xs) = 'H':'A':step xs
step ('H':'H':xs) = 'A':'H':step xs
step ('H':xs) = 'A':'H':step xs
step ('A':xs) = 'H':'A':step xs
step [] = []
And a shorter version (122 chars, optimized down to three derivation rules + base case):
grammar=iterate s where{i 'H'='A';i 'A'='H';s(n:'A':m:x)|n/=m=m:n:s x;s(n:m:x)|n==m=(i n):n:s x;s(n:x)=(i n):n:s x;s[]=[]}
And a translation to C++ (182 chars, only does one iteration, invoke with initial state on the command line):
#include<cstdio>
#define o putchar
int main(int,char**v){char*p=v[1];while(*p){p[1]==65&&~*p&p[2]?o(p[2]),o(*p),p+=3:*p==p[1]?o(137-*p++),o(*p++),p:(o(137-*p),o(*p++),p);}return 0;}
Javascript:
120 stripping whitespace and I'm leaving it alone now!
function f(n,s){s='H';while(n--){s=s.replace(/HAA|AAH|HH?|AA?/g,function(a){return a.match(/^H/)?'AH':'HA'});};return s}
Expanded:
function f(n,s)
{
s = 'H';
while (n--)
{
s = s.replace(/HAA|AAH|HH?|AA?/g, function(a) { return a.match(/^H/) ? 'AH' : 'HA' } );
};
return s
}
that replacer is expensive!
Here's a C# example, coming in at 321 bytes if I reduce whitespace to one space between each item.
Edit: In response to #Johannes Rössel comment, I removed generics from the solution to eek out a few more bytes.
Edit: Another change, got rid of all temporary variables.
public static String E(String i)
{
return new Regex("HAA|AAH|HH|AA|A|H").Replace(i,
m => (String)new Hashtable {
{ "H", "AH" },
{ "A", "HA" },
{ "AA", "HA" },
{ "HH", "AH" },
{ "AAH", "HA" },
{ "HAA", "AH" }
}[m.Value]);
}
The rewritten solution with less whitespace, that still compiles, is 158 characters:
return new Regex("HAA|AAH|HH|AA|A|H").Replace(i,m =>(String)new Hashtable{{"H","AH"},{"A","HA"},{"AA","HA"},{"HH","AH"},{"AAH","HA"},{"HAA","AH"}}[m.Value]);
For a complete source code solution for Visual Studio 2008, a subversion repository with the necessary code, including unit tests, is available below.
Repository is here, username and password are both 'guest', without the quotes.
Ruby
This code golf is not very well specified -- I assumed that function returning n-th iteration string is best way to solve it. It has 80 characters.
def f n
a='h'
n.times{a.gsub!(/(h(h|aa)?)|(a(ah?)?)/){$1.nil?? "ha":"ah"}}
a
end
Code printing out n first strings (71 characters):
a='h';n.times{puts a.gsub!(/(h(h|aa)?)|(a(ah?)?)/){$1.nil?? "ha":"ah"}}
Erlang
241 bytes and ready to run:
> erl -noshell -s g i -s init stop
AHAHHAAHAHHA
-module(g).
-export([i/0]).
c("HAA"++T)->"AH"++c(T);
c("AAH"++T)->"HA"++c(T);
c("HH"++T)->"AH"++c(T);
c("AA"++T)->"HA"++c(T);
c("A"++T)->"HA"++c(T);
c("H"++T)->"AH"++c(T);
c([])->[].
i(0,L)->L;
i(N,L)->i(N-1,c(L)).
i()->io:format(i(9,"H"))
Could probably be improved.
Perl 168 characters.
(not counting unnecessary newlines)
perl -E'
($s,%m)=qw[H H AH A HA AA HA HH AH AAH HA HAA AH];
sub p{say qq[n = $_[0] | $_[1]]};p(0,$s);
for(1..9){$s=~s/(H(AA|H)?|A(AH?)?)/$m{$1}/g;p($_,$s)}
say q[n = ...]'
De-obfuscated:
use strict;
use warnings;
use 5.010;
my $str = 'H';
my %map = (
H => 'AH',
A => 'HA',
AA => 'HA',
HH => 'AH',
AAH => 'HA',
HAA => 'AH'
);
sub prn{
my( $n, $str ) = #_;
say "n = $n | $str"
}
prn( 0, $str );
for my $i ( 1..9 ){
$str =~ s(
(
H(?:AA|H)? # HAA | HH | H
|
A(?:AH?)? # AAH | AA | A
)
){
$map{$1}
}xge;
prn( $i, $str );
}
say 'n = ...';
Perl 150 characters.
(not counting unnecessary newlines)
perl -E'
$s="H";
sub p{say qq[n = $_[0] | $_[1]]};p(0,$s);
for(1..9){$s=~s/(?|(H)(?:AA|H)?|(A)(?:AH?)?)/("H"eq$1?"A":"H").$1/eg;p($_,$s)}
say q[n = ...]'
De-obfuscated
#! /usr/bin/env perl
use strict;
use warnings;
use 5.010;
my $str = 'H';
sub prn{
my( $n, $str ) = #_;
say "n = $n | $str"
}
prn( 0, $str );
for my $i ( 1..9 ){
$str =~ s{(?|
(H)(?:AA|H)? # HAA | HH | H
|
(A)(?:AH?)? # AAH | AA | A
)}{
( 'H' eq $1 ?'A' :'H' ).$1
}egx;
prn( $i, $str );
}
say 'n = ...';
Python (150 bytes)
import re
N = 10
s = "H"
for n in range(N):
print "n = %d |"% n, s
s = re.sub("(HAA|HH|H)|AAH|AA|A", lambda m: m.group(1) and "AH" or "HA",s)
Output
n = 0 | H
n = 1 | AH
n = 2 | HAAH
n = 3 | AHAH
n = 4 | HAAHHAAH
n = 5 | AHAHHA
n = 6 | HAAHHAAHHA
n = 7 | AHAHHAAHHA
n = 8 | HAAHHAAHHAAHHA
n = 9 | AHAHHAAHAHHA
Here is a very simple C++ version:
#include <iostream>
#include <sstream>
using namespace std;
#define LINES 10
#define put(t) s << t; cout << t
#define r1(o,a,c0) \
if(c[0]==c0) {put(o); s.unget(); s.unget(); a; continue;}
#define r2(o,a,c0,c1) \
if(c[0]==c0 && c[1]==c1) {put(o); s.unget(); a; continue;}
#define r3(o,a,c0,c1,c2) \
if(c[0]==c0 && c[1]==c1 && c[2]==c2) {put(o); a; continue;}
int main() {
char c[3];
stringstream s;
put("H\n\n");
for(int i=2;i<LINES*2;) {
s.read(c,3);
r3("AH",,'H','A','A');
r3("HA",,'A','A','H');
r2("AH",,'H','H');
r2("HA",,'A','A');
r1("HA",,'A');
r1("AH",,'H');
r1("\n",i++,'\n');
}
}
It's not exactly code-golf (it could be made a lot shorter), but it works. Change LINES to however many lines you want printed (note: it will not work for 0). It will print output like this:
H
AH
HAAH
AHAH
HAAHHAAH
AHAHHA
HAAHHAAHHA
AHAHHAAHHA
HAAHHAAHHAAHHA
AHAHHAAHAHHA
ANSI C99
Coming in at a brutal 306 characters:
#include <stdio.h>
#include <string.h>
char s[99]="H",t[99]={0};int main(){for(int n=0;n<10;n++){int i=0,j=strlen(s);printf("n = %u | %s\n",n,s);strcpy(t,s);s[0]=0;for(;i<j;){if(t[i++]=='H'){t[i]=='H'?i++:t[i+1]=='A'?i+=2:1;strcat(s,"AH");}else{t[i]=='A'?i+=1+(t[i+1]=='H'):1;strcat(s,"HA");}}}return 0;}
There are too many nested ifs and conditional operators for me to effectively reduce this with macros. Believe me, I tried. Readable version:
#include <stdio.h>
#include <string.h>
char s[99] = "H", t[99] = {0};
int main()
{
for(int n = 0; n < 10; n++)
{
int i = 0, j = strlen(s);
printf("n = %u | %s\n", n, s);
strcpy(t, s);
s[0] = 0;
/*
* This was originally just a while() loop.
* I tried to make it shorter by making it a for() loop.
* I failed.
* I kept the for() loop because it looked uglier than a while() loop.
* This is code golf.
*/
for(;i<j;)
{
if(t[i++] == 'H' )
{
// t[i] == 'H' ? i++ : t[i+1] == 'A' ? i+=2 : 1;
// Oh, ternary ?:, how do I love thee?
if(t[i] == 'H')
i++;
else if(t[i+1] == 'A')
i+= 2;
strcat(s, "AH");
}
else
{
// t[i] == 'A' ? i += 1 + (t[i + 1] == 'H') : 1;
if(t[i] == 'A')
if(t[++i] == 'H')
i++;
strcat(s, "HA");
}
}
}
return 0;
}
I may be able to make a shorter version with strncmp() in the future, but who knows? We'll see what happens.
In python:
def l(s):
H=['HAA','HH','H','AAH','AA','A']
L=['AH']*3+['HA']*3
for i in [3,2,1]:
if s[:i] in H: return L[H.index(s[:i])]+l(s[i:])
return s
def a(n,s='H'):
return s*(n<1)or a(n-1,l(s))
for i in xrange(0,10):
print '%d: %s'%(i,a(i))
First attempt: 198 char of code, I'm sure it can get smaller :D
REBOL, 150 characters. Unfortunately REBOL is not a language conducive to code golf, but 150 characters ain't too shabby, as Adam Sandler says.
This assumes the loop variable m has already been defined.
s: "H" r: "" z:[some[["HAA"|"HH"|"H"](append r "AH")|["AAH"|"AA"|"A"](append r "HA")]to end]repeat n m[clear r parse s z print["n =" n "|" s: copy r]]
And here it is with better layout:
s: "H"
r: ""
z: [
some [
[ "HAA" | "HH" | "H" ] (append r "AH")
| [ "AAH" | "AA" | "A" ] (append r "HA")
]
to end
]
repeat n m [
clear r
parse s z
print ["n =" n "|" s: copy r]
]
F#: 184 chars
Seems to map pretty cleanly to F#:
type grammar = H | A
let rec laugh = function
| 0,l -> l
| n,l ->
let rec loop = function
|H::A::A::x|H::H::x|H::x->A::H::loop x
|A::A::H::x|A::A::x|A::x->H::A::loop x
|x->x
laugh(n-1,loop l)
Here's a run in fsi:
> [for a in 0 .. 9 -> a, laugh(a, [H])] |> Seq.iter (fun (a, b) -> printfn "n = %i: %A" a b);;
n = 0: [H]
n = 1: [A; H]
n = 2: [H; A; A; H]
n = 3: [A; H; A; H]
n = 4: [H; A; A; H; H; A; A; H]
n = 5: [A; H; A; H; H; A]
n = 6: [H; A; A; H; H; A; A; H; H; A]
n = 7: [A; H; A; H; H; A; A; H; H; A]
n = 8: [H; A; A; H; H; A; A; H; H; A; A; H; H; A]
n = 9: [A; H; A; H; H; A; A; H; A; H; H; A]