How to generate flipped text like this - ʍoႨɟɹәʌoʞɔɐʇs - scripting

I want to make a script which can flip the text upside down like if I type
stackoverflow it should make it to - ʍoႨɟɹәʌoʞɔɐʇs.
Many sites do this like this one link text and this one link text

JavaScript Code. It will be easy to convert this code to Java or C#
function flip() {
var result = flipString(document.f.original.value.toLowerCase());
document.f.flipped.value = result;
}
function flipString(aString) {
var last = aString.length - 1;
var result = new Array(aString.length)
for (var i = last; i >= 0; --i) {
var c = aString.charAt(i)
var r = flipTable[c]
result[last - i] = r != undefined ? r : c
}
return result.join('')
}
var flipTable = {
a : '\u0250',
b : 'q',
c : '\u0254', //open o -- from pne
d : 'p',
e : '\u01DD',
f : '\u025F', //from pne
g : '\u0183',
h : '\u0265',
i : '\u0131', //from pne
j : '\u027E',
k : '\u029E',
//l : '\u0283',
m : '\u026F',
n : 'u',
r : '\u0279',
t : '\u0287',
v : '\u028C',
w : '\u028D',
y : '\u028E',
'.' : '\u02D9',
'[' : ']',
'(' : ')',
'{' : '}',
'?' : '\u00BF', //from pne
'!' : '\u00A1',
"\'" : ',',
'<' : '>',
'_' : '\u203E',
';' : '\u061B',
'\u203F' : '\u2040',
'\u2045' : '\u2046',
'\u2234' : '\u2235'
}
for (i in flipTable) {
flipTable[flipTable[i]] = i
}

Related

Add custom attributes to rules

In ANTLR: Is there a simple example?, a question about antlr3, the accepted answer has this grammar:
grammar Exp;
eval returns [double value]
: exp=additionExp {$value = $exp.value;}
;
additionExp returns [double value]
: m1=multiplyExp {$value = $m1.value;}
( '+' m2=multiplyExp {$value += $m2.value;}
| '-' m2=multiplyExp {$value -= $m2.value;}
)*
;
multiplyExp returns [double value]
: a1=atomExp {$value = $a1.value;}
( '*' a2=atomExp {$value *= $a2.value;}
| '/' a2=atomExp {$value /= $a2.value;}
)*
;
atomExp returns [double value]
: n=Number {$value = Double.parseDouble($n.text);}
| '(' exp=additionExp ')' {$value = $exp.value;}
;
Number
: ('0'..'9')+ ('.' ('0'..'9')+)?
;
WS
: (' ' | '\t' | '\r'| '\n') {$channel=HIDDEN;}
It uses the $value attribute to pass information up the parse tree.
I want to do the same thing antlr4. It looks like the $value attribute isn't there any more. How can I add custom attributes to rules to pass information up the parse tree? If that's not the right mechanism to accomplish what I want, what mechanisms are there to accomplish something similar?
I tried using locals, like this:
/* Store each row in an ArrayList */
row
locals [
ArrayList<String> cells = null
]
: partial_row RowSeparator
{
$cells = $partial_row.cells;
}
;
partial_row
locals [
ArrayList<String> cells = null
]
: Cell
{
$cells = new java.util.ArrayList<String>();
$cells.add($Cell.text);
}
| partial_row Cell
{
$cells = $partial_row.cells;
$cells.add($Cell.text);
}
;
But this doesn't work, giving me this error:
error(65): csce322a1p2.g:70:24: unknown attribute 'cells' for rule 'partial_row' in '$partial_row.cells'
I think you are looking for "returns" not locals. Also that should work. My test works:
row
locals [
ArrayList cells = null
]
: A B
{
$cells = $A;
}
;
Instead of locals, I want to use #init and returns:
row returns [java.util.ArrayList<String> cells]
#init {
java.util.ArrayList<String> cells = null;
}
: partial_row RowSeparator
{
$cells = $partial_row.cells;
}
;
partial_row returns [java.util.ArrayListArrayList<String> cells]
#init {
java.util.ArrayListArrayList<String> cells = null;
}
: Cell
{
$cells = new java.util.ArrayList<String>();
$cells.add($Cell.text);
}
| partial_row Cell
{
$cells = $partial_row.cells;
$cells.add($Cell.text);
}
;

Using Tree Walker with Boolean checks + capturing the whole expression

I have actually two questions that I hope can be answered as they are semi-dependent on my work. Below is the grammar + tree grammar + Java test file.
What I am actually trying to achieve is the following:
Question 1:
I have a grammar that parses my language correctly. I would like to do some semantic checks on variable declarations. So I created a tree walker and so far it semi works. My problem is it's not capturing the whole string of expression. For example,
float x = 10 + 10;
It is only capturing the first part, i.e. 10. I am not sure what I am doing wrong. If I did it in one pass, it works. Somehow, if I split the work into a grammar and tree grammar, it is not capturing the whole string.
Question 2:
I would like to do a check on a rule such that if my conditions returns true, I would like to remove that subtree. For example,
float x = 10;
float x; // <================ I would like this to be removed.
I have tried using rewrite rules but I think it is more complex than that.
Test.g:
grammar Test;
options {
language = Java;
output = AST;
}
parse : varDeclare+
;
varDeclare : type id equalExp? ';'
;
equalExp : ('=' (expression | '...'))
;
expression : binaryExpression
;
binaryExpression : addingExpression (('=='|'!='|'<='|'>='|'>'|'<') addingExpression)*
;
addingExpression : multiplyingExpression (('+'|'-') multiplyingExpression)*
;
multiplyingExpression : unaryExpression
(('*'|'/') unaryExpression)*
;
unaryExpression: ('!'|'-')* primitiveElement;
primitiveElement : literalExpression
| id
| '(' expression ')'
;
literalExpression : INT
;
id : IDENTIFIER
;
type : 'int'
| 'float'
;
// L E X I C A L R U L E S
INT : DIGITS ;
IDENTIFIER : LETTER (LETTER | DIGIT)*;
WS : ( ' '
| '\t'
| '\r'
| '\n'
) {$channel=HIDDEN;}
;
fragment LETTER : ('a'..'z' | 'A'..'Z' | '_') ;
fragment DIGITS: DIGIT+;
fragment DIGIT : '0'..'9';
TestTree.g:
tree grammar TestTree;
options {
language = Java;
tokenVocab = Test;
ASTLabelType = CommonTree;
}
#members {
SemanticCheck s;
public TestTree(TreeNodeStream input, SemanticCheck s) {
this(input);
this.s = s;
}
}
parse[SemanticCheck s]
: varDeclare+
;
varDeclare : type id equalExp? ';'
{s.check($type.name, $id.text, $equalExp.expr);}
;
equalExp returns [String expr]
: ('=' (expression {$expr = $expression.e;} | '...' {$expr = "...";}))
;
expression returns [String e]
#after {$e = $expression.text;}
: binaryExpression
;
binaryExpression : addingExpression (('=='|'!='|'<='|'>='|'>'|'<') addingExpression)*
;
addingExpression : multiplyingExpression (('+'|'-') multiplyingExpression)*
;
multiplyingExpression : unaryExpression
(('*'|'/') unaryExpression)*
;
unaryExpression: ('!'|'-')* primitiveElement;
primitiveElement : literalExpression
| id
| '(' expression ')'
;
literalExpression : INT
;
id : IDENTIFIER
;
type returns [String name]
#after { $name = $type.text; }
: 'int'
| 'float'
;
Java test file, Test.java:
import java.util.ArrayList;
import java.util.List;
import org.antlr.runtime.ANTLRStringStream;
import org.antlr.runtime.CommonTokenStream;
import org.antlr.runtime.RuleReturnScope;
import org.antlr.runtime.tree.CommonTree;
import org.antlr.runtime.tree.CommonTreeNodeStream;
public class Test {
public static void main(String[] args) throws Exception {
SemanticCheck s = new SemanticCheck();
String src =
"float x = 10+y; \n" +
"float x; \n";
TestLexer lexer = new TestLexer(new ANTLRStringStream(src));
//TestLexer lexer = new TestLexer(new ANTLRFileStream("input.txt"));
CommonTokenStream tokenStream = new CommonTokenStream(lexer);
TestParser parser = new TestParser(tokenStream);
RuleReturnScope r = parser.parse();
System.out.println("Parse Tree:\n" + tokenStream.toString());
CommonTree t = (CommonTree)r.getTree();
CommonTreeNodeStream nodes = new CommonTreeNodeStream(t);
nodes.setTokenStream(tokenStream);
TestTree walker = new TestTree(nodes, s);
walker.parse(s);
}
}
class SemanticCheck {
List<String> names;
public SemanticCheck() {
this.names = new ArrayList<String>();
}
public boolean check(String type, String variableName, String exp) {
System.out.println("Type: " + type + " variableName: " + variableName + " exp: " + exp);
if(names.contains(variableName)) {
System.out.println("Remove statement! Already defined!");
return true;
}
names.add(variableName);
return false;
}
}
Thanks in advance!
I figured out my problem and it turns out I needed to build an AST first before I can do anything. This would help in understanding what is a flat tree look like vs building an AST.
How to output the AST built using ANTLR?
Thanks to Bart's endless examples here in StackOverFlow, I was able to do semantic predicates to do what I needed in the example above.
Below is the updated code:
Test.g
grammar Test;
options {
language = Java;
output = AST;
}
tokens {
VARDECL;
Assign = '=';
EqT = '==';
NEq = '!=';
LT = '<';
LTEq = '<=';
GT = '>';
GTEq = '>=';
NOT = '!';
PLUS = '+';
MINUS = '-';
MULT = '*';
DIV = '/';
}
parse : varDeclare+
;
varDeclare : type id equalExp ';' -> ^(VARDECL type id equalExp)
;
equalExp : (Assign^ (expression | '...' ))
;
expression : binaryExpression
;
binaryExpression : addingExpression ((EqT|NEq|LTEq|GTEq|LT|GT)^ addingExpression)*
;
addingExpression : multiplyingExpression ((PLUS|MINUS)^ multiplyingExpression)*
;
multiplyingExpression : unaryExpression
((MULT|DIV)^ unaryExpression)*
;
unaryExpression: ((NOT|MINUS))^ primitiveElement
| primitiveElement
;
primitiveElement : literalExpression
| id
| '(' expression ')' -> expression
;
literalExpression : INT
;
id : IDENTIFIER
;
type : 'int'
| 'float'
;
// L E X I C A L R U L E S
INT : DIGITS ;
IDENTIFIER : LETTER (LETTER | DIGIT)*;
WS : ( ' '
| '\t'
| '\r'
| '\n'
) {$channel=HIDDEN;}
;
fragment LETTER : ('a'..'z' | 'A'..'Z' | '_') ;
fragment DIGITS: DIGIT+;
fragment DIGIT : '0'..'9';
This should automatically build an AST whenever you have varDeclare. Now on to the tree grammar/walker.
TestTree.g
tree grammar TestTree;
options {
language = Java;
tokenVocab = Test;
ASTLabelType = CommonTree;
output = AST;
}
tokens {
REMOVED;
}
#members {
SemanticCheck s;
public TestTree(TreeNodeStream input, SemanticCheck s) {
this(input);
this.s = s;
}
}
start[SemanticCheck s] : varDeclare+
;
varDeclare : ^(VARDECL type id equalExp)
-> {s.check($type.text, $id.text, $equalExp.text)}? REMOVED
-> ^(VARDECL type id equalExp)
;
equalExp : ^(Assign expression)
| ^(Assign '...')
;
expression : ^(('!') expression)
| ^(('+'|'-'|'*'|'/') expression expression*)
| ^(('=='|'<='|'<'|'>='|'>'|'!=') expression expression*)
| literalExpression
;
literalExpression : INT
| id
;
id : IDENTIFIER
;
type : 'int'
| 'float'
;
Now on to test it:
Test.java
import java.util.ArrayList;
import java.util.List;
import org.antlr.runtime.ANTLRStringStream;
import org.antlr.runtime.CommonTokenStream;
import org.antlr.runtime.tree.*;
public class Test {
public static void main(String[] args) throws Exception {
SemanticCheck s = new SemanticCheck();
String src =
"float x = 10; \n" +
"int x = 1; \n";
TestLexer lexer = new TestLexer(new ANTLRStringStream(src));
CommonTokenStream tokenStream = new CommonTokenStream(lexer);
TestParser parser = new TestParser(tokenStream);
TestParser.parse_return r = parser.parse();
System.out.println("Tree:" + ((Tree)r.tree).toStringTree() + "\n");
CommonTreeNodeStream nodes = new CommonTreeNodeStream((Tree)r.tree);
nodes.setTokenStream(tokenStream);
TestTree walker = new TestTree(nodes, s);
TestTree.start_return r2 = walker.start(s);
System.out.println("\nTree Walker: "+((Tree)r2.tree).toStringTree());
}
}
class SemanticCheck {
List<String> names;
public SemanticCheck() {
this.names = new ArrayList<String>();
}
public boolean check(String type, String variableName, String exp) {
System.out.println("Type: " + type + " variableName: " + variableName + " exp: " + exp);
if(names.contains(variableName)) {
return true;
}
names.add(variableName);
return false;
}
}
Output:
Tree:(VARDECL float x (= 10)) (VARDECL int x (= 1))
Type: float variableName: x exp: = 10
Type: int variableName: x exp: = 1
Tree Walker: (VARDECL float x (= 10)) REMOVED
Hope this helps! Please feel free to point any errors if I did something wrong.

Antlr setText not working in the way I expected

I have a requirement to convert an identifier into a beanutil string for retrieving an item from an object. The the identifiers to string conversions look like:
name ==> name
attribute.name ==> attributes(name)[0].value
attribute.name[2] ==> attributes(name)[2].value
address.attribute.postalcode ==> contactDetails.addresses[0].attributes(postalcode)[0].value
address[2].attribute.postalcode ==> contactDetails.addresses[2].attributes(postalcode)[0].value
address[2].attribute.postalcode[3] ==> contactDetails.addresses[2].attributes(postalcode)[3].value
Now I have decided to do this using antlr as I feel its probably going to be just as quick as using a set of 'if' statements. Feel free to tell me I'm wrong.
Right now, I've got this partial working using antlr, however once I start doing the 'address' ones, the setText part seems to stop working for Attribute.
Am I doing this the correct way or is there a better way of using antlr to get the result I want?
grammar AttributeParser;
parse returns [ String result ]
: Address EOF { $result = $Address.text; }
| Attribute EOF { $result = $Attribute.text; }
| Varname EOF { $result = $Varname.text; }
;
Address
: 'address' (Arraypos)* '.' Attribute { setText("contactDetails.addresses" + ($Arraypos == null ? "[0]" : $Arraypos.text ) + "." + $Attribute.text); }
;
Attribute
: 'attribute.' Varname (Arraypos)* { setText("attributes(" + $Varname.text + ")" + ($Arraypos == null ? "[0]" : $Arraypos.text ) + ".value"); }
;
Arraypos
: '[' Number+ ']'
;
Varname
: ('a'..'z'|'A'..'Z')+
;
Number
: '0'..'9'+
;
Spaces
: (' ' | '\t' | '\r' | '\n')+ { setText(" "); }
;
Below are two unit tests, the first returns what I expect, the second doesn't.
#Test
public void testSimpleAttributeWithArrayRef() throws Exception {
String source = "attribute.name[2]";
ANTLRStringStream in = new ANTLRStringStream(source);
AttributeParserLexer lexer = new AttributeParserLexer(in);
CommonTokenStream tokens = new CommonTokenStream(lexer);
AttributeParserParser parser = new AttributeParserParser(tokens);
String result = parser.parse();
assertEquals("attributes(name)[2].value", result);
}
#Test
public void testAddress() throws Exception {
String source = "address.attribute.postalcode";
ANTLRStringStream in = new ANTLRStringStream(source);
AttributeParserLexer lexer = new AttributeParserLexer(in);
CommonTokenStream tokens = new CommonTokenStream(lexer);
AttributeParserParser parser = new AttributeParserParser(tokens);
String result = parser.parse();
System.out.println("Result: " + result);
assertEquals("contactDetails.addresses[0].attributes(postalcode)[0].value", result);
}
No, you can't do (Arraypos)* and then refer to the contents as this: $Arraypos.text.
I wouldn't go changing the inner text of the tokens, but create a couple of parser rules and let them return the appropriate text.
A little demo:
grammar AttributeParser;
parse returns [String s]
: input EOF {$s = $input.s;}
;
input returns [String s]
: address {$s = $address.s;}
| attribute {$s = $attribute.s;}
| Varname {$s = $Varname.text;}
;
address returns [String s]
: Address arrayPos '.' attribute
{$s = "contactDetails.addresses" + $arrayPos.s + "." + $attribute.s;}
;
attribute returns [String s]
: Attribute '.' Varname arrayPos
{$s = "attributes(" + $Varname.text + ")" + $arrayPos.s + ".value" ;}
;
arrayPos returns [String s]
: Arraypos {$s = $Arraypos.text;}
| /* nothing */ {$s = "[0]";}
;
Attribute : 'attribute';
Address : 'address';
Arraypos : '[' '0'..'9'+ ']';
Varname : ('a'..'z' | 'A'..'Z')+;
which can be tested with:
import org.antlr.runtime.*;
public class Main {
public static void main(String[] args) throws Exception {
String[][] tests = {
{"name", "name"},
{"attribute.name", "attributes(name)[0].value"},
{"attribute.name[2]", "attributes(name)[2].value"},
{"address.attribute.postalcode", "contactDetails.addresses[0].attributes(postalcode)[0].value"},
{"address[2].attribute.postalcode", "contactDetails.addresses[2].attributes(postalcode)[0].value"},
{"address[2].attribute.postalcode[3]", "contactDetails.addresses[2].attributes(postalcode)[3].value"}
};
for(String[] test : tests) {
String input = test[0];
String expected = test[1];
AttributeParserLexer lexer = new AttributeParserLexer(new ANTLRStringStream(input));
AttributeParserParser parser = new AttributeParserParser(new CommonTokenStream(lexer));
String output = parser.parse();
if(!output.equals(expected)) {
throw new RuntimeException(output + " != " + expected);
}
System.out.printf("in = %s\nout = %s\n\n", input, output, expected);
}
}
}
And to run the demo do:
java -cp antlr-3.3.jar org.antlr.Tool AttributeParser.g
javac -cp antlr-3.3.jar *.java
java -cp .:antlr-3.3.jar Main
which will print the following to the console:
in = name
out = name
in = attribute.name
out = attributes(name)[0].value
in = attribute.name[2]
out = attributes(name)[2].value
in = address.attribute.postalcode
out = contactDetails.addresses[0].attributes(postalcode)[0].value
in = address[2].attribute.postalcode
out = contactDetails.addresses[2].attributes(postalcode)[0].value
in = address[2].attribute.postalcode[3]
out = contactDetails.addresses[2].attributes(postalcode)[3].value
EDIT
Note that you can also let parser rules return more than just one object like this:
bar
: foo {System.out.println($foo.text + ", " + $foo.number);}
;
foo returns [String text, int number]
: 'FOO' {$text = "a"; $number = 1;}
| 'foo' {$text = "b"; $number = 2;}
;

Generating simple AST in ANTLR

I'm playing a bit around with ANTLR, and wish to create a function like this:
MOVE x y z pitch roll
That produces the following AST:
MOVE
|---x
|---y
|---z
|---pitch
|---roll
So far I've tried without luck, and I keep getting the AST to have the parameters as siblings, rather than children.
Code so far:
C#:
class Program
{
const string CRLF = "\r\n";
static void Main(string[] args)
{
string filename = "Script.txt";
var reader = new StreamReader(filename);
var input = new ANTLRReaderStream(reader);
var lexer = new ScorBotScriptLexer(input);
var tokens = new CommonTokenStream(lexer);
var parser = new ScorBotScriptParser(tokens);
var result = parser.program();
var tree = result.Tree as CommonTree;
Print(tree, "");
Console.Read();
}
static void Print(CommonTree tree, string indent)
{
Console.WriteLine(indent + tree.ToString());
if (tree.Children != null)
{
indent += "\t";
foreach (var child in tree.Children)
{
var childTree = child as CommonTree;
if (childTree.Text != CRLF)
{
Print(childTree, indent);
}
}
}
}
ANTLR:
grammar ScorBotScript;
options
{
language = 'CSharp2';
output = AST;
ASTLabelType = CommonTree;
backtrack = true;
memoize = true;
}
#parser::namespace { RSD.Scripting }
#lexer::namespace { RSD.Scripting }
program
: (robotInstruction CRLF)*
;
robotInstruction
: moveCoordinatesInstruction
;
/**
* MOVE X Y Z PITCH ROLL
*/
moveCoordinatesInstruction
: 'MOVE' x=INT y=INT z=INT pitch=INT roll=INT
;
INT : '-'? ( '0'..'9' )*
;
COMMENT
: '//' ~( CR | LF )* CR? LF { $channel = HIDDEN; }
;
WS
: ( ' ' | TAB | CR | LF ) { $channel = HIDDEN; }
;
ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')*
;
STRING
: '"' ( ESC_SEQ | ~('\\'|'"') )* '"'
;
fragment
ESC_SEQ
: '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\')
;
fragment TAB
: '\t'
;
fragment CR
: '\r'
;
fragment LF
: '\n'
;
CRLF
: (CR ? LF) => CR ? LF
| CR
;
parse
: ID
| INT
| COMMENT
| STRING
| WS
;
I'm a beginner with ANTLR myself, this confused me too.
I think if you want to create a tree from your grammar that has structure, you augment your grammar with hints using the ^ and ! characters. This examples page shows how.
From the linked page:
By default ANTLR creates trees as
"sibling lists".
The grammar must be annotated to with
tree commands to produce a parser that
creates trees in the correct shape
(that is, operators at the root, which
operands as children). A somewhat more
complicated expression parser can be
seen here and downloaded in tar form
here. Note that grammar terminals
which should be at the root of a
sub-tree are annotated with ^.

ANTLR: multiplication omiting '*' symbol

I'm trying to create a grammar for multiplying and dividing numbers in which the '*' symbol does not need to be included. I need it to output an AST. So for input like this:
1 2 / 3 4
I want the AST to be
(* (/ (* 1 2) 3) 4)
I've hit upon the following, which uses java code to create the appropriate nodes:
grammar TestProd;
options {
output = AST;
}
tokens {
PROD;
}
DIV : '/';
multExpr: (INTEGER -> INTEGER)
( {div = null;}
div=DIV? b=INTEGER
->
^({$div == null ? (Object)adaptor.create(PROD, "*") : (Object)adaptor.create(DIV, "/")}
$multExpr $b))*
;
INTEGER: ('0' | '1'..'9' '0'..'9'*);
WHITESPACE: (' ' | '\t')+ { $channel = HIDDEN; };
This works. But is there a better/simpler way?
Here's a way:
grammar Test;
options {
backtrack=true;
output=AST;
}
tokens {
MUL;
DIV;
}
parse
: expr* EOF
;
expr
: (atom -> atom)
( '/' a=atom -> ^(DIV $expr $a)
| a=atom -> ^(MUL $expr $a)
)*
;
atom
: Number
| '(' expr ')' -> expr
;
Number
: '0'..'9'+
;
Space
: (' ' | '\t' | '\r' | '\n') {skip();}
;
Tested with:
import org.antlr.runtime.*;
import org.antlr.runtime.tree.Tree;
public class Main {
public static void main(String[] args) throws Exception {
String source = "1 2 / 3 4";
ANTLRStringStream in = new ANTLRStringStream(source);
TestLexer lexer = new TestLexer(in);
CommonTokenStream tokens = new CommonTokenStream(lexer);
TestParser parser = new TestParser(tokens);
TestParser.parse_return result = parser.parse();
Tree tree = (Tree)result.getTree();
System.out.println(tree.toStringTree());
}
}
produced:
(MUL (DIV (MUL 1 2) 3) 4)