How to program Lex and Yacc to parse a partial file

Let me explain with an example.
Suppose the contents of a text file are as follows:
function fun1 {
int a, b, c;
function fun2 {
int d, e;
char f g;
function fun3 {
int h, i;
}
}
In the above text file, the number of opening braces does not match the number of closing braces, so the file as a whole doesn't follow the syntax. However, the partial functions fun2 and fun3 do follow the syntax. Typically the text file is very large.
If the user wants to parse the entire file, i.e. function fun1, then the program should report an error because the braces don't match. However, if the user wants to parse only part of the file, i.e. function fun2 or fun3, then the program shouldn't report an error, because the braces match.
My question:
Is there a way to make Lex and Yacc load only part of a file? If so, how should it be done?

Are you using bison/flex or plain old yacc/lex?
It's been a long time since I played with yacc.
The technical answer is different for each pair of tools.
With flex you'll have to deal with its buffer mechanism (for example yy_scan_string or yy_scan_bytes).
The final code will be cleaner.
With lex you'll have to do it all by hand.
At the very least you have to redefine the input and unput macros.
You can also try playing with yyin and fseek.
On the parser side you'll have to deal with error recovery (the yyerrok macro and the error token):
http://dinosaur.compilertools.net/bison/bison_9.html#SEC81
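One way to approach the "load only part of the file" step is to find the slice you care about before the scanner ever sees it. Below is a minimal sketch in C, not a definitive implementation: find_block is a hypothetical helper (it is not part of lex, yacc, flex or bison) that locates "function <name> { ... }" by counting braces. It assumes the whole text is already in memory as a NUL-terminated string and that braces never appear inside strings or comments.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: locate "function <name> { ... }" inside src.
 * On success, *start is the offset of the keyword and *end is the offset
 * one past the matching '}', and 0 is returned. Returns -1 if the function
 * is not found or its braces never balance before the end of the input.
 * Assumption: braces do not occur inside strings or comments. */
int find_block(const char *src, const char *name, size_t *start, size_t *end)
{
    char key[128];
    int klen = snprintf(key, sizeof key, "function %s", name);
    if (klen < 0 || (size_t)klen >= sizeof key)
        return -1;

    /* Find a whole-name match, so "fun1" does not match "fun10". */
    const char *p = src;
    while ((p = strstr(p, key)) != NULL) {
        unsigned char next = (unsigned char)p[klen];
        if (!isalnum(next) && next != '_')
            break;
        p += klen;
    }
    if (p == NULL)
        return -1;

    const char *q = strchr(p + klen, '{');
    if (q == NULL)
        return -1;

    /* Walk forward, tracking nesting depth, until the block closes. */
    int depth = 0;
    for (; *q != '\0'; ++q) {
        if (*q == '{') {
            ++depth;
        } else if (*q == '}' && --depth == 0) {
            *start = (size_t)(p - src);
            *end = (size_t)(q - src) + 1;
            return 0;
        }
    }
    return -1; /* ran out of input with braces still open */
}
```

On the sample file above this fails for fun1 and succeeds for fun2 and fun3. With flex, the resulting slice (src + start, end - start bytes) could then be handed to yy_scan_bytes before calling yyparse. With plain lex you would have to feed the same range through your own input macro.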

Related

Why can't I use yytext inside yyerror (Yacc)

I'm having some trouble with my analyzer. I'm trying to use yytext inside my yyerror, but it gives me an error. Can you help me?
You can't use yytext in your parser because it is defined by the lexer.
Indeed, you normally shouldn't use yytext in your parser because its value is not meaningful to the parse. Your attempt to use it to provide context in error messages is just about the only reasonable use, and even then there is a certain ambiguity because you can't tell whether the erroneous token is the one currently in yytext or the previous token, which was overwritten when the parser obtained its lookahead token.
In any case, if you want to refer to yytext inside your parser, you'll need to declare it, which will normally require putting
extern char* yytext;
into your bison grammar file. Since the only place you can reasonably use yytext is yyerror, you might change the definition of that function to:
void yyerror(const char* msg) {
    extern char* yytext;
    fprintf(stderr, "%s at line %d near '%s'\n", msg, nLineas, yytext);
}
Note that you can get flex to track line numbers automatically, so you don't need to track your own nLineas variable. Just add
%option yylineno
at the top of your flex file, and the global variable yylineno will automatically be maintained during lexical analysis. If you want to use yylineno in your parser, you'll need to add an extern declaration for it as well:
extern int yylineno;
Again, using yylineno in the parser may be imprecise because it might refer to the line number of the token following the error, which might be on a different line from the error (and might even be separated from the error by many lines of comments).
As an alternative to using external declarations of yytext and yylineno, you are free to put the implementation of yyerror inside the scanner definition instead of the grammar definition. Your grammar file should already have a forward declaration of yyerror, so it doesn't matter which file it's placed in. If you put it into the scanner file, global scanner variables will already be declared.

How to parse a run-length encoded binary subformat with ANTLR

Given the following input:
AA:4:2:#5#xxAAx:2:a:
The part #5# defines the start of a binary subformat with a length of 5. The subformat can contain any kind of character and is likely to contain tokens from the main format (e.g. AA is a keyword/token inside the main format).
I want to build a lexer that is able to extract one token for the whole binary part.
I already tried several approaches (e.g. partials, semantic predicates), but I could not get them to work together the right way.
I finally found the solution myself.
Below are the relevant parts of the lexer definition
@members {
public int _binLength;
}
BINARYHEAD: '#' [0-9]+ '#' { _binLength = Integer.parseInt(getText().substring(1,getText().length()-1)); } -> pushMode(RAW) ;
mode RAW;
BINARY: .+ {getText().length() <= _binLength}? -> popMode;
The solution is based on an extra field that is set while parsing the length definition of the binary field. Afterward, a semantic predicate is used to restrict the validity of the binary content to the size of that field.
Any suggestion to simplify the parseInt call is welcome.
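For comparison, the same "read the length, then take exactly that many raw characters" idea can be written by hand. This is only a sketch in C of the technique, not the ANTLR solution: scan_binary is a hypothetical helper, and it assumes the input is a NUL-terminated string (so the payload itself cannot contain a NUL byte).

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: if s starts with a "#<digits>#" header, set *payload
 * to the first raw byte after the header and *len to the declared length,
 * and return the total number of bytes consumed (header plus payload).
 * Returns 0 if there is no well-formed header or the payload is truncated.
 * Assumption: s is NUL-terminated, so the payload cannot contain '\0'. */
size_t scan_binary(const char *s, const char **payload, size_t *len)
{
    if (s[0] != '#' || !isdigit((unsigned char)s[1]))
        return 0;

    /* Parse the decimal length between the two '#' characters. */
    size_t i = 1, n = 0;
    while (isdigit((unsigned char)s[i])) {
        n = n * 10 + (size_t)(s[i] - '0');
        ++i;
    }
    if (s[i] != '#')
        return 0;
    ++i; /* skip the closing '#' */

    /* The payload is taken verbatim; only check that it is all there. */
    for (size_t k = 0; k < n; ++k)
        if (s[i + k] == '\0')
            return 0;

    *payload = s + i;
    *len = n;
    return i + n;
}
```

Applied at the '#' in AA:4:2:#5#xxAAx:2:a: this consumes 8 characters and yields the 5-character payload xxAAx as one unit, so the AA inside it is never seen as a keyword.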

yacc lex when parsing CNC GCODES

I have to parse motion control programs (CNC machines, GCODE)
It is GCODE plus similar looking code specific to hardware.
There are lots of commands that consist of a single letter and number, example:
C100Z0.5C100Z-0.5
C80Z0.5C80Z-0.5
So part of my (abbreviated) lex (racc & rex actually) looks like:
A {[:A,text]}
B {[:B,text]}
...
Z {[:Z,text]}
Then I found a command that takes ANY letter as an argument, and in racc I started typing:
letter : A
| B
| C
......
Then I stopped. I haven't used yacc in 30 years; is there some kind of shortcut for the above? Have I gone horribly off course?
It is not clear what you are trying to accomplish. If you want to create a Yacc rule that covers all letters, you could create a token for that:
%token letter_token
In lex you would match each letter with a regular expression and simply return letter_token:
[A-Z] {
    return letter_token;
}
Now you can use letter_token in Yacc rules:
letter : letter_token
Also, you haven't said what language you're using. If you need to, you can get the specific character that was matched as letter_token by defining a union:
%union {
char c;
}
%token <c> letter_token
Let's say you want to read single characters. The Lex part that assigns the character to the token would be:
[A-Z] {
yylval.c = *yytext;
return letter_token;
}
Feel free to ask any further questions, and read more about how to create a Minimal, Complete, and Verifiable example.
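To show what the single letter token buys you, here is a rough sketch in plain C of the same idea outside lex: one generic "letter" case that carries the actual character along (the role yylval.c plays above), followed by its number. The names struct word and next_word are hypothetical. In a real scanner the number would normally be its own token.

```c
#include <ctype.h>

/* Hypothetical type: one G-code word such as "Z-0.5". */
struct word {
    char letter;  /* the address letter, what yylval.c would carry */
    double value; /* the number that follows the letter */
};

/* Read one word at *cursor and advance the cursor past it.
 * Returns 1 on success, 0 at end of input or on a malformed word. */
int next_word(const char **cursor, struct word *out)
{
    const char *p = *cursor;
    if (!isupper((unsigned char)*p))
        return 0;
    char letter = *p++; /* any letter A-Z, no per-letter rule needed */

    double sign = 1.0;
    if (*p == '-' || *p == '+') {
        if (*p == '-')
            sign = -1.0;
        ++p;
    }

    /* Plain decimal only: digits, optionally a '.', then more digits. */
    double value = 0.0;
    int digits = 0;
    for (; isdigit((unsigned char)*p); ++p, ++digits)
        value = value * 10.0 + (*p - '0');
    if (*p == '.') {
        double scale = 0.1;
        for (++p; isdigit((unsigned char)*p); ++p, ++digits) {
            value += (*p - '0') * scale;
            scale *= 0.1;
        }
    }
    if (digits == 0)
        return 0; /* a letter with no number after it */

    out->letter = letter;
    out->value = sign * value;
    *cursor = p;
    return 1;
}
```

Calling next_word repeatedly on C100Z0.5C100Z-0.5 yields four words (C, Z, C, Z) and then stops at the end of the string. The number is parsed by hand rather than with strtod because strtod would also accept exponents and hex prefixes, which could swallow a following address letter.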

How can you implement this multiline string literal macro in Swift?

In my Objective-C code for my GPUImage framework, I have the following macro:
#define STRINGIZE(x) #x
#define STRINGIZE2(x) STRINGIZE(x)
#define SHADER_STRING(text) @ STRINGIZE2(text)
which allows me to inline multiline vertex and fragment shaders as NSString literals within my custom filter subclasses, like this:
NSString *const kGPUImagePassthroughFragmentShaderString = SHADER_STRING
(
varying highp vec2 textureCoordinate;
uniform sampler2D inputImageTexture;
void main()
{
gl_FragColor = texture2D(inputImageTexture, textureCoordinate);
}
);
GPUImage needs this in order to provide formatted vertex and fragment shaders that are included in the body text of filter subclasses. Shipping them as separate files would make the framework unable to be compiled into a static library. Using the above macro, I can make these shaders able to be copied and pasted between the framework code and external shader files without a ridiculous amount of reformatting work.
Swift does away with compiler macros, and the documentation has this to say:
Complex macros are used in C and Objective-C but have no counterpart
in Swift. Complex macros are macros that do not define constants,
including parenthesized, function-like macros. You use complex macros
in C and Objective-C to avoid type-checking constraints or to avoid
retyping large amounts of boilerplate code. However, macros can make
debugging and refactoring difficult. In Swift, you can use functions
and generics to achieve the same results without any compromises.
Therefore, the complex macros that are in C and Objective-C source
files are not made available to your Swift code.
Per the line "In Swift, you can use functions and generics to achieve the same results without any compromises", is there a way in Swift to provide multiline string literals without resorting to a string of concatenation operations?
Alas Swift multiline strings are still not available, as far as I know. However when doing some research regarding this, I found a workaround which could be useful. It is a combination of these items:
A Quick Hack to Quote Swift Strings in a Playground - describing how to make a service that replaces and fixes text
The comment by pyrtsa, regarding using "\n".join(...) to emulate the multiline strings
Set up an automated service
Using Automator you could set up an extra service with the following properties:
A single action of "Run Shell Script"
Tick the "Output replaces selected text" option
Change shell to /usr/bin/perl
Add the code excerpt below to the action window
Save as something like "Replace with quoted swift multiline join"
Code excerpt
print "\"\\n\".join([\n"; # Start a join operation
# For each line, reformat and print
while(<>) {
    print " "; # A little indentation
    chomp; # Lose the newline
    s/([\\\"])/\\$1/g; # Replace \ and " with escaped variants
    print "\"$_\""; # Add quotes around the line
    print "," unless eof; # Add a comma, unless it is the last line
    print "\n"; # End the line, preserving original line count
}
print " ])"; # Close the join operation
You are of course free to use whatever shell and code you want; I chose Perl because it is familiar to me. Here are some comments:
I used the "\n".join(...) version to create the multiline string; you could use the extension answer from Swift - Split string over multiple lines, or even the + variant. I'll leave that as an exercise for the user
I opted for a little indentation with spaces, and to replace the \ and " to make it a little sturdier
Comments are of course optional, and you could probably shorten the code somewhat. I tried to opt for clarity and readability
The code, as is, preserves spaces, but it could be edited if that is not wanted. Also left as an exercise for the user
Usage of service
Open up your playground or code editor, and insert/write some multiline text:
Mark the text block
Execute Xcode (or similar) > Services > Replace with quoted swift multiline join
You now have a multiline string in proper Swift code. Here is an example of the before and after text:
Here is my multiline text
example with both a " and
a \ within the text
"\n".join([
"Here is my multiline text ",
"example with both a \" and",
"a \\ within the text"
])
It looks like your end goal is to avoid including standalone shader files?
If so, one technique would be to write a quick command line utility that generates a .swift file of string constants representing the shader functions in a certain folder.
Include the resulting .swift file in your project and you have no runtime penalty, and even easier debugging if you generate the code nicely.
It would probably take less than an hour, and you would never need macros for shaders again.
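The core of such a generator is just escaping the shader text into a string literal. Here is a minimal sketch of that one step in C, under stated assumptions: swift_quote is a hypothetical helper, and it relies on Swift string literals accepting the \n, \" and \\ escapes. Reading the shader files and writing the surrounding "let name = ..." lines is left out.

```c
#include <stddef.h>

/* Hypothetical helper: write src into dst as one quoted string literal,
 * escaping newlines, double quotes and backslashes.
 * Returns the number of characters written (excluding the NUL),
 * or -1 if dst (capacity cap) is too small. */
long swift_quote(const char *src, char *dst, size_t cap)
{
    size_t n = 0;
    if (cap < 3)
        return -1; /* need room for the two quotes and the NUL */
    dst[n++] = '"';
    for (; *src != '\0'; ++src) {
        /* Worst case: two characters now, plus closing quote and NUL. */
        if (n + 4 > cap)
            return -1;
        switch (*src) {
        case '\n': dst[n++] = '\\'; dst[n++] = 'n';  break;
        case '"':  dst[n++] = '\\'; dst[n++] = '"';  break;
        case '\\': dst[n++] = '\\'; dst[n++] = '\\'; break;
        default:   dst[n++] = *src;                  break;
        }
    }
    dst[n++] = '"';
    dst[n] = '\0';
    return (long)n;
}
```

The generator would call this once per shader file and print the result after something like "let kPassthroughFragmentShader = " (a made-up constant name), so the shader source itself can stay in a plain, copy-pasteable file.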

How to use { } curly braces in a JavaScript function to be generated by an RPG-CGI program

How do I write an RPG-CGI program to generate an HTML page that contains a JavaScript program with a function such as function xxx() { aaaaaaaaaaaa; ssssssssss; }? When the braces are written using a hex code constant, they are changed to some other symbol in the actual HTML code in the browser.
Does the EBCDIC character set contain the { , }, [ , ] and ! symbols? If not, how can they be used in an AS/400 RPG-CGI program?
You are most likely running into a code page conversion issue, which in brief means that the AS/400 does not produce the characters the recipient expects. Try running in code page 819, which is ISO-Latin-1.
Another option may be to look into using CGIDEV2 though I would try Thorbjørn's option first.