What is ANTLR3 error recovery method? - error-handling

This seems to be a theoretical question.
As far as I know, ANTLR3 handles errors itself using its recover(###) method. I want to know which strategy ANTLR3 uses for error recovery (e.g. panic mode, phrase level, etc.). Can someone help me figure this out?
It would be nice if someone could show me the declaration of its recover method, if my first guess is correct. Thank you.

Quote:
ANTLR’s error recovery mechanism is based upon Niklaus Wirth’s early ideas in Algorithms + Data Structures = Programs [1] (as well as Rodney Topor’s A Note on Error Recovery in Recursive Descent Parsers [2]) but also includes Josef Grosch’s good ideas from his CoCo parser generator (Efficient and Comfortable Error Recovery in Recursive Descent Parsers [3]). Essentially, recognizers perform single-symbol insertion and deletion upon mismatched symbol errors (as described in a moment) if possible. If not, recognizers gobble up symbols until the lookahead is a member of the resynchronization set and then exit the rule. The resynchronization set is the set of input symbols that can legally follow references to the current rule and references to any invoking rules up the call chain. Similarly, if the recognizer cannot choose any of the alternatives from the start of a rule, the recognizer again uses the gobble-and-exit strategy.
[...]
-- Terence Parr. The Definitive ANTLR Reference, 10.7 Automatic Error Recovery Strategy.
References
[1] Niklaus Wirth. Algorithms + Data Structures = Programs. Prentice Hall PTR, Upper Saddle River, NJ, USA, 1978.
[2] Rodney W. Topor. A note on error recovery in recursive descent parsers. SIGPLAN Not., 17(2):37–40, 1982.
[3] Josef Grosch. Efficient and comfortable error recovery in recursive descent parsers. Structured Programming, 11(3):129–140, 1990.
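To make the quoted strategy concrete, here is a minimal sketch in Python of the three moves it describes: single-token deletion, single-token insertion, and gobble-and-exit. It is not ANTLR's code. The toy grammar (stat : ID '=' INT ';'), the Recognizer class and the hard-coded follow sets are all assumptions made for illustration; ANTLR derives its follow sets from the grammar and the rule call chain.

```python
class Recognizer:
    """Toy recognizer for: program : stat* ; stat : ID '=' INT ';'"""

    def __init__(self, tokens):
        self.tokens = list(tokens) + ["EOF"]
        self.pos = 0
        self.errors = []

    def la(self, k=1):
        # k-th lookahead token, clamped to the trailing EOF.
        i = min(self.pos + k - 1, len(self.tokens) - 1)
        return self.tokens[i]

    def match(self, expected, follow):
        """Match one token; on mismatch try deletion, then insertion."""
        if self.la() == expected:
            self.pos += 1
            return
        self.errors.append(f"expected {expected}, found {self.la()}")
        if self.la(2) == expected:
            # Single-token deletion: skip the extra token, then match.
            self.pos += 2
            return
        if self.la() in follow:
            # Single-token insertion: carry on as if `expected` had been there.
            return
        raise SyntaxError(expected)

    def statement(self):
        try:
            self.match("ID", {"="})
            self.match("=", {"INT"})
            self.match("INT", {";"})
            self.match(";", {"ID", "EOF"})
        except SyntaxError:
            # Gobble-and-exit: drop tokens until one that may follow a stat.
            while self.la() not in {"ID", "EOF"}:
                self.pos += 1

    def program(self):
        while self.la() != "EOF":
            self.statement()
        return self.errors
```

For example, Recognizer("ID = INT INT ;".split()).program() reports one error and recovers by deleting the extra INT, while a statement such as "ID = = = ;" falls through to the gobble-and-exit branch and parsing resumes at the next ID.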

Related

Is this correct NFA graph?

Task: Build NFA from a given regular expression.
I decided to push some of my old programs to GitHub, specifically problems from the theory of formal languages. After testing the code I got this result, and I can't really tell whether it is a correct output or not. It kind of looks right, but not like something Thompson's algorithm would output. Also, those little loops look suspicious, although they basically do nothing.
Definitely wrong.
The epsilon-self-loops look to me like a bug in the handling of the union operator. There should be an epsilon transition from each end state in the union to a new end state, so my guess is that you have mixed up the epsilon links. I'm not sure how you end up with the correct epsilon transition on a in one case and b in the other, so perhaps the bug is more complicated.
You're right that in this case, there is no harm in the epsilon self-loop. But it is quite possible that the absence of an epsilon link from the end of the union leg to the union's end state will cause a problem with (a*|b) or (a|b*). One of those might actually turn out to recognize (a|b)+.
Also, your Kleene star implementation does not allow zero repetitions. What you have is (a|b)+, not (a|b)*, because there is no epsilon transition from the start state to the state of the star subconstruction.
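For comparison, here is a minimal sketch of how the union and star steps of Thompson's construction are usually wired, written in Python rather than the asker's language. The function names and the (start, end, transitions) tuple layout are my own assumptions, and I am guessing that the regular expression in question is (a|b)*. Note the epsilon edge from the star's new start state straight to its new end state, which is what allows zero repetitions.

```python
from itertools import count

_ids = count()  # fresh state numbers


def literal(ch):
    s, e = next(_ids), next(_ids)
    return s, e, [(s, ch, e)]


def union(a, b):
    s, e = next(_ids), next(_ids)
    (a_start, a_end, a_trans), (b_start, b_end, b_trans) = a, b
    # New start fans out to both legs; each leg's end joins the new end.
    eps = [(s, None, a_start), (s, None, b_start),
           (a_end, None, e), (b_end, None, e)]
    return s, e, a_trans + b_trans + eps


def star(a):
    s, e = next(_ids), next(_ids)
    a_start, a_end, a_trans = a
    # (s -> e) permits zero repetitions; (a_end -> a_start) permits repeats.
    eps = [(s, None, a_start), (s, None, e),
           (a_end, None, a_start), (a_end, None, e)]
    return s, e, a_trans + eps


def accepts(nfa, text):
    """Simulate the NFA by tracking the epsilon closure of reachable states."""
    start, end, trans = nfa

    def closure(states):
        stack, seen = list(states), set(states)
        while stack:
            q = stack.pop()
            for src, label, dst in trans:
                if src == q and label is None and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen

    current = closure({start})
    for ch in text:
        moved = {dst for src, label, dst in trans
                 if src in current and label == ch}
        current = closure(moved)
    return end in current


nfa = star(union(literal("a"), literal("b")))
```

With this wiring, accepts(nfa, "") is true. Dropping the (s, None, e) edge in star turns the language into (a|b)+, which is the symptom described above.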
My C# implementation of Brzozowski's algorithm for DFA minimization gives the DFA below. (0) is initial state, (2) and (3) are final states.

How to add a new syntax element in HM (HEVC test Model)

I've been working on the HM reference software for a while, to improve something in the intra prediction part. Now a new intra prediction algorithm is added to the code and I let the encoder choose between my algorithm and the default algorithm of HM (according to the RDCost of course).
What I need now, is to signal a flag for each PU, so that the decoder will be able to perform the same algorithm as the encoder decides in the rate distortion loop.
I want to know what exactly I should do to properly add this one-bit flag to the stream without breaking anything in the code.
Assuming that I want to use a CABAC context model to keep track of my flag's statistics, is there anything else I should do beyond the following?
1. Adding a new context model like ContextModel3DBuffer m_cCUIntraAlgorithmSCModel to the TEncSbac.h file.
2. Properly initializing the model (at both the encoder and the decoder side) by looking at how the HM initializes other context models.
3. Calling the functions m_pcBinIf->encodeBin(myFlag, cCUIntraAlgorithmSCModel) and m_pcTDecBinIf->decodeBin(myFlag, cCUIntraAlgorithmSCModel) at the encoder side and the decoder side, respectively.
I take these three steps but apparently it breaks something.
PS: Even an equiprobable signaling (i.e. without using CABAC contexts) will be useful. I just want to send this flag peacefully!
Thanks in advance.
I could solve this problem finally. It was a bug in the CABAC context initialization.
But I want to share this experience as many people may want to do the same thing.
The three steps that I explained are essentially what is needed to add a new syntax element, but one must be very careful with the following:
First, decide whether you want a separate context model for your syntax element or want to reuse an existing one. If you want a separate one, you should define a ContextModel3DBuffer, and the best way to do that is to find a similar syntax element in the code, then duplicate its ContextModel3DBuffer definition and ALL of its occurrences in the code. This way you can be sure you have covered everything.
Encoding of each syntax element happens in two different places: first in the RDO loop, to make a "decision", and second during the actual encoding phase, when the decisions are written to the stream (e.g. the encodeCtu function).
The order of encoding/decoding syntax elements must be the same at the encoder and decoder sides. For example, if your new syntax element is encoded after splitFlag and before predMode at the encoder side, you must decode it exactly between splitFlag and predMode at the decoder side.
The context model is implemented as a 3D matrix so that the statistics of syntax elements can be tracked separately for different block sizes, components, etc. This means that when you call the function encodeBin, you must make sure that the correct index is being used. I've made stupid mistakes in this part!
Apart from the above remarks, I found the function getState very useful for debugging. It returns the state of your CABAC context model at any place in the code where you have access to it. It is very useful to compare the state at the same place in the encoder and the decoder when you have a mismatch. For example, it happens a lot that you encode a 1 but decode a 0. In this case, you need to check the state of your CABAC context before encoding and before decoding. They should be the same. If they are not, trace the error back to find the first place where they differ.
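The "find the first place of mismatch" step is easy to automate once both sides dump their states. The sketch below is a small Python helper, not part of HM: it assumes you have added your own print statements so that the encoder and the decoder each write one (syntax element name, getState value) entry at matching points, and the log format and the function name first_mismatch are hypothetical.

```python
def first_mismatch(encoder_log, decoder_log):
    """Return the index of the first differing entry, or None if the logs agree."""
    for i, (enc, dec) in enumerate(zip(encoder_log, decoder_log)):
        if enc != dec:
            return i
    if len(encoder_log) != len(decoder_log):
        # One side logged more entries: the divergence starts where the shorter ends.
        return min(len(encoder_log), len(decoder_log))
    return None


# Made-up example entries: (syntax element, context state before coding the bin).
encoder_log = [("splitFlag", 17), ("intraAlgFlag", 30), ("predMode", 9)]
decoder_log = [("splitFlag", 17), ("intraAlgFlag", 28), ("predMode", 9)]
print(first_mismatch(encoder_log, decoder_log))
```

The index it returns points at the earliest entry worth inspecting; everything after it is likely to be a consequence rather than the cause.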
I hope it was helpful.

Evaluate Formal tools

What are all the factors one should consider in order to compare 3 formal verification tools?
E.g. Jaspergold, Onespin, Incisive.
From my limited research, Jaspergold comes out on top, but I want to evaluate them myself on a project.
I have noted down some points, such as:
1. Supported languages (VHDL, SV, Verilog, SVA, PSL, etc.)
2. GUI
3. Capability (how large a design they can handle)
4. Number of evaluation cycles
5. Performance (how fast they find a proof or a counterexample)
What other criteria could I add to this list?
Thanks!

text based RPG command interpreter

I was just playing a text based RPG and I got to wondering, how exactly were the command interpreters implemented and is there a better way to implement something similar now? It would be easy enough to make a ton of if statements, but that seems cumbersome, especially considering that for the most part "pick up the gold" is the same as "pick up gold", which has the same effect as "take gold". I'm sure this is a really in-depth question; I'd just like to know the general idea of how interpreters like that were implemented. Or if there's an open source game with a decent and representative interpreter, that would be perfect.
Answers can be language independent, but try to keep it in something reasonable, not prolog or golfscript or something. I'm not sure exactly what to tag this as.
The usual name for this sort of game is text adventure or interactive fiction, if it is single player, or MUD if it is multiplayer.
There are several special purpose programming languages for writing interactive fiction, such as Inform 6, Inform 7 (an entirely new language that compiles down to Inform 6), TADS, Hugo, and more.
Here's an example of a game in Inform 7, that has a room, an object in the room, and you can pick up, drop, and otherwise manipulate the object:
"Example Game" by Brian Campbell
The Alley is a room. "You are in a small, dark alley." A bronze key is in the
Alley. "A bronze key lies on the ground."
Produces when played:
Example Game
An Interactive Fiction by Brian Campbell
Release 1 / Serial number 100823 / Inform 7 build 6E59 (I6/v6.31 lib 6/12N) SD
Alley
You are in a small, dark alley.
A bronze key lies on the ground.
>take key
Taken.
>drop key
Dropped.
>take the key
Taken.
>drop key
Dropped.
>pick up the bronze key
Taken.
>put down the bronze key
Dropped.
>
For the multiplayer games, which tend to have simpler parsers than interactive fiction engines, you can check out a list of MUD servers.
If you would like to write your own parser, you can start by simply checking your input against regular expressions. For instance, in Ruby (as you didn't specify a language):
case input
when /(?:take|pick +up)(?: +(?:the|a))? +(.*)/
  # (.*) is the only capturing group, so the object name is in $1
  take_command(lookup_name($1))
when /(?:drop|put +down)(?: +(?:the|a))? +(.*)/
  drop_command(lookup_name($1))
end
You may discover that this becomes cumbersome after a while. You could simplify it somewhat using some shorthands to avoid repetition:
OPT_ART = "(?: +(?:the|a))?" # shorthand for an optional article
case input
when /(?:take|pick +up)#{OPT_ART} +(.*)/
  take_command(lookup_name($1))
when /(?:drop|put +down)#{OPT_ART} +(.*)/
  drop_command(lookup_name($1))
end
This may start to get slow if you have a lot of commands, because it checks the input against each command in sequence. You may also find that it still becomes hard to read, and involves some repetition that is difficult to simply extract into shorthands.
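One intermediate step, short of a full parser, is to replace the chain of patterns with a lookup table that maps verb synonyms to one canonical command. The sketch below is in Python rather than Ruby. The VERBS table, the ARTICLES set and the parse_command name are made up for illustration, and it assumes verbs are at most two words long.

```python
# Synonyms map to one canonical verb, so "pick up" and "take" share a handler.
VERBS = {
    "take": "take", "get": "take", "pick up": "take",
    "drop": "drop", "put down": "drop",
}
ARTICLES = {"the", "a", "an"}


def parse_command(line):
    """Return (canonical_verb, object_name), or None if not understood."""
    words = line.lower().split()
    for n in (2, 1):  # try two-word verbs such as "pick up" first
        verb = VERBS.get(" ".join(words[:n]))
        if verb and len(words) > n:
            obj = [w for w in words[n:] if w not in ARTICLES]
            if obj:
                return verb, " ".join(obj)
    return None


print(parse_command("pick up the gold"))
print(parse_command("take gold"))
```

Both calls yield the same ("take", "gold") pair, and adding a new synonym is a one-line change to the table. The dictionary lookup also avoids testing the input against every command in turn.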
At that point, you might want to look into lexers and parsers, a topic much too big for me to do justice to in a reply here. There are many lexer and parser generators, that given a description of a language, will produce a lexer or parser that is capable of parsing that language; check out the linked articles for some starting points.
As an example of how a parser generator would work, I'll give an example in Treetop, a Ruby based parser generator:
grammar Adventure
  rule command
    take / drop
  end

  rule take
    ('take' / 'pick' space 'up') article? space object {
      def command
        :take
      end
    }
  end

  rule drop
    ('drop' / 'put' space 'down') article? space object {
      def command
        :drop
      end
    }
  end

  rule space
    ' '+
  end

  rule article
    space ('a' / 'the')
  end

  rule object
    [a-zA-Z0-9 ]+
  end
end
Which can be used as follows:
require 'treetop'
Treetop.load 'adventure.tt'
parser = AdventureParser.new
tree = parser.parse('take the key')
tree.command # => :take
tree.object.text_value # => "key"
If by 'text based RPG' you are referring to Interactive Fiction, there are specific programming languages for this. My favorite (the only one I know ;P) is Inform: http://en.wikipedia.org/wiki/Inform
The rec.arts.int-fiction FAQ has further information: http://www.plover.net/~textfire/raiffaq/FAQ.htm

Quick divisibility check in ZX81 BASIC

Since many of the Project Euler problems require you to do a divisibility check for quite a number of times, I've been trying to figure out the fastest way to perform this task in ZX81 BASIC.
So far I've compared (N/D) to INT(N/D) to check whether N is divisible by D or not.
I have been thinking about doing the test in Z80 machine code, but I haven't yet figured out how to use the BASIC variables in the machine code.
How can it be achieved?
You can do this very fast in machine code by subtracting repeatedly. Basically you have a procedure like:
set accumulator to N
subtract D
if carry flag is set then it is not divisible
if zero flag is set then it is divisible
otherwise repeat subtraction until one of the above occurs
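The 8-bit version would be something like:
DIVISIBLE_TEST:
    LD B,10                   ; divisor D (operand byte is POKEd from BASIC)
    LD A,100                  ; dividend N (operand byte is POKEd from BASIC)
DIVISIBLE_TEST_LOOP:
    SUB B                     ; A = A - B
    JR C,END_DIVISIBLE_TEST   ; borrow: went below zero, so not divisible
    JR Z,END_DIVISIBLE_TEST   ; exactly zero: divisible
    JR DIVISIBLE_TEST_LOOP
END_DIVISIBLE_TEST:
    LD B,A                    ; USR returns BC, which is zero only if A = 0
    LD C,0
    RET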
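If you want to sanity-check the logic of the loop above before hand-assembling it, it can be modelled in a few lines of Python. This is only a sketch of the flag behaviour, assuming 8-bit unsigned operands; the function name is made up. It deliberately rejects zero operands, because in the routine as written N = 0 would be reported as not divisible (the first SUB borrows) and D = 0 with a non-zero N would never leave the loop.

```python
def divisible_8bit(n, d):
    """Model of the SUB/JR loop: True if n is divisible by d (both 1..255)."""
    if not (0 < d <= 255 and 0 < n <= 255):
        raise ValueError("operands must be non-zero and fit in 8 bits")
    a = n
    while True:
        carry = a < d        # SUB B sets the carry flag when a borrow occurs
        a = (a - d) & 0xFF   # 8-bit wraparound, as in the A register
        if carry:
            return False     # JR C: overshot, so not divisible
        if a == 0:
            return True      # JR Z: landed exactly on zero


print(divisible_8bit(100, 10), divisible_8bit(100, 7))
```

The number of iterations is about N/D, so the approach is at its slowest for small divisors.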
Now you can call this from BASIC using USR. USR returns whatever is in the BC register pair, so you would probably want to do something like:
REM poke the memory addresses with the operands to load the registers
POKE X+1, D
POKE X+3, N
LET r = USR X
IF r = 0 THEN GOTO isdivisible
IF r <> 0 THEN GOTO isnotdivisible
This is an introduction I wrote to Z80 which should help you figure this out. This will explain the flags if you're not familiar with them.
There's a load more links to good Z80 stuff from the main site although it is Spectrum rather than ZX81 focused.
A 16 bit version would be quite similar but using register pair operations. If you need to go beyond 16 bits it would get a bit more convoluted.
How you load this is up to you - but the traditional method is using DATA statements and POKEs. You may prefer to have an assembler figure out the machine code for you though!
Your existing solution may be good enough. Only replace it with something faster if you find it to be a bottleneck in profiling.
(Said with a straight face, of course.)
And anyway, on the ZX81 you can just switch to FAST mode.
I don't know whether RANDOMIZE USR is available on the ZX81, but I think it can be used to call routines written in assembly. To pass arguments you might need to use POKE to set some fixed memory locations before executing RANDOMIZE USR.
I remember finding a list of routines implemented in the ROM to support ZX BASIC. I'm sure there are a few that perform floating-point operations.
An alternative to floating point is to use fixed-point math. It's a lot faster in this kind of situation, where there is no math coprocessor.
You might also find more information in issues of Sinclair User. They published some articles related to programming the ZX Spectrum.
You should place the values in some pre-known memory locations, first. Then use the same locations from within Z80 assembler. There is no parameter passing between the two.
This is based on what I (still) remember of ZX Spectrum 48. Good luck, but you might consider upgrading your hw. ;/
The problem with Z80 machine code is that it has no floating point ops (and no integer divide or multiply, for that matter). Implementing your own FP library in Z80 assembler is not trivial. Of course, you can use the built-in BASIC routines, but then you may as well just stick with BASIC.