Mistake in list grammar example in Bjarne's programming book?

At the moment I am reading section 6.4.2 of Bjarne Stroustrup's book 'Programming: Principles and Practice Using C++'. In this section he demonstrates the following list grammar:
List:
"{" Sequence "}"
Sequence:
Element
Element "," Sequence
Element:
"A"
"B"
He says that the following are Lists according to the grammar above:
{ A }
{ B }
{ A,B }
{A,A,A,A,B }
Shouldn't Element "," Sequence be Sequence "," Element to make { A,B } and {A,A,A,A,B } correct Lists according to this grammar?
As I understand this grammar, A is a Sequence. That makes B the Element, right?

Element "," Sequence and Sequence "," Element are equivalent here.
Both A and B are Elements:
Element:
"A"
"B"
But every Element is also a valid Sequence:
Sequence:
Element
Basically,
Sequence:
Element "," Sequence
means "if you have an element, a comma, and a sequence, that forms another sequence". I.e. this rule lets you add elements at the beginning of a sequence to extend it.
Sequence:
Sequence "," Element
means "if you have a sequence, a comma, and an element, that forms another sequence". This rule lets you add elements at the end of a sequence to extend it.
In either case the end result is a list of (comma separated) elements.
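To make the equivalence concrete, here is a minimal Python sketch (not from the book) of two recognizers, one per form of the rule. The function names are made up for illustration, and the tokenizer simply assumes every non-space character is a token. The left-recursive rule is implemented as a loop, because a naive recursive function for it would call itself forever.

```python
def tokenize(text):
    # Assumption for this sketch: every non-whitespace character is one token.
    return [ch for ch in text if not ch.isspace()]


def is_element(token):
    # Element: "A" | "B"
    return token in ("A", "B")


def match_sequence_right(tokens, i):
    """Sequence: Element | Element "," Sequence (right-recursive).

    Returns the index just past the sequence, or None if none starts at i.
    """
    if i >= len(tokens) or not is_element(tokens[i]):
        return None
    i += 1
    if i < len(tokens) and tokens[i] == ",":
        rest = match_sequence_right(tokens, i + 1)
        if rest is not None:
            return rest
    return i  # fall back to the single-Element alternative


def match_sequence_left(tokens, i):
    """Sequence: Element | Sequence "," Element (left-recursive, as a loop)."""
    if i >= len(tokens) or not is_element(tokens[i]):
        return None
    i += 1
    # Keep extending the sequence at its end with '"," Element'.
    while i + 1 < len(tokens) and tokens[i] == "," and is_element(tokens[i + 1]):
        i += 2
    return i


def is_list(text, match_sequence):
    """List: "{" Sequence "}" using the given Sequence recognizer."""
    tokens = tokenize(text)
    if not tokens or tokens[0] != "{":
        return False
    end = match_sequence(tokens, 1)
    return end is not None and end == len(tokens) - 1 and tokens[end] == "}"


for sample in ["{ A }", "{ B }", "{ A,B }", "{A,A,A,A,B }", "{ A, }", "{ }"]:
    print(sample, is_list(sample, match_sequence_right), is_list(sample, match_sequence_left))
```

Both recognizers should agree on every input, which is the sense in which the two rules are interchangeable here.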

Left and right recursion generate the same lists here: in the end all occurrences of Sequence must be replaced by Element, so the order in which the rule is expanded won't matter.
{ A , B }:
List
{ Sequence }
{ Element , Sequence }
{ Element , Element }
{ A , B}
{ A , A , A , A , B }:
List
{ Sequence }
{ Element , Sequence }
{ Element , Element , Sequence }
{ Element , Element , Element , Sequence }
{ Element , Element , Element , Element , Sequence }
{ Element , Element , Element , Element , Element }
{ A , A , A , A , B }
So, both can be generated from the grammar and so are correct lists.
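The derivations above follow a fixed pattern, so they can be generated mechanically. The following is a small Python sketch under that assumption; `derive` is a hypothetical helper name, and it only reproduces the expansion order used above (expand Sequence until only Elements remain, then substitute the terminals).

```python
def derive(elements):
    """Return the derivation steps for a list of terminals such as ["A", "B"].

    Uses the rules List: "{" Sequence "}" and
    Sequence: Element | Element "," Sequence.
    """
    if not elements or any(e not in ("A", "B") for e in elements):
        raise ValueError("expected a non-empty list of 'A'/'B' terminals")

    def braces(symbols):
        return "{ " + " , ".join(symbols) + " }"

    steps = ["List", braces(["Sequence"])]
    count = len(elements)
    # Each step replaces the trailing Sequence by 'Element "," Sequence'.
    for k in range(1, count):
        steps.append(braces(["Element"] * k + ["Sequence"]))
    # The last Sequence becomes a single Element.
    steps.append(braces(["Element"] * count))
    # Finally every Element is replaced by its terminal.
    steps.append(braces(elements))
    return steps


for step in derive(["A", "A", "A", "A", "B"]):
    print(step)
```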

In the List {A,A,A,A,B}, A,A,A,A,B is a Sequence.
This is further decomposed into A followed by a comma and A,A,A,B, where A is the Element (convenient as it is a single terminal) and A,A,A,B is a Sequence. This continues until the remaining Sequence is a single terminal (here B), which is itself an Element and therefore a valid Sequence.

Related

Run a regex on a Supply or other stream-like sequence?

Suppose I have a Supply, Channel, IO::Handle, or similar stream-like source of text, and I want to scan it for substrings matching a regex. I can't be sure that matching substrings do not cross chunk boundaries. The total length is potentially infinite and cannot be slurped into memory.
One way this would be possible is if I could instantiate a regex matching engine and feed it chunks of text while it maintains its state. But I don't see any way to do that -- I only see methods to run the match engine to completion.
Is this possible?
After some more searching, I may have answered my own question. Specifically, it seems Seq.comb is capable of combining chunks and lazily processing them:
my $c = supply {
    whenever Supply.interval(1.0) -> $v {
        my $letter = do if ($v mod 2 == 0) { "a" } else { "b" };
        my $chunk = $letter x ($v + 1);
        say "Pushing {$chunk}";
        emit($chunk);
    }
};
my $c2 = $c.comb(/a+b+/);
react {
    whenever $c2 -> $v {
        say "Got {$v}";
    }
}
See also the concurrency features used to construct this example.
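For readers who want to see the buffering idea spelled out, here is a rough Python sketch of one way to match across chunk boundaries. It is not how Raku implements comb; `scan_chunks` is a made-up name. It rests on a loud assumption: a match that ends before the end of the buffered text can never be extended by later input (true for a pattern like a+b+, not for every regex). Text that never matches is also kept in the buffer, so a real implementation would need a bound on it.

```python
import re


def scan_chunks(chunks, pattern):
    """Yield regex matches from an iterable of text chunks.

    Assumption: a match ending before the end of the buffer is final.
    A match that touches the end is held back until more text arrives.
    """
    regex = re.compile(pattern)
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        consumed = 0
        for match in regex.finditer(buffer):
            if match.end() == len(buffer):
                # The match might still grow with the next chunk: wait.
                break
            yield match.group()
            consumed = match.end()
        # Drop everything up to the last match that was reported.
        buffer = buffer[consumed:]
    # End of stream: whatever still matches is now final.
    for match in regex.finditer(buffer):
        yield match.group()


# Same chunks as the supply above would emit first: "a", "bb", "aaa", "bbbb".
print(list(scan_chunks(["a", "bb", "aaa", "bbbb"], r"a+b+")))
```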

Find a subsequence in a list

Let's assume that we are looking for a sequence in a list and that this sequence should satisfy some conditions. For example, I have a series of numbers like this:
[1,2,4,6,7,8,12,13,14,15,20]
I need to find the longest sequence whose consecutive elements differ by 1. So what I expect to get is:
[12,13,14,15]
I'm curious whether there is any way to get this with Kotlin Sequence or inline functions like groupBy or something else.
PS: I know how to create sequences, the question is how to evaluate and extract some sequences with given conditions.
There is no built-in functionality for this "sequence" recognition, but you could solve it with the fold operation:
val result = listOf(1, 2, 3, 12, 13, 14, 15)
    .distinct() // remove duplicates
    .sorted() // lowest first
    .fold(mutableListOf<Int>() to mutableListOf<List<Int>>()) { (currentList, allLists), currentItem ->
        if (currentList.isEmpty()) { // applies only to the very first item
            mutableListOf(currentItem) to allLists
        } else if (currentItem - currentList.last() == 1) { // your custom sequence recognition: 'difference of 1'
            // the list is sorted, so last() is the largest element of the current run
            currentList.apply { add(currentItem) } to allLists
        } else {
            // the run is broken: store it and start the next one
            mutableListOf(currentItem) to allLists.apply { add(currentList) }
        }
    }
    .let { it.second.apply { add(it.first) } } // add the last run
    .maxByOrNull { it.size } // the longest run; if several runs share that length, the first one is returned
// If you need a more precise choice, sort the runs by your own criteria
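The same idea can be written as a plain loop. Here is a minimal Python sketch, assuming (as the fold above does) that duplicates are dropped and the input is sorted first; the function name is made up for illustration.

```python
def longest_consecutive_run(numbers):
    """Return the longest run of values whose neighbours differ by exactly 1.

    Duplicates are removed and the values are sorted first. On ties the
    earliest run wins; an empty input gives an empty list.
    """
    best = []
    current = []
    for value in sorted(set(numbers)):
        if current and value - current[-1] == 1:
            current.append(value)  # the run continues
        else:
            current = [value]  # the run is broken: start a new one
        if len(current) > len(best):
            best = list(current)
    return best


print(longest_consecutive_run([1, 2, 4, 6, 7, 8, 12, 13, 14, 15, 20]))
```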

How to list all keys of a generic yaml by yaml-cpp

If a yaml document contains a mix of sequences and maps and scalars, and those collection types are themselves multi-level deep, is there a built-in function or an easy way to list all the keys, but not the final value at the leaf? Assuming the keys are strings.
You'll have to recurse on the nodes in your document, checking the type of each:
// assumes #include <yaml-cpp/yaml.h> and a YAML::Node named "node"
switch (node.Type()) {
    case YAML::NodeType::Null: // ...
        break;
    case YAML::NodeType::Scalar: // ...
        break;
    case YAML::NodeType::Sequence:
        for (auto it = node.begin(); it != node.end(); ++it) {
            auto element = *it;
            // recurse on "element"
        }
        break;
    case YAML::NodeType::Map:
        for (auto it = node.begin(); it != node.end(); ++it) {
            auto key = it->first;
            auto value = it->second;
            // recurse on "key" and "value"
            // if you're sure that "key" is a string, just grab it here
        }
        break;
    case YAML::NodeType::Undefined: // ...
        break;
}
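The shape of that recursion is easier to see on plain data. The following Python sketch is only an illustration of the traversal, not yaml-cpp code: it assumes the document has already been loaded into nested dicts (maps), lists (sequences) and scalars, and `collect_keys` is a made-up name.

```python
def collect_keys(node, keys=None):
    """Collect every map key in a nested structure, depth first.

    dicts stand in for YAML maps, lists for sequences; anything else is
    treated as a scalar (a leaf) and contributes nothing.
    """
    if keys is None:
        keys = []
    if isinstance(node, dict):
        for key, value in node.items():
            keys.append(key)  # record the key, not the leaf value
            collect_keys(value, keys)
    elif isinstance(node, list):
        for element in node:
            collect_keys(element, keys)
    return keys


document = {"a": 1, "b": [{"c": 2}, {"d": {"e": 3}}], "f": {"g": None}}
print(collect_keys(document))
```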

ANTLR4 change listener during parse

I have an ANTLR4 listener which handles a standard, well-formed grammar, but I am struggling with how to deal with the non-standard implementations. Although all of the variants go through the lexer without problems, the parse stage is a lot trickier.
A traditional way of doing this would be something like
// Header of document
variant = STANDARD;
if (header.indexOf("microsoft") != -1) {
    variant = MICROSOFT;
} else if (header.indexOf("google") != -1) {
    variant = GOOGLE;
}
...
// Parsing a particular element
if (variant.equals(MICROSOFT)) {
    // Microsoft-specific stuff
} else if (variant.equals(GOOGLE)) {
    // Google-specific stuff
} else {
    // Standard stuff
}
but this quickly becomes unmaintainable. The obvious solution is to have a ParseTreeListener for the standard implementation and then subclass it for each variant, but I don't know which variant it is until I've started the parse.
So how can I either switch from one listener to another part-way through the parse, or restart the parse with a new listener once I know which variant I'm dealing with?
If these variants occur frequently, you might want to consider embedding custom code to handle context sensitive parsing by using predicates (the {...}? construct in the following pseudo grammar):
rule
: { boolean-expression-a }? a-alternative
| { boolean-expression-b }? b-alternative
| /* fall through */ not-a-or-b-alternative
;
Let's say you want to parse a file containing chunks. A chunk consists of a header and a data row. In the header you can set your variant. The data of a normal variant contains 3 NUMBERs, Google's variant contains 2 NUMBERs and Microsoft's variant contains a single NUMBER. An example of such a file would look like this:
header: none
data: 1 2 3
header: google
data: 4 5
header: microsoft
data: 6
And here's a demo of a context sensitive ANTLR v4 grammar able to parse this:
grammar T;
@parser::members {
  enum Variant {
    GOOGLE,
    MICROSOFT,
    OTHER;
    public static Variant tryValueOf(String name) {
      try {
        return Variant.valueOf(name.toUpperCase());
      }
      catch(Exception e) {
        return OTHER;
      }
    }
  }
  private Variant variant = Variant.OTHER;
}
parse
 : chunk+ EOF
 ;
chunk
 : header data
 ;
header
 : K_HEADER COLON NAME {variant = Variant.tryValueOf($NAME.text);}
 ;
data
 : {variant == Variant.MICROSOFT}? K_DATA COLON NUMBER #MicrosoftData
 | {variant == Variant.GOOGLE}? K_DATA COLON NUMBER NUMBER #GoogleData
 | K_DATA COLON NUMBER NUMBER NUMBER #OtherData
 ;
K_DATA : 'data';
K_HEADER : 'header';
NAME : [a-zA-Z]+;
NUMBER : [0-9]+;
COLON : ':';
SPACE : [ \t\r\n] -> skip;
Resulting in the following parse (the parse-tree image is not reproduced here).
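Independently of ANTLR, the behaviour that the predicates enforce can be mimicked in a few lines of Python. This is only a sketch of the idea (a header sets the variant, and the variant decides which data rule applies); it is not generated parser code, and `parse_chunks` and `EXPECTED_NUMBERS` are names invented for the example.

```python
# Number of NUMBER tokens each known variant expects in a data row.
EXPECTED_NUMBERS = {"microsoft": 1, "google": 2}
DEFAULT_NUMBERS = 3  # any other header value is treated as the normal variant


def parse_chunks(text):
    """Parse 'header: NAME' / 'data: N ...' lines into (variant, numbers) pairs."""
    variant = "other"
    results = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        key, _, rest = line.partition(":")
        key = key.strip()
        if key == "header":
            # Like the embedded action: remember the variant for later rows.
            name = rest.strip().lower()
            variant = name if name in EXPECTED_NUMBERS else "other"
        elif key == "data":
            numbers = [int(token) for token in rest.split()]
            # Like the predicates: the active variant selects the alternative.
            expected = EXPECTED_NUMBERS.get(variant, DEFAULT_NUMBERS)
            if len(numbers) != expected:
                raise ValueError(
                    f"{variant} data row needs {expected} numbers, got {len(numbers)}"
                )
            results.append((variant, numbers))
        else:
            raise ValueError(f"unexpected line: {line!r}")
    return results


sample = "header: none\ndata: 1 2 3\nheader: google\ndata: 4 5\nheader: microsoft\ndata: 6"
print(parse_chunks(sample))
```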

Conventions for naming class operations?

What conventions do you use for naming class operations?
Full word doc : Download C# Coding Standards & Best Practices
Naming Conventions and Standards
Note :
The terms Pascal Casing and Camel Casing are used throughout this document.
Pascal Casing - The first character of every word is upper case and the other characters are lower case.
Example: BackColor
Camel Casing - The first character of every word except the first is upper case and the other characters are lower case.
Example: backColor
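The two definitions can be stated precisely as a small Python sketch. The helper names are hypothetical and the input is assumed to be a list of already separated words.

```python
def to_pascal_case(words):
    # Upper-case the first character of every word, lower-case the rest.
    return "".join(word[:1].upper() + word[1:].lower() for word in words)


def to_camel_case(words):
    # Same as Pascal casing, except the very first character is lower case.
    pascal = to_pascal_case(words)
    return pascal[:1].lower() + pascal[1:]


print(to_pascal_case(["back", "color"]))  # BackColor
print(to_camel_case(["back", "color"]))   # backColor
```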
Use Pascal casing for Class names
public class HelloWorld
{
...
}
Use Pascal casing for Method names
void SayHello(string name)
{
...
}
Use Camel casing for variables and method parameters
int totalCount = 0;
void SayHello(string name)
{
string fullMessage = "Hello " + name;
...
}
Use the prefix "I" with Pascal Casing for interfaces ( Example: IEntity )
Do not use Hungarian notation to name variables.
In earlier days most of the programmers liked it - having the data type as a prefix for the variable name and using m_ as prefix for member variables. Eg:
string m_sName;
int nAge;
However, in .NET coding standards, this is not recommended. Do not use a data type prefix, and do not use m_ to mark member variables. All variables should use camel casing.
Some programmers still prefer to use the prefix m_ to represent member variables, since there is no other easy way to identify a member variable.
Use Meaningful, descriptive words to name variables. Do not use abbreviations.
Good:
string address
int salary
Not Good:
string nam
string addr
int sal
Do not use single character variable names like i, n, s etc. Use names like index, temp
One exception in this case would be variables used for iterations in loops:
for ( int i = 0; i < count; i++ )
{
...
}
If the variable is used only as a counter for iteration and is not used anywhere else in the loop, many people still like to use a single char variable (i) instead of inventing a different suitable name.
Do not use underscores (_) for local variable names.
All member variables must be prefixed with underscore (_) so that they can be identified from other local variables.
Do not use variable names that resemble keywords.
Prefix boolean variables, properties and methods with "is" or similar prefixes.
Ex: private bool _isFinished
Namespace names should follow the standard pattern
...
Use appropriate prefix for the UI elements so that you can identify them from the rest of the variables.
There are 2 different approaches recommended here.
a. Use a common prefix ( ui_ ) for all UI elements. This groups all of the UI elements together and makes them easy to find in IntelliSense.
b. Use an appropriate prefix for each UI element. A brief list is given below. Since .NET provides many controls, you may have to arrive at a complete list of standard prefixes for each of the controls (including third party controls) you are using.
Control            Prefix
Label              lbl
TextBox            txt
DataGrid           dtg
Button             btn
ImageButton        imb
Hyperlink          hlk
DropDownList       ddl
ListBox            lst
DataList           dtl
Repeater           rep
Checkbox           chk
CheckBoxList       cbl
RadioButton        rdo
RadioButtonList    rbl
Image              img
Panel              pnl
PlaceHolder        phd
Table              tbl
Validators         val
The file name should match the class name.
For example, for the class HelloWorld, the file name should be HelloWorld.cs (or HelloWorld.vb)
Use Pascal Case for file names.
Indentation and Spacing
Use TAB for indentation. Do not use SPACES. Define the Tab size as 4.
Comments should be in the same level as the code (use the same level of indentation).
Good:
    // Format a message and display
    string fullMessage = "Hello " + name;
    DateTime currentTime = DateTime.Now;
    string message = fullMessage + ", the time is : " + currentTime.ToShortTimeString();
    MessageBox.Show ( message );
Not Good:
// Format a message and display
    string fullMessage = "Hello " + name;
    DateTime currentTime = DateTime.Now;
    string message = fullMessage + ", the time is : " + currentTime.ToShortTimeString();
    MessageBox.Show ( message );
Curly braces ( {} ) should be in the same level as the code outside the braces.
Use one blank line to separate logical groups of code.
Good:
bool SayHello ( string name )
{
    string fullMessage = "Hello " + name;
    DateTime currentTime = DateTime.Now;

    string message = fullMessage + ", the time is : " + currentTime.ToShortTimeString();

    MessageBox.Show ( message );

    if ( ... )
    {
        // Do something
        // ...

        return false;
    }

    return true;
}
Not Good:
bool SayHello (string name)
{
    string fullMessage = "Hello " + name;
    DateTime currentTime = DateTime.Now;
    string message = fullMessage + ", the time is : " + currentTime.ToShortTimeString();
    MessageBox.Show ( message );
    if ( ... )
    {
        // Do something
        // ...
        return false;
    }
    return true;
}
There should be one and only one single blank line between each method inside the class.
The curly braces should be on a separate line and not in the same line as if, for etc.
Good:
if ( ... )
{
    // Do something
}
Not Good:
if ( ... ) {
    // Do something
}
Use a single space before and after each operator and brackets.
Good:
if ( showResult == true )
{
    for ( int i = 0; i < 10; i++ )
    {
        //
    }
}
Not Good:
if(showResult==true)
{
    for(int i= 0;i<10;i++)
    {
        //
    }
}
Use #region to group related pieces of code together. With proper grouping using #region, only the region names remain visible when all definitions are collapsed.
Keep private member variables, properties and methods at the top of the file and public members at the bottom.
I find it makes everyone's life easier to use the same naming conventions used by the language and framework you are working in.
For example, .Net has a convention. Model what your language does, and the "users" of your code and libraries will be happier. So, the answer may be, it depends on your language and / or platform...
Naming conventions are a controversial topic, because it's an arbitrary distinction.
The two answers above are good ones. My addition is this:
Your goal is readability. Your code tells a story, albeit a sometimes kind of boring one. Make sure the story is clear.
For extra fun see these links:
http://www.joelonsoftware.com/articles/Wrong.html
http://en.wikipedia.org/wiki/Naming_convention_%28programming%29