Objective-C and Bison warning: stray `@'

When I generate my parser with bison, I obtain this warning:
warning: stray `@'
But that is because I have some legal Objective-C code containing @, for instance this is one of the rules having the warning:
file : axiom production_rule_list { NSLog(@"file"); }
;
Is there any risk in using @ in the code? If not, how do I tell bison that it is a legitimate use of @?
Thanks in advance.

The message is just a warning. You can ignore it. If you're using Xcode, it won't even show you the warning in its Issue Navigator.
Rename your Bison input file to have a .ym extension instead of a .y extension. That tells Xcode that it's a grammar with Objective-C actions.

If you want to suppress the warning, you can use a #define AT @.
The code in the braces is just copied, apart from replacing the $… sequences with the code to give the relevant token. This appears to work fine with Objective-C, although if you're using ARC, you might need to do some digging (or just add extra blocks (in the C sense)) to make sure that objects are freed as soon as possible.

As per the documentation in Actions - Bison 2.7, it appears that the code between the curly braces is expected to be C code. As such I doubt that you can use Objective-C constructs there.
However, you could create an external C function to do the work for you, like:
void Logit(const char *message)
{
    NSLog(@"%s", message);
}
And use that in the Bison action:
file : axiom production_rule_list { Logit("file"); }
;

Why there is "1 related problem" on public class WelcomeMessageListener implements Listener [duplicate]

Please explain the following about "Cannot find symbol", "Cannot resolve symbol" or "Symbol not found" errors (in Java):
What do they mean?
What things can cause them?
How does the programmer go about fixing them?
This question is designed to seed a comprehensive Q&A about these common compilation errors in Java.
0. Is there any difference between these errors?
Not really. "Cannot find symbol", "Cannot resolve symbol" and "Symbol not found" all mean the same thing. (Different Java compilers are written by different people, and different people use different phraseology to say the same thing.)
1. What does a "Cannot find symbol" error mean?
Firstly, it is a compilation error (see note 1). It means that either there is a problem in your Java source code, or there is a problem in the way that you are compiling it.
Your Java source code consists of the following things:
Keywords: like class, while, and so on.
Literals: like true, false, 42, 'X' and "Hi mum!".
Operators and other non-alphanumeric tokens: like +, =, {, and so on.
Identifiers: like Reader, i, toString, processEquibalancedElephants, and so on.
Comments and whitespace.
A "Cannot find symbol" error is about the identifiers. When your code is compiled, the compiler needs to work out what each and every identifier in your code means.
A "Cannot find symbol" error means that the compiler cannot do this. Your code appears to be referring to something that the compiler doesn't understand.
2. What can cause a "Cannot find symbol" error?
To a first approximation, there is only one cause. The compiler looked in all of the places where the identifier should be defined, and it couldn't find the definition. This could be caused by a number of things. The common ones are as follows:
For identifiers in general:
Perhaps you spelled the name incorrectly; e.g. StringBiulder instead of StringBuilder. Java cannot and will not attempt to compensate for bad spelling or typing errors.
Perhaps you got the case wrong; e.g. stringBuilder instead of StringBuilder. All Java identifiers are case sensitive.
Perhaps you used underscores inappropriately; e.g. mystring and my_string are different. (If you stick to the Java style rules, you will be largely protected from this mistake ...)
Perhaps you are trying to use something that was declared "somewhere else"; i.e. in a different context to where you have implicitly told the compiler to look. (A different class? A different scope? A different package? A different code-base?)
For identifiers that should refer to variables:
Perhaps you forgot to declare the variable.
Perhaps the variable declaration is out of scope at the point you tried to use it. (See example below)
For identifiers that should be method or field names:
Perhaps you are trying to refer to an inherited method or field that wasn't declared in the parent / ancestor classes or interfaces.
Perhaps you are trying to refer to a method or field that does not exist (i.e. has not been declared) in the type you are using; e.g. "rope".push() (see note 2).
Perhaps you are trying to use a method as a field, or vice versa; e.g. "rope".length or someArray.length().
Perhaps you are mistakenly operating on an array rather than array element; e.g.
String strings[] = ...
if (strings.charAt(3)) { ... }
// maybe that should be 'strings[0].charAt(3)'
For identifiers that should be class names:
Perhaps you forgot to import the class.
Perhaps you used "star" imports, but the class isn't defined in any of the packages that you imported.
Perhaps you forgot a new as in:
String s = String(); // should be 'new String()'
Perhaps you are trying to import or otherwise use a class that has been declared in the default package; i.e. the one where classes with no package statements go.
Hint: learn about packages. You should only use the default package for simple applications that consist of one class ... or at a stretch, one Java source file.
For cases where type or instance doesn't appear to have the member (e.g. method or field) you were expecting it to have:
Perhaps you have declared a nested class or a generic parameter that shadows the type you were meaning to use.
Perhaps you are shadowing a static or instance variable.
Perhaps you imported the wrong type; e.g. IDE completion or auto-correction may have suggested java.awt.List rather than java.util.List.
Perhaps you are using (compiling against) the wrong version of an API.
Perhaps you forgot to cast your object to an appropriate subclass.
Perhaps you have declared the variable's type to be a supertype of the one with the member you are looking for.
The problem is often a combination of the above. For example, maybe you "star" imported java.io.* and then tried to use the Files class ... which is in java.nio.file, not java.io. Or maybe you meant to write File ... which is a class in java.io.
Here is an example of how incorrect variable scoping can lead to a "Cannot find symbol" error:
List<String> strings = ...
for (int i = 0; i < strings.size(); i++) {
if (strings.get(i).equalsIgnoreCase("fnord")) {
break;
}
}
if (i < strings.size()) {
...
}
This will give a "Cannot find symbol" error for i in the if statement. Though we previously declared i, that declaration is only in scope for the for statement and its body. The reference to i in the if statement cannot see that declaration of i. It is out of scope.
(An appropriate correction here might be to move the if statement inside the loop, or to declare i before the start of the loop.)
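As a minimal sketch of the second correction (declaring i before the loop), assuming a hypothetical helper named indexOfFnord that is not part of the original answer:

```java
import java.util.Arrays;
import java.util.List;

final class ScopeFix {
    // Hypothetical helper: index of the first "fnord" (ignoring case), or -1 if absent.
    static int indexOfFnord(List<String> strings) {
        int i; // declared before the loop, so it is still in scope after it
        for (i = 0; i < strings.size(); i++) {
            if (strings.get(i).equalsIgnoreCase("fnord")) {
                break;
            }
        }
        // This reference to i compiles because the declaration is in the enclosing block.
        return i < strings.size() ? i : -1;
    }

    public static void main(String[] args) {
        System.out.println(indexOfFnord(Arrays.asList("foo", "FNORD", "bar"))); // prints 1
        System.out.println(indexOfFnord(Arrays.asList("foo", "bar")));          // prints -1
    }
}
```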
Here is an example that causes puzzlement where a typo leads to a seemingly inexplicable "Cannot find symbol" error:
for (int i = 0; i < 100; i++); {
System.out.println("i is " + i);
}
This will give you a compilation error in the println call saying that i cannot be found. But (I hear you say) I did declare it!
The problem is the sneaky semicolon ( ; ) before the {. The Java language syntax defines a semicolon in that context to be an empty statement. The empty statement then becomes the body of the for loop. So that code actually means this:
for (int i = 0; i < 100; i++);
// The previous and following are separate statements!!
{
System.out.println("i is " + i);
}
The { ... } block is NOT the body of the for loop, and therefore the previous declaration of i in the for statement is out of scope in the block.
Here is another example of "Cannot find symbol" error that is caused by a typo.
int tmp = ...
int res = tmp(a + b);
Despite the previous declaration, the tmp in the tmp(...) expression is erroneous. The compiler will look for a method called tmp, and won't find one. The previously declared tmp is in the namespace for variables, not the namespace for methods.
In the example I came across, the programmer had actually left out an operator. What he meant to write was this:
int res = tmp * (a + b);
There is another reason why the compiler might not find a symbol if you are compiling from the command line. You might simply have forgotten to compile or recompile some other class. For example, suppose you have classes Foo and Bar where Foo uses Bar. If you have never compiled Bar and you run javac Foo.java, you are liable to find that the compiler can't find the symbol Bar. The simple answer is to compile Foo and Bar together; e.g. javac Foo.java Bar.java or javac *.java. Or better still use a Java build tool; e.g. Ant, Maven, Gradle and so on.
There are some other more obscure causes too ... which I will deal with below.
3. How do I fix these errors ?
Generally speaking, you start out by figuring out what caused the compilation error.
Look at the line in the file indicated by the compilation error message.
Identify which symbol the error message is talking about.
Figure out why the compiler is saying that it cannot find the symbol; see above!
Then you think about what your code is supposed to be saying. Then finally you work out what correction you need to make to your source code to do what you want.
Note that not every "correction" is correct. Consider this:
for (int i = 1; i < 10; i++) {
for (j = 1; j < 10; j++) {
...
}
}
Suppose that the compiler says "Cannot find symbol" for j. There are many ways I could "fix" that:
I could change the inner for to for (int j = 1; j < 10; j++) - probably correct.
I could add a declaration for j before the inner for loop, or the outer for loop - possibly correct.
I could change j to i in the inner for loop - probably wrong!
and so on.
The point is that you need to understand what your code is trying to do in order to find the right fix.
4. Obscure causes
Here are a couple of cases where the "Cannot find symbol" is seemingly inexplicable ... until you look closer.
Incorrect dependencies: If you are using an IDE or a build tool that manages the build path and project dependencies, you may have made a mistake with the dependencies; e.g. left out a dependency, or selected the wrong version. If you are using a build tool (Ant, Maven, Gradle, etc), check the project's build file. If you are using an IDE, check the project's build path configuration.
Cannot find symbol 'var': You are probably trying to compile source code that uses local variable type inference (i.e. a var declaration) with an older compiler or older --source level. Local variable type inference with var was introduced in Java 10. Check your JDK version and your build files, and (if this occurs in an IDE), the IDE settings.
You are not compiling / recompiling: It sometimes happens that new Java programmers don't understand how the Java tool chain works, or haven't implemented a repeatable "build process"; e.g. using an IDE, Ant, Maven, Gradle and so on. In such a situation, the programmer can end up chasing his tail looking for an illusory error that is actually caused by not recompiling the code properly, and the like.
Another example of this is when you use (Java 11+) java SomeClass.java to compile and run a class. If the class depends on another class that you haven't compiled (or recompiled), you are liable to get "Cannot resolve symbol" errors referring to the 2nd class. The other source file(s) are not automatically compiled. Before Java 22, the java command's "compile and run" mode was not designed for running programs with multiple source code files.
An earlier build problem: It is possible that an earlier build failed in a way that gave a JAR file with missing classes. Such a failure would typically be noticed if you were using a build tool. However if you are getting JAR files from someone else, you are dependent on them building properly, and noticing errors. If you suspect this, use jar -tvf to list the contents of the suspect JAR file.
IDE issues: People have reported cases where their IDE gets confused and the compiler in the IDE cannot find a class that exists ... or the reverse situation.
This could happen if the IDE has been configured with the wrong JDK version.
This could happen if the IDE's caches get out of sync with the file system. There are IDE specific ways to fix that.
This could be an IDE bug. For instance @Joel Costigliola described a scenario where Eclipse did not handle a Maven "test" tree correctly: see this answer. (Apparently that particular bug was fixed a long time ago.)
Android issues: When you are programming for Android, and you have "Cannot find symbol" errors related to R, be aware that the R symbols are defined by the context.xml file. Check that your context.xml file is correct and in the correct place, and that the corresponding R class file has been generated / compiled. Note that the Java symbols are case sensitive, so the corresponding XML ids are case sensitive too.
Other symbol errors on Android are likely to be due to previously mentioned reasons; e.g. missing or incorrect dependencies, incorrect package names, methods or fields that don't exist in a particular API version, spelling / typing errors, and so on.
Hiding system classes: I've seen cases where the compiler complains that substring is an unknown symbol in something like the following
String s = ...
String s1 = s.substring(1);
It turned out that the programmer had created their own version of String and that their version of the class didn't define a substring method. I've seen people do this with System, Scanner and other classes.
Lesson: Don't define your own classes with the same names as common library classes!
The problem can also be solved by using the fully qualified names. For example, in the example above, the programmer could have written:
java.lang.String s = ...
java.lang.String s1 = s.substring(1);
Homoglyphs: If you use UTF-8 encoding for your source files, it is possible to have identifiers that look the same, but are in fact different because they contain homoglyphs. See this page for more information.
You can avoid this by restricting yourself to ASCII or Latin-1 as the source file encoding, and using Java \uxxxx escapes for other characters.
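To make the homoglyph problem concrete, here is a small sketch. The class name HomoglyphDemo and the helper isAscii are made up for illustration; the check simply flags names that contain any non-ASCII character, which is one possible way to spot a look-alike letter.

```java
final class HomoglyphDemo {
    // Hypothetical helper: true when every character of the name is plain ASCII.
    static boolean isAscii(String name) {
        return name.chars().allMatch(c -> c < 128);
    }

    public static void main(String[] args) {
        String latin = "a";         // U+0061 LATIN SMALL LETTER A
        String cyrillic = "\u0430"; // U+0430 CYRILLIC SMALL LETTER A, looks the same on screen

        // The two strings render alike but are different characters.
        System.out.println(latin.equals(cyrillic)); // prints false

        // The second name hides a Cyrillic "o" (U+043E) in place of the Latin one.
        System.out.println(isAscii("count") + " " + isAscii("c\u043Eunt")); // prints true false
    }
}
```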
1 - If, perchance, you do see this in a runtime exception or error message, then either you have configured your IDE to run code with compilation errors, or your application is generating and compiling code .. at runtime.
2 - The three basic principles of Civil Engineering: water doesn't flow uphill, a plank is stronger on its side, and you can't push on a rope.
You'll also get this error if you forget a new:
String s = String();
versus
String s = new String();
because the call without the new keyword will try and look for a (local) method called String without arguments - and that method signature is likely not defined.
One more example of 'Variable is out of scope'
As I've seen this kind of question a few times already, here is one more example of something that is illegal even though it might feel okay.
Consider this code:
if(somethingIsTrue()) {
String message = "Everything is fine";
} else {
String message = "We have an error";
}
System.out.println(message);
That's invalid code, because neither of the variables named message is visible outside of its respective scope, which is the surrounding braces {} in this case.
You might say: "But a variable named message is defined either way - so message is defined after the if".
But you'd be wrong.
Java has no free() or delete operators, so it has to rely on tracking variable scope to find out when variables are no longer used (together with references to these variables, of course).
It's especially bad if you thought you did something good. I've seen this kind of error after "optimizing" code like this:
if(somethingIsTrue()) {
String message = "Everything is fine";
System.out.println(message);
} else {
String message = "We have an error";
System.out.println(message);
}
"Oh, there's duplicated code, let's pull that common line out" -> and there it is.
The most common way to deal with this kind of scope-trouble would be to pre-assign the else-values to the variable names in the outside scope and then reassign in if:
String message = "We have an error";
if(somethingIsTrue()) {
message = "Everything is fine";
}
System.out.println(message);
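An alternative sketch, assuming you would rather not pre-assign a default: declare the variable once in the enclosing scope and assign it in both branches. The class and method names here (MessageScope, describe) are invented for the example.

```java
final class MessageScope {
    // Hypothetical helper standing in for the if/else from the example above.
    static String describe(boolean somethingIsTrue) {
        final String message; // declared once, in the enclosing scope
        if (somethingIsTrue) {
            message = "Everything is fine";
        } else {
            message = "We have an error";
        }
        // Visible here because the declaration sits outside the if/else blocks,
        // and the compiler can see that both branches assign it.
        return message;
    }

    public static void main(String[] args) {
        System.out.println(describe(true));  // prints Everything is fine
        System.out.println(describe(false)); // prints We have an error
    }
}
```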
Solved: in IntelliJ, selecting Build -> Rebuild Project fixes it.
One way to get this error in Eclipse :
Define a class A in src/test/java.
Define another class B in src/main/java that uses class A.
Result : Eclipse will compile the code, but maven will give "Cannot find symbol".
Underlying cause : Eclipse is using a combined build path for the main and test trees. Unfortunately, it does not support using different build paths for different parts of an Eclipse project, which is what Maven requires.
Solution :
Don't define your dependencies that way; i.e. don't make this mistake.
Regularly build your codebase using Maven so that you pick up this mistake early. One way to do that is to use a CI server.
"Cannot find symbol" means that the compiler can't find the appropriate variable, method, class, etc. If you get that error message, first find the line of code that triggers it. Then you will be able to work out which variable, method or class has not been defined before being used. Once you have confirmed that, define it so that it can be used later. Consider the following example.
I'll create a demo class and print a name...
class demo{
public static void main(String a[]){
System.out.print(name);
}
}
Compiling this produces an error saying that the variable name cannot be found. Declaring and initializing the name variable removes that error, like this:
class demo{
public static void main(String a[]){
String name="smith";
System.out.print(name);
}
}
Now the error is gone. Similarly, if you get "cannot find method" or "cannot find class", define the method or class first and then use it.
If you're getting this error in the build somewhere else, while your IDE says everything is perfectly fine, then check that you are using the same Java versions in both places.
For example, Java 7 and Java 8 have different APIs, so calling a non-existent API in an older Java version would cause this error.
There can be various scenarios as people have mentioned above. A couple of things which have helped me resolve this.
If you are using IntelliJ
File -> 'Invalidate Caches/Restart'
OR
The class being referenced was in another project and that dependency was not added to the Gradle build file of my project. So I added the dependency using
compile project(':anotherProject')
and it worked. HTH!
If the Eclipse Java build path is mapped to Java 7 or 8 while the java.version Maven property in the project's pom.xml names a higher Java version (9, 10, 11, etc.), you need to bring the two into line.
For example, if the project should build with Java 11 but Eclipse does not support it yet, add Java 11 support to Eclipse by going through the steps below in the Eclipse IDE:
Help -> Install New Software ->
Paste following link http://download.eclipse.org/eclipse/updates/4.9-P-builds at Work With
or
Add (Popup window will open) ->
Name: Java 11 support
Location: http://download.eclipse.org/eclipse/updates/4.9-P-builds
then update Java version in Maven properties of pom.xml file as below
<java.version>11</java.version>
<maven.compiler.source>${java.version}</maven.compiler.source>
<maven.compiler.target>${java.version}</maven.compiler.target>
Finally, right-click the project and run Debug As -> Maven clean, then Maven build.
I too was getting this error. (for which I googled and I was directed to this page)
Problem: I was calling a static method defined in the class of a project A from a class defined in another project B.
I was getting the following error:
error: cannot find symbol
Solution: I resolved this by first building the project where the method is defined then the project where the method was being called from.
Suppose you compiled your code using Maven compile and then used Maven test to run it, and it worked fine. If you now change something in your code and run it without compiling, you will get this error.
Solution: Compile it again and then run the tests. For me it worked this way.
In my case - I had to perform below operations:
Move the context.xml file from src/java/package to the resource directory (IntelliJ IDE).
Clean the target directory.
For hints, look closer at the class name that throws an error and the line number, for example:
Compilation failure
[ERROR] \applications\xxxxx.java:[44,30] error: cannot find symbol
Another cause is a method that is not supported by your Java version, say JDK 7 vs 8.
Check your %JAVA_HOME%.
We got the error in a Java project that is set up as a Gradle multi-project build. It turned out that one of the sub-projects was missing the Gradle Java Library plugin.
This prevented the sub-project's class files from being visible to other projects in the build.
After adding the Java library plugin to the sub-project's build.gradle in the following way, the error went away:
plugins {
...
id 'java-library'
}
Re: 4.4: An earlier build problem in Stephen C's excellent answer:
I encountered this scenario when developing an osgi application.
I had a java project A that was a dependency of B.
When building B, there was the error:
Compilation failure: org.company.projectA.bar.xyz does not exist
But in eclipse, there was no compile problem at all.
Investigation
When I looked in A.jar, there were classes for org.company.projectA.foo.abc but none for org.company.projectA.bar.xyz.
The classes were missing because the Export-Package entry in A/pom.xml did not cover the relevant packages:
<plugin>
<groupId>org.apache.felix</groupId>
<artifactId>maven-bundle-plugin</artifactId>
...
<configuration>
<instructions>
....
<Export-Package>org.company.projectA.foo.*</Export-Package>
</instructions>
</configuration>
</plugin>
Solution
Add the missing packages like so:
<Export-Package>org.company.projectA.foo.*,org.company.projectA.bar.*</Export-Package>
and rebuild everything.
Now the A.jar includes all the expected classes, and everything compiles.
I was getting below error
java: cannot find symbol
symbol: class __
To fix this
I tried enabling Lombok, restarting IntelliJ, etc., but the steps below worked for me.
In IntelliJ, go to Preferences -> Compiler -> Shared build process VM options and set it to
-Djps.track.ap.dependencies=false
then run
mvn clean install
Optional.isEmpty()
I was happily using !Optional.isEmpty() in my IDE, and it worked fine, as I was compiling/running my project with JDK 11 or later. Then, when I used Gradle on the command line (running on JDK 8), I got the nasty error in the compile task.
Why?
From the docs (Pay attention to the last line):
boolean java.util.Optional.isEmpty()
If a value is not present, returns true, otherwise false.
Returns: true if a value is not present, otherwise false
Since: 11
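If the code also has to compile on JDK 8, one option is to avoid isEmpty() and negate isPresent(), which has existed since Optional was introduced in Java 8. The wrapper class OptionalCompat below is only an illustration and is not part of the JDK.

```java
import java.util.Optional;

final class OptionalCompat {
    // Hypothetical helper: same result as Optional.isEmpty() (Java 11+), but compiles on Java 8.
    static boolean isEmpty(Optional<?> value) {
        return !value.isPresent();
    }

    public static void main(String[] args) {
        System.out.println(isEmpty(Optional.empty()));  // prints true
        System.out.println(isEmpty(Optional.of("x")));  // prints false
    }
}
```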
I solved this error like this... The craziness of Android. I had the package name as Adapter, and then I refactored the name to adapter with an "a" instead of "A", which solved the error.

Is "Implicit token definition in parser rule" something to worry about?

I'm creating my first grammar with ANTLR and ANTLRWorks 2. I have mostly finished the grammar itself (it recognizes the code written in the described language and builds correct parse trees), but I haven't started anything beyond that.
What worries me is that every first occurrence of a token in a parser rule is underlined with a yellow squiggle saying "Implicit token definition in parser rule".
For example, in this rule, the 'var' has that squiggle:
variableDeclaration: 'var' IDENTIFIER ('=' expression)?;
The odd thing is that ANTLR itself doesn't seem to mind these rules (when doing a test rig test, I can't see any of these warnings in the parser generator output, just something about an incorrect Java version being installed on my machine), so it's just ANTLRWorks complaining.
Is it something to worry about or should I ignore these warnings? Should I declare all the tokens explicitly in lexer rules? Most examples in the official bible The Definitive ANTLR Reference seem to be done exactly the way I write the code.
I highly recommend correcting all instances of this warning in code of any importance.
This warning was created (by me actually) to alert you to situations like the following:
shiftExpr : ID (('<<' | '>>') ID)?;
Since ANTLR 4 encourages action code be written in separate files in the target language instead of embedding them directly in the grammar, it's important to be able to distinguish between << and >>. If tokens were not explicitly created for these operators, they will be assigned arbitrary types and no named constants will be available for referencing them.
This warning also helps avoid the following problematic situations:
A parser rule contains a misspelled token reference. Without the warning, this could lead to silent creation of an additional token that may never be matched.
A parser rule contains an unintentional token reference, such as the following:
number : zero | INTEGER;
zero : '0'; // <-- this implicit definition causes 0 to get its own token
If you're writing a lexer grammar which won't be used across multiple parser grammars, then you can ignore this warning shown by ANTLRWorks 2.

Why can't I put the opening braces on the next line?

Encountered a strange error when I tried to compile following code:
package main
import fmt "fmt"
func main()
{
var arr [3]int
for i:=0; i<3; i++
{
fmt.Printf("%d",arr[i])
}
}
Error is as follows:
unexpected semicolon or newline before {
After correction following code worked:
package main
import fmt "fmt"
func main(){
var arr [3]int
for i:=0; i<3; i++{
fmt.Printf("%d",arr[i])
}
}
Is the Go language really this strict? There are no warnings for this either. Should it not be the programmer's choice how to format the code?
Go language warnings and errors
The Go language does automatic semicolon insertion, and thus the only allowed place for { is at the end of the preceding line. Always write Go code using the same style as gofmt produces and you will have no problems.
See Go's FAQ: Why are there braces but no semicolons? And why can't I put the opening brace on the next line?
The Go language inserts semicolons according to a specific rule; in your case, the newline after the i++ introduces a semicolon before the '{'. See http://golang.org/doc/go_spec.html.
Formatting is somewhat part of the language. Use gofmt to make code look similar; however, you can still format your code in many different ways.
Should this not be a programmers choice how he wants to format his
code?
Maybe. I think it is nice that Go steps forward to avoid some bike-shedding, like never ending style discussions. There is even a tool, gofmt, that formats code in a standard style, ensuring that most Go code follows the same guidelines. It is like they were saying: "Consistency everywhere > personal preferences. Get used to it, This Is Good(tm)."
Go code has a required bracing style.
In the same way that a programmer can't choose to use braces in python and is required to use indentation.
The required bracing style allows the semicolon insertion to work without requiring the parser to look ahead to the next line (which is useful if you want to implement a REPL for Go code).
package main
func main();
is valid Go code and without looking at the next line the parser assumes this is what you meant and is then confused by the block that isn't connected to anything that you've put after it.
Having the same bracing style through all Go code makes it a lot easier to read and also avoids discussion about bracing style.
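The semicolon rule discussed in these answers can be approximated in a few lines. The sketch below is a deliberately simplified model written in Java, not the Go lexer: it looks only at the last token of a line, ignores comments, and the class and method names are invented for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class SemicolonInsertion {
    // Go keywords that do NOT trigger insertion when they end a line.
    // (break, continue, fallthrough and return do trigger it, so they are left out.)
    private static final Set<String> NON_TERMINATING_KEYWORDS = new HashSet<>(Arrays.asList(
            "case", "chan", "const", "default", "defer", "else", "for", "func", "go", "goto",
            "if", "import", "interface", "map", "package", "range", "select", "struct",
            "switch", "type", "var"));

    private static boolean isWordChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_';
    }

    // Simplified rule: would a semicolon be inserted at the end of this line?
    static boolean needsSemicolon(String line) {
        String s = line.trim();
        if (s.isEmpty()) return false;
        if (s.endsWith("++") || s.endsWith("--")) return true;
        char last = s.charAt(s.length() - 1);
        if (last == ')' || last == ']' || last == '}') return true;
        if (last == '"' || last == '\'' || last == '`') return true; // end of a string or rune literal
        if (isWordChar(last)) {
            // Extract the trailing identifier, number or keyword.
            int start = s.length();
            while (start > 0 && isWordChar(s.charAt(start - 1))) start--;
            return !NON_TERMINATING_KEYWORDS.contains(s.substring(start));
        }
        return false; // operators, commas, opening braces and so on
    }

    public static void main(String[] args) {
        System.out.println(needsSemicolon("func main()"));        // prints true
        System.out.println(needsSemicolon("func main() {"));      // prints false
        System.out.println(needsSemicolon("for i:=0; i<3; i++")); // prints true
    }
}
```

Under this model a line ending in `)` or `++` gets a semicolon, so a `{` on the following line is left detached, while a line ending in `{` does not.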
Go follows strict formatting rules to keep code uniformly readable, much like Python. Use the Visual Studio Code IDE; it will do automatic formatting and error detection.

Write a compiler for a language that looks ahead and multiple files?

In my language I can use a class variable in my method when the definition appears below the method. It can also call methods below my method and etc. There are no 'headers'. Take this C# example.
class A
{
public void callMethods() { print(); B b; b.notYetSeen(); }
public void print() { Console.Write("v = {0}", v); }
int v=9;
}
class B
{
public void notYetSeen() { Console.Write("notYetSeen()\n"); }
}
How should I compile that? What I was thinking is:
pass1: convert everything to an AST
pass2: go through all classes and build a list of defined classes/variables/etc
pass3: go through code and check if there's any errors such as undefined variable, wrong use etc and create my output
But it seems like for this to work I have to do pass 1 and 2 for ALL files before doing pass3. Also it feels like a lot of work to do until I find a syntax error (other than the obvious that can be done at parse time such as forgetting to close a brace or writing 0xLETTERS instead of a hex value). My gut says there is some other way.
Note: I am using bison/flex to generate my compiler.
My understanding of languages that handle forward references is that they typically just use the first pass to build a list of valid names. Something along the lines of just putting an entry in a table (without filling out the definition) so you have something to point to later when you do your real pass to generate the definitions.
If you try to actually build full definitions as you go, you would end up having to rescan repeatedly, each time saving any references to undefined things until the next pass. Even that would fail if there are circular references.
I would go through on pass one and collect all of your class/method/field names and types, ignoring the method bodies. Then in pass two check the method bodies only.
I don't know that there can be any other way than traversing all the files in the source.
I think that you can get it down to two passes - on the first pass, build the AST and whenever you find a variable name, add it to a list that contains that block's symbols (it would probably be useful to add that list to the corresponding scope in the tree). Step two is to linearly traverse the tree and make sure that each symbol used references a symbol in that scope or a scope above it.
My description is oversimplified but the basic answer is -- lookahead requires at least two passes.
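The two-pass idea from these answers can be sketched as follows. This is written in Java rather than as bison/flex actions, and everything in it (the ClassDecl record of declared members and used names, the "Class.member" string format, the check method) is a made-up simplification standing in for a real AST.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class TwoPassCheck {
    // Hypothetical stand-in for a parsed class: its name, the members it declares,
    // and the "Class.member" references found inside its method bodies.
    static final class ClassDecl {
        final String name;
        final List<String> members;
        final List<String> uses;
        ClassDecl(String name, List<String> members, List<String> uses) {
            this.name = name;
            this.members = members;
            this.uses = uses;
        }
    }

    static List<String> check(List<ClassDecl> classes) {
        // Pass 1: record every declared class and member, ignoring method bodies.
        Map<String, Set<String>> symbols = new HashMap<>();
        for (ClassDecl c : classes) {
            symbols.put(c.name, new HashSet<>(c.members));
        }
        // Pass 2: resolve each reference against the complete table,
        // so declaration order (and file order) no longer matters.
        List<String> errors = new ArrayList<>();
        for (ClassDecl c : classes) {
            for (String use : c.uses) {
                String[] parts = use.split("\\.", 2); // assumes the "Class.member" format
                Set<String> members = symbols.get(parts[0]);
                if (members == null) {
                    errors.add("unknown class " + parts[0]);
                } else if (parts.length < 2 || !members.contains(parts[1])) {
                    errors.add("unknown member " + use);
                }
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        ClassDecl a = new ClassDecl("A", Arrays.asList("callMethods", "print", "v"),
                Arrays.asList("A.print", "B.notYetSeen", "A.v"));
        ClassDecl b = new ClassDecl("B", Arrays.asList("notYetSeen"), Arrays.<String>asList());
        System.out.println(check(Arrays.asList(a, b))); // prints []
    }
}
```

Because pass 1 runs over every class before pass 2 starts, the reference to B.notYetSeen inside A resolves even though B is declared later.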
The usual approach is to save B as "unknown". It's probably some kind of type (because of the place where you encountered it). So you can just reserve the memory (a pointer) for it even though you have no idea what it really is.
For the method call, you can't do much. In a dynamic language, you'd just save the name of the method somewhere and check whether it exists at runtime. In a static language, you can save it under "unknown methods" somewhere in your compiler along with the unknown type B. Since method calls eventually translate to a memory address, you can again reserve the memory.
Then, when you encounter B and the method, you can clear up your unknowns. Since you know a bit about them, you can say whether they behave like they should or if the first usage is now a syntax error.
So you don't have to read all files twice, but doing so surely makes things simpler.
Alternatively, you can generate these header files as you encounter the sources and save them somewhere where you can find them again. This way, you can speed up the compilation (since you won't have to consider unchanged files in the next compilation run).
Lastly, if you write a new language, you shouldn't use bison and flex anymore. There are much better tools by now. ANTLR, for example, can produce a parser that can recover after an error, so you can still parse the whole file. Or check this Wikipedia article for more options.

Writing a TemplateLanguage/ViewEngine

Aside from getting any real work done, I have an itch. My itch is to write a view engine that closely mimics a template system from another language (Template Toolkit/Perl). This is one of those "if I had time" / "do it to learn something new" kinds of projects.
I've spent time looking at CoCo/R and ANTLR, and honestly, it makes my brain hurt, but some of CoCo/R is sinking in. Unfortunately, most of the examples are about creating a compiler that reads source code, but none seem to cover how to create a processor for templates.
Yes, those are the same thing, but I can't wrap my head around how to define the language for templates where most of the source is the html, rather than actual code being parsed and run.
Are there any good beginner resources out there for this kind of thing? I've taken a gander at Spark, which didn't appear to have the grammar in the repo.
Maybe that is overkill, and one could just text-replace the template syntax with C# in the file and compile it. http://msdn.microsoft.com/en-us/magazine/cc136756.aspx#S2
If you were in my shoes and weren't a language creating expert, where would you start?
The Spark grammar is implemented with a kind-of-fluent domain specific language.
It's declared in a few layers. The rules which recognize the HTML syntax are declared in MarkupGrammar.cs; those are based on grammar rules copied directly from the XML spec.
The markup rules refer to a limited subset of C# syntax rules declared in CodeGrammar.cs. They are a subset because Spark only needs to recognize enough C# to change single quotes around strings to double quotes, match curly braces, etc.
The individual rules are delegates of type ParseAction<TValue>, which accept a Position and return a ParseResult. ParseResult is a simple class containing the TValue data item parsed by the action and a new Position instance that has been advanced past the content that produced the TValue.
That isn't very useful on its own until you introduce a small number of operators, as described in Parsing expression grammar, which can combine single parse actions to build very detailed and robust expressions about the shape of different syntax constructs.
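To make the shape of that technique concrete, here is a minimal sketch in Python rather than C#. It is not Spark's API: the names `ch`, `seq`, `alt`, `rep` and `build` are invented for this example, and a parse action is simply a function that takes the text and a position and returns `(value, new_position)`, or `None` on failure:

```python
def ch(predicate):
    """Parse action matching one character that satisfies predicate."""
    def parse(text, pos):
        if pos < len(text) and predicate(text[pos]):
            return text[pos], pos + 1
        return None
    return parse


def seq(*actions):
    """Sequence operator: every action must match, one after another."""
    def parse(text, pos):
        values = []
        for action in actions:
            result = action(text, pos)
            if result is None:
                return None
            value, pos = result
            values.append(value)
        return values, pos
    return parse


def alt(*actions):
    """Ordered choice operator: the first action that matches wins."""
    def parse(text, pos):
        for action in actions:
            result = action(text, pos)
            if result is not None:
                return result
        return None
    return parse


def rep(action, minimum=0):
    """Repetition operator: match the action as many times as possible."""
    def parse(text, pos):
        values = []
        while True:
            result = action(text, pos)
            if result is None or result[1] == pos:  # stop on failure or no progress
                break
            value, pos = result
            values.append(value)
        return (values, pos) if len(values) >= minimum else None
    return parse


def build(action, fn):
    """Transform the parsed value while keeping the advanced position."""
    def parse(text, pos):
        result = action(text, pos)
        if result is None:
            return None
        value, new_pos = result
        return fn(value), new_pos
    return parse


# A tiny grammar: name = letter (letter | digit)*, open_tag = '<' name '>'
letter = ch(str.isalpha)
digit = ch(str.isdigit)
name = build(seq(letter, rep(alt(letter, digit))), lambda v: v[0] + "".join(v[1]))
open_tag = build(seq(ch(lambda c: c == "<"), name, ch(lambda c: c == ">")),
                 lambda v: ("tag", v[1]))

result = open_tag("<h1>hello", 0)
print(result)
```

Each operator returns another parse action, so larger rules are built purely by composing smaller ones.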
The technique of using a delegate as a parse action came from Luke H's blog post Monadic Parser Combinators using C# 3.0. I also wrote a post about Creating a Domain Specific Language for Parsing.
It's also entirely possible, if you like, to reference the Spark.dll assembly and inherit a class from the base CharGrammar to create an entirely new grammar for a particular syntax. It's probably the quickest way to start experimenting with this technique, and an example of that can be found in CharGrammarTester.cs.
Step 1. Use regular expressions (regexp substitution) to split your input template string into a token list. For example, split
hel<b>lo[if foo]bar is [bar].[else]baz[end]world</b>!
to
write('hel<b>lo')
if('foo')
write('bar is ')
substitute('bar')
write('.')
else()
write('baz')
end()
write('world</b>!')
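Step 1 could be sketched in Python as follows. The tuple representation of tokens and the `tokenize` name are assumptions of this sketch; the tag syntax is the `[if x]`, `[else]`, `[end]`, `[x]` form used in the example above. Note that literal text keeps its whitespace, so the third token is `'bar is '` with a trailing space:

```python
import re

# Anything between square brackets is treated as a tag.
TAG = re.compile(r"\[([^\[\]]*)\]")


def tokenize(template):
    """Split a template into write / if / else / end / substitute tokens."""
    tokens = []
    pos = 0
    for match in TAG.finditer(template):
        if match.start() > pos:
            # Literal text between the previous tag and this one.
            tokens.append(("write", template[pos:match.start()]))
        words = match.group(1).split()
        if words == ["else"]:
            tokens.append(("else",))
        elif words == ["end"]:
            tokens.append(("end",))
        elif len(words) == 2 and words[0] == "if":
            tokens.append(("if", words[1]))
        elif len(words) == 1:
            tokens.append(("substitute", words[0]))
        else:
            raise ValueError(f"bad tag: {match.group(0)!r}")
        pos = match.end()
    if pos < len(template):
        tokens.append(("write", template[pos:]))  # trailing literal text
    return tokens


tokens = tokenize("hel<b>lo[if foo]bar is [bar].[else]baz[end]world</b>!")
for token in tokens:
    print(token)
```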
Step 2. Convert your token list to a syntax tree:
* Sequence
** Write
*** ('hel<b>lo')
** If
*** ('foo')
*** Sequence
**** Write
***** ('bar is ')
**** Substitute
***** ('bar')
**** Write
***** ('.')
*** Write
**** ('baz')
** Write
*** ('world</b>!')
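The nodes of this tree can be represented by classes such as:
// Base type for every node in the syntax tree.
abstract class Instruction {
}
// Emits literal template text.
class Write : Instruction {
    public string text;
}
// Emits the value of a template variable.
class Substitute : Instruction {
    public string varname;
}
// Runs its child instructions in order.
class Sequence : Instruction {
    public Instruction[] items;
}
// Runs one of two branches depending on the condition.
class If : Instruction {
    public string condition;
    public Instruction then;
    public Instruction @else;  // "else" is a C# keyword, so it needs the @ prefix
}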
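A rough Python sketch of Step 2, turning a token list into such a tree with a small recursive function. The token tuples are assumed to have the shape `("write", text)`, `("if", name)`, `("else",)`, `("end",)`, `("substitute", name)`; the branch field is called `otherwise` because `else` is a keyword in Python, and unlike the tree drawn above both branches are always wrapped in a `Sequence`:

```python
from dataclasses import dataclass, field


@dataclass
class Write:
    text: str


@dataclass
class Substitute:
    varname: str


@dataclass
class Sequence:
    items: list = field(default_factory=list)


@dataclass
class If:
    condition: str
    then: Sequence
    otherwise: Sequence


def parse_sequence(tokens, pos):
    """Collect instructions until 'else', 'end' or the end of the input.

    Returns (Sequence, next position, kind of the token that stopped us or None).
    """
    seq = Sequence()
    while pos < len(tokens):
        kind = tokens[pos][0]
        if kind in ("else", "end"):
            return seq, pos + 1, kind
        if kind == "write":
            seq.items.append(Write(tokens[pos][1]))
            pos += 1
        elif kind == "substitute":
            seq.items.append(Substitute(tokens[pos][1]))
            pos += 1
        elif kind == "if":
            condition = tokens[pos][1]
            then, pos, stop = parse_sequence(tokens, pos + 1)
            otherwise = Sequence()
            if stop == "else":
                otherwise, pos, stop = parse_sequence(tokens, pos)
            if stop != "end":
                raise SyntaxError("if without matching end")
            seq.items.append(If(condition, then, otherwise))
        else:
            raise ValueError(f"unknown token kind: {kind!r}")
    return seq, pos, None


def parse(tokens):
    tree, _, stop = parse_sequence(tokens, 0)
    if stop is not None:
        raise SyntaxError(f"unexpected {stop!r} outside of an if")
    return tree


tokens = [("write", "hel<b>lo"), ("if", "foo"), ("write", "bar is "),
          ("substitute", "bar"), ("write", "."), ("else",), ("write", "baz"),
          ("end",), ("write", "world</b>!")]
tree = parse(tokens)
print(tree)
```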
Step 3. Write a recursive function (called the interpreter), which can walk your tree and execute the instructions there.
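A minimal sketch of such an interpreter in Python. It assumes that a condition is just a variable name tested for truthiness and that missing variables render as an empty string; neither choice is specified above. The node classes are redefined here so that the example runs on its own:

```python
from dataclasses import dataclass


@dataclass
class Write:
    text: str


@dataclass
class Substitute:
    varname: str


@dataclass
class Sequence:
    items: list


@dataclass
class If:
    condition: str
    then: object
    otherwise: object  # "else" is a keyword in Python


def render(node, variables):
    """Walk the tree recursively and return the rendered text."""
    if isinstance(node, Write):
        return node.text
    if isinstance(node, Substitute):
        return str(variables.get(node.varname, ""))
    if isinstance(node, Sequence):
        return "".join(render(item, variables) for item in node.items)
    if isinstance(node, If):
        branch = node.then if variables.get(node.condition) else node.otherwise
        return render(branch, variables)
    raise TypeError(f"unknown instruction: {node!r}")


tree = Sequence([
    Write("hel<b>lo"),
    If("foo",
       Sequence([Write("bar is "), Substitute("bar"), Write(".")]),
       Write("baz")),
    Write("world</b>!"),
])

with_foo = render(tree, {"foo": True, "bar": 42})
without_foo = render(tree, {})
print(with_foo)
print(without_foo)
```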
An alternative approach (instead of steps 1-3), if your language supports eval() (as Perl, Python, and Ruby do): use a regexp substitution to convert the template to an eval()-able string in the host language, and run eval() to instantiate the template.
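In Python, that approach might look like the sketch below. The helper names (`pick`, `to_code`, `compile_template`, `render_with_eval`) are invented here, conditions are assumed to be plain variable names, and since the result is passed to `eval()` this should only be used on templates you trust:

```python
import re

# Either a tag ([if x], [else], [end], [x]) or a run of literal text.
PIECE = re.compile(r"\[(?:if\s+(\w+)|(else)|(end)|(\w+))\]|([^\[]+|\[)")


def pick(condition, then, otherwise=lambda: ""):
    """Runtime helper called by the generated code for [if] blocks."""
    return then() if condition else otherwise()


def to_code(match):
    """Translate one piece of the template into a piece of Python source."""
    cond, is_else, is_end, var, text = match.groups()
    if cond:
        return f"pick(v.get({cond!r}), lambda: ''.join(["
    if is_else:
        return "]), lambda: ''.join(["
    if is_end:
        return "])),"
    if var:
        return f"str(v.get({var!r}, '')),"
    return repr(text) + ","  # literal text, safely quoted


def compile_template(template):
    """Regexp substitution: template text in, Python expression out."""
    return "''.join([" + PIECE.sub(to_code, template) + "])"


def render_with_eval(template, variables):
    source = compile_template(template)
    # Unbalanced [if]/[end] tags surface as a SyntaxError raised by eval().
    return eval(source, {"pick": pick, "v": variables})


template = "hel<b>lo[if foo]bar is [bar].[else]baz[end]world</b>!"
with_foo = render_with_eval(template, {"foo": True, "bar": 42})
without_foo = render_with_eval(template, {})
print(compile_template(template))
print(with_foo)
print(without_foo)
```

Literal text goes through `repr()` and variable names are restricted to `\w+`, so the template itself cannot inject arbitrary expressions into the generated code.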
There are so many things still to do, but it does work for one simple GET statement plus a test. That's a start.
http://github.com/claco/tt.net/
In the end, I already had too much time invested in ANTLR to give loudejs' method a go. I wanted to spend a little more time on the whole process rather than on the parser/lexer. Maybe in version 2 I can have a go at the Spark way, once my brain understands things a little more.
Vici Parser (formerly known as LazyParser.NET) is an open-source tokenizer/template parser/expression parser which can help you get started.
If it's not what you're looking for, then you may get some ideas by looking at the source code.