What is a good, simple scripting language to embed into a macro-processor - scripting

I want to write a macroprocessor. So far I've done a very simple sketch of how it should look and I came to the conclusion that inventing a completely new language would not be a good idea but I should reuse existing concepts. My sketch so far is a kind of irb with some tex-alike syntax and features, but I'm not sure what I should use as ruby-substitute.
The language should be simple, yet powerful. I don't want to write an OS in it, but it should be less "raw" than e.g. bc or forth. I don't care about execution time at all. Embedding should not be too hard and it'll be nice if the language itself was stable.
So far I've considered these:
Lua - It should process text easily. Lua does not even have a while(c=getchar()){}. I'm skeptic.
awk - Simple, text processing is easy, but never intended for embedding
perl - Way to complex, stable, but it seems almost dead.
python - Significant whitespaces; won't they get in the way for inlined function-definitions?
groovy/nice/java - Hard/impossible to embed? Also way to heavy.
javascript - Really like it (besides DOM) but is there a stable/embeddable implementation? I don't want to mess around with the api every 2 weeks when there's a new v8 version. As I said, I don't care about execution time.
I have not really found any pros/cons for
io
guile/scheme
TCL
Update: The language should have features such as function-definition, library-loading or regexps (loops would also be very nice) I don't want to use a traditional macro-language such as M4 because I want to able to write in a more procedural (or maybe functional) style. Macro languages have their pros, but I requires a completely new way of thinking about a problem which is hard especially for beginners. My Aim is to use the best of both worlds.

Given that TCL is about string and array processing, and is intended for embedding, it would seem an obvious choice.
Luatex has a certain following. Presumably they have found a way to make it work for text processing, so you might like to look at that.
Scheme (including guile) is also very nice for scripting; alternatively you might look at whether there is a way you could embed an elisp processor (embed xemacs?), which after all is all about text processing.

Related

Avoid using new language features because unfamiliar to most programmers?

While reading "Python scripting for computational science" I came across the following text in the section discussing generators:
Whether to rapidly write a generator or to implement the class methods __iter__ and __next__ depends on the application, personal tast, readibility, and complexity of the iterator. Since generators are very compact and unfamiliar to most programmers, the code often becomes less readable than a corresponding version using __iter__ and __next__.
This led me to wonder whether unfamiliarity (of other programmers) is a good reason NOT to use relatively new and powerful features of a language (like Python generators).
If you don't use it, how can it ever become popular and familiar?
So, my question: is unfamiliarity sometimes a good reason not to use new language features?
Your own unfamiliarity with a language feature may be a good reason to tread lightly. For example, in C#, if you aren't certain about the differences between object y = func1() ?? func2(); and object y = func1() != null ? func1() : func2(); (hint: left-to-right order of evaluation), then maybe you are better off writing the corresponding if clause just because it's clearer what actually is going on. Someone who knows the nuances of the language better may very well come around and refactor later, and in the meantime, the cost is usually low.
However, if you know how to use a language feature, I see little reason to avoid using it simply because others may find it difficult to understand. If you really feel the need to, then add a comment (such as, perhaps, "?? is the _null coalescing operator_") to help fellow developers know what to look for if they can't figure out from the code alone what it is doing, and you are afraid that they may have to go it alone.
This, mind you, is about production code. Experimenting certainly has its place, but its place is not necessarily in the mainline codebase. I always keep a "scratch" project handy for when I want to try something out without risking impact to anything else. There, I often take liberties far beyond those I take in production or to-be-production code.
I wouldn't say that unfamilarity is a good reason to not use new language features. Or for that matter, use new languages.
Lack of support for a new feature across tool vendors could be a reason if you have any concerns about working with multiple vendors.
Since the question is subjective, I'll express the contrary opinion.
If you work where there are code reviews, you'll find out soon enough what your co-workers consider "unfamiliar".
Since they also have to maintain the code, you can try and help them become familiar with the "unfamiliar" code. But, it's ultimately a judgment call, and sometimes, what you think is clear code, isn't.

Converting Actionscript syntax to Objective C

I have a game I wrote in Actionscript 3 I'm looking to port to iOS. The game has about 9k LOC spread across 150 classes, most of the classes are for data models, state handling and level generation all of which should be easy to port.
However, the thought of rejiggering the syntax by hand across all these files is none too appealing. Are there tools that can help me speed up this process?
I'm not looking for a magical tool here, nor am I looking for a cross compiler, I just want some help converting my source files.
I don't know of a tool, but this is the way I'd try and attack your problem if there really is a lot of (simple) code to convert. I'm sure my suggestion is not that useful on parts of the code that are very flash-specific (all the DisplayObject stuff?) and also not that useful on lots of your logic. But it would be fun to build! :-)
Partial automatic conversion should be possible, especially if the objects are just 'data containers', watch out for bringing too much as3-idiom over to objective-c though, it might not always be a good fit.
Unless you want to create your own (semi) parser for as3 you'd need some sort of a parser, apparently FlexPMD has one (never used it), and there probably are others.
After getting your hands on a parser you have to find some way of suggesting to the system what parts could be converted automatically. You could try and add rules to the parser/generator script for the general case. For more specific cases I'd use custom metadata on the actual class/property/method, assuming a real as3 parser would correctly parse those.
Now part of your work will shift from hand-converting files to hand-annotating files, but that might be ok for you.
Have the parser parse your classes and define actions based on your metadata that will determine what kind of objective-c class to generate. If you get this working it could at least get you all your classes, their simple properties and method signatures (getting the body of the methods converted might be a bit too much to ask but you could include it as a comment so you'd have a nice reference while hand-translating).
PS: if you make this into a one way process be very sure you don't need to re-generate it later - it would be bad if you find out that you have been modifying the generated code and somehow need to re-generate all those classes -- that would mean you'll have to redo all your hard work!
I've started putting a tool together to take the edge off the menial aspects of this process.
I'm trying to figure out if there's enough interest to make it clean and stable enough to release for others to use. I may just do it anyway.
http://meanwhileatthelab.blogspot.com.au/2012/08/automating-process-of-converting-as3-to.html
It's so far saving me a lot of time while porting one of my fairly large games from AS3 to objc.
Check out the Sparrow Framework. It's purported to be designed with Actionscript developers in mind, recreating classes that sort of emulate display list and things like that. You'll have to dive into some "rejiggering" for sure no matter what you do if you don't want to use the CS5 packager.
http://www.sparrow-framework.org/
even if some solution exists, note that architectural logic is DIFFERENT, and many more other details.
Anyway even if posible, You will have a strange hybrid.
I am coming back from WWDC2012, and the message is (as always..) performance anf great user experience.
So You should rewrite using a different programming model.

Real time scripting language + MS DLR?

For starters I should let you guys know what I'm trying to do. The project I'm working on has a requirement that requires a custom scripting system to be built. This will be used by non-programmers who are using the application and should be as close to natural language as possible. An example would be if the user needs to run a custom simulation and plot the output, the code they would write would need to look like
variable input1 is 10;
variable input2 is 20;
variable value1 is AVERAGE(input1, input2);
variable condition1 is true;
if condition1 then PLOT(value1);
Might not make a lot of sense, but its just an example. AVERAGE and PLOT are functions we'd like to define, they shouldn't be allowed to change them or really even see how they work. Is something like this possible with DLR? If not what other options would we have(start with ANTRL to define the grammar and then move on?)? In the future this may need to run using XBAP and WPF too, so this is also something we need to consider, but haven't seen much if anything on dlr & xbap. Thanks, and hopefully this all makes sense.
Lua is not an option as it is to different from what they are already accustomed to.
Ralf, its going to reactive, and to be honest the timeframe for when the results should get back to the user may be 1/100 of a second all the way up to 2 weeks or a month(very complex mathematical functions).
Basically they already have a system they purchased that does some of what they need, and included a custom scripting language that does what I mentioned above and they don't want to have to learn a new one, they basically just want us to copy it and add functionality. I think I'll just start with ANTRL and go from there.
Lua
it's small, fast, easy to embed, portable, extensible, and fun!
Lua is definitly the best choice for soft real-time system (like computer games).
See http://shootout.alioth.debian.org/ for detailed benchmarks.
However, last time I checked, Lua used a mark-and-sweep garbage collector which can lead to deadline-violation and non-deterministic jitter in real-time systems.
I believe that you could use theoretically use the DLR, but I'm unsure about support in an XBAP (partially trusted?) scenario.
If you host the DLR you would quickly be able to take advantage of IronRuby or IronPython scripting. You would want to look at these implementations when creating your own language implementation. If you post your question to the IronPython mailing list I'm sure you would get a better reply around the XBAP scenario, and some of the developers there created ToyScript.
What kind of real-time requirement are you trying to fulfill? Is the simulation a hard real-time simulation (some kind of hardware-in-the-loop simulation ==> deadline is less than 1/1000 second)?
Or do you want the scripting-system to be "reactive" to user-input ==> 1/10 should be sufficient.
I am no expert regarding MS DLR, but as far as I know, it does not support hard real-time systems. You may want to take a look at the real-time specification for Java (RTSJ)
Firstly I think that defining your own language is not the way to go.
Primarily because the biggest productivity gains you can get for programmers or non-programmers are the development tools. You (and 99.9% of the rest of us) are not going to write tools as good as what is out their.
Language design is hard.
Language support and documentation, also hard
I would recommend looking for a pre-built solution. If you could find a language that can lock down some functionality, that would be a good starting point. MatLab would be the first that comes to my mind.
Lastly, ditch the natural language part, BASIC, COBOL and YA-TDWTF-Lang all tried and failed at it.
Full disclosure: I work for a company that is developing a generalized domain specific language "system". It's targeted at data-in/text-out applications so it's not apropos and it's not yet to beta. The result is I'm somewhat knowledgeable and biased.

Moving from static language to dynamic

There are a lot of discussions all over the internet and on SO, i.e. here and here, about static vs dynamic languages.
I'm not going to ask again about one vs another. Instead, my question is for those who moved (or at least tried to move) from static typed language to dynamic.
I'm not talking about moderate usage of JS on your web page or other scripting language embedded into statically typed software or small personal scripts. I mean moving to dynamic language as your primary general purpose language for developing production quality software in team.
Was that easy? What was the biggest advantage and the biggest challenge? Was it fun? :)
UPD: Did you find IDE support good enough? Did you find that you need less IDE support?
Was that easy?
Moderately. Some Java-isms are hard habits to break. My first six months, I wrote Python with ;'s. Icky. Once I was over it, though, I haven't looked back.
What was the biggest advantage?
Moving from the "write -> compile -> build -> run -> break -> debug -> write" cycle to a "write -> run -> break -> write" cycle. It takes time to get used to immediate gratification from the Python command-line interpreter. I was soooo used to endless design and planning before attempting to write (much less compile) any code.
At first I considered the python command line to be a kind of "education-only" interface. Then reading docstrings, doctests, and user guides where the application is being typed at the >>> prompt, I started to realize that the truly great Python software boils complexity and nuance down to stuff you can type interactively.
[I wish I could design stuff that worked that cleanly.]
What was the biggest challenge?
Multiple inheritance. I use it very rarely.
Was it fun?
So far.
It's also amazingly productive. More time with user requirements and real data. Less time planning an inheritance hierarchy with proper interfaces to capture meaning and compile correctly and be extensible enough to last at least to the next revision.
If I were you, I would try Scala!!!.
Scala has some aspects really interesting that lets you feel like doing dynamic, while doing static.
Scala is a statically typed language
with dynamic typed smell, because the
compiler makes you less repetitive
inferring your assignments.
A compiled language with a warm and
wonderful script flavor.Cause you can use the scala console, or even write scripts just like ruby or python. So you can choose between "write -> compile -> build -> run -> break -> debug -> write" or "write -> run -> break -> write" as S.Lott said.
Scala is a complete Functional
language with full support for OO. So you don't lose many important OO aspects like inheritance, encapsulation, polymorphism, etc.
Why answering you questions suggesting Scala? Because I tryed script languages before, and the main was Ruby. And it was just like S.Lott said. But not so easy for me and my team. Most of time static is safe, less error prone, and even faster if you have the right language.
Answering you three questions putting Scala inside we have:
Was that easy?
Yes. Sometimes you need to concentrate to leave you old concepts aside and go deep.
What was the biggest advantage?
You feel in home cause you don't need to change you environment or rewrite existing applications to migrate to Scala (talking about Java). If you come from Java, you can start playing with Scala after reading some articles. Not too much effort. Another important advantage is the use of a functional language en its embedded power.
Was it fun?
Sure! Changing your mind, changing your way to solve problems to the best is for sure funny.
This is my vision. You don't exactly need to leave off static to grab the advantage of dynamic.
Nice question.
I am now working in Ruby, PHP and ActionScript (the least dynamic of the three) instead of languages that I would prefer, like Java and C#. But beggars, I mean, workers in this economy, can't be choosers. Or rather, you have to choose your battles and your master.
It's hard to compare Ruby and Java because they've got more than one difference, and you only asked about the dynamic/static thing (and not even about the strongly vs. weakly-typed thing!). But on that front, what affects me most is always the IDE. I was always horrified when other Java programmers used Notepad or Textpad to write code, and nowadays there are just too many advantages of a good IDE for that madness. Not true with Ruby! I use Netbeans and it does really well, but one of the main differences is that I have to actually type code. Autocomplete, for me, was/is a way of life (I write SMS messages in full English/Spanish with the predictive dictionary, for instance, and never use abbreviations) and writing Ruby code does require more work.
So at first it was painful and I was constantly looking at, for instance, function names of classes that I had written (or that are part of Ruby) just to get the spelling right! So that sucked, I thought, and I continued to think that until...
I moved back to ActionScript the other day, and to get my IDE autocompleting (FlashDevelop or FlexBuilder) I declare all variables with types (strongly-typed by choice, if you will)... and suddenly I thought what a friggin' hassle!
And then today I had to do some feature additions on a Ruby project and it felt free and cool. The code is clean, and why would I be informing the IDE of what I'm trying to write anyway?
So I would say that 1) the biggest challenges are learning the language and the framework you're working in, like always 2) it's been amazingly fun and deeply eye-opening. New languages always carry new things with them, but dynamic languages just feel different. And that's just the kind of thing that gets you to wake up at 7am and do some coding on a Sunday morning before falling asleep again.
I like programming and like most of you, I've spent some time with stored procedures, XSL, static, dynamic, whatever... it's all fun, and they all feel totally different. In the end, the framework you are working in will be the thing that will convince you too stay or not (if you have a choice), I think, but languages are to learned, studied and experienced, not compared.
I can't qualify myself fully under that handle but I did spend a while writing some an interesting Python mini-game after having spent many years writing Java. So, I might be mixing a little bit of moving from compiled to interpreted along with it.
I found myself using notation to mimic static typing. :)
However, I did find myself cranking code out at a slightly better clip. Having an interpreter is a godsend as far as learning new language/writing new code. The shorter the time between finishing a line of code and seeing it work, the faster you can write, and I think that is probably the best thing most dynamic and interpreted languages.
My code didn't look too different, all things considered. Though, Python has a lot of fun data structures. :)
I'm also interested in this topic.
Tried do dive into Ruby and Rails a while ago, and it really helped me to grasp the ASP.Net MVC stuff, which i think is a bit too chalenging at first for average .net developer.
If you're interested more on moving in this direction, or curious about how some developers moved from static to dynamic languages as their full time jobs, i highly recommend this Alt.Net podcast.

Which scripting language to support in an existing codebase?

I'm looking at adding scripting functionality to an existing codebase and am weighing up the pros/cons of various packages. Lua is probably the most obvious choice, but I was wondering if people have any other suggestions based on their experience.
Scripts will be triggered upon certain events and may stay resident for a period of time. For example upon startup a script may define several options which the program presents to the user as a number of buttons. Upon selecting one of these buttons the program will notify the script where further events may occur.
These are the only real requirements;
Must be a cross-platform library that is compilable from source
Scripts must be able to call registered code-side functions
Code must be able to call script-side functions
Be used within a C/C++ codebase.
Based on my own experience:
Python. IMHO this is a good choice. We have a pretty big code base with a lot of users and they like it a lot.
Ruby. There are some really nice apps such as Google Sketchup that use this. I wrote a Sketchup plugin and thought it was pretty nice.
Tcl. This is the old-school embeddable scripting language of choice, but it doesn't have a lot of momentum these days. It's high quality though, they use it on the Hubble Space Telescope!
Lua. I've only done baby stuff with it but IIRC it only has a floating point numeric type, so make sure that's not a problem for the data you will be working with.
We're lucky to be living in the golden age of scripting, so it's hard to make a bad choice if you choose from any of the popular ones.
I have played around a little bit with Spidermonkey. It seems like it would at least be worth a look at in your situation. I have heard good things about Lua as well. The big argument for using a javascript scripting language is that a lot of developers know it already and would probably be more comfortable from the get go, whereas Lua most likely would have a bit of a learning curve.
I'm not completely positive but I think that spidermonkey your 4 requirements.
I've used Python extensively for this purpose and have never regretted it.
Lua is has the most straight-forward C API for binding into a code base that I've ever used. In fact, I usually quickly roll bindings for it by hand. Whereas, you often wouldn't consider doing so without a generator like swig for others. Also, it's typically faster and more light weight than the alternatives, and coroutines are a very useful feature that few other languages provide.
AngelScript
lets you call standard C functions and C++ methods with no need for proxy functions. The application simply registers the functions, objects, and methods that the scripts should be able to work with and nothing more has to be done with your code. The same functions used by the application internally can also be used by the scripting engine, which eliminates the need to duplicate functionality.
For the script writer the scripting language follows the widely known syntax of C/C++ (with minor changes), but without the need to worry about pointers and memory leaks.
The original question described Tcl to a "T".
Tcl was designed from the beginning to be an embedded scripting language. It has evolved to be a first class dynamic language in its own right but still is used all over the world as an embeded language. It is available under the BSD license so it is just about as free as it gets. It also compiles on pretty much any moden platform, and many not-so-modern. And not only does it work on desktop systems, there are variations available for mobile platforms.
Tcl excels as a "glue" language, where you can write performance-intensive functions in C while still benefiting from the advantages of a scripting language for less performance critical parts of the application.
Tcl also comes with a first class GUI toolkit (Tk) that is arguably one of the easiest cross platform GUI toolkits available. It also interfaces very nicely with SQLite and other databases, and has had built-in support for unicode for quite some time.
If the scripting interface will be made available to your customers (as opposed to simply enabling your own engineers to work at the scripting level), Tcl is extremely easy to learn as there are a total of only 12 rules that govern the entire language (as of tcl 8.6). In fact, Tcl shines as a way to invent domain specific languages which is often how it is used as an end-user scripting solution.
There were some excellent suggestions already, but I just wanted to mention that Perl can also be called / can call to C/C++.
You probably could use any modern scripting / bytecode language.
If you're willing to put up with the growing pains of a new product, you could use the Parrot VM. Which has support for many, if not all of the languages listed on this page. Unfortunately it's not done yet, but that hasn't stopped some people from using it in a production environment.
I think most people are probably mentioning the scripting language that they are most familiar with. From my perspective, Tcl was designed specifically to interface with C, so your problem domain is tailor-made for the language. However, I'm sure Python, Perl, or Lua would be fine. You should probably choose the language that is most familiar to your current team, since that will reduce the learning time.