Embed a language interpreter but hook/control variable resolution?

Sorry if the topic title does not convey the problem.
As part of a project, we want to expose an expressive language to the user, mostly for defining simple expressions, but possibly also for writing procedures and any complex calculations they might want to do with the data.
The natural choice would be to expose an entire language like Python (maybe with some project-specific functions to ease user programmability) and then invoke a Python interpreter from the application code. That is fine.
However, the requirement is that in this language any variable resolution (say $data, etc.) needs to be done by our code, since the value needs to be fetched from various sources. Once the data is fetched, the embedded language (say Python) has complete ownership and can modify it in any way.
So, what might be the most elegant way to do this? We want to embed a language but be able to hook the variable resolution. We could write a preprocessor that checks for the variables, replaces them with the raw data, and then hands the result to the embedded language interpreter. But we would prefer a hook mechanism, so that we are called to resolve each variable.
Hope the question is clear, and thanks in advance.

Lua. www.lua.org
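
If Python ends up being the embedded language, one possible way to get such a hook is to pass a custom mapping as the locals argument of eval/exec. The sketch below is only an illustration under stated assumptions: ResolvingNamespace, fetch and SOURCES are hypothetical names invented here, and SOURCES stands in for whatever the real data sources are. Names are written without the $ sigil, since that is not valid Python.

```python
class ResolvingNamespace(dict):
    """Namespace that asks the host application for names it does not hold yet."""

    def __init__(self, resolver):
        super().__init__()
        self._resolver = resolver

    def __missing__(self, name):
        # dict.__getitem__ calls this when `name` is absent.
        # The resolver must raise KeyError for names it does not own,
        # so that lookup falls through to globals and builtins (e.g. `sum`).
        value = self._resolver(name)
        self[name] = value  # cache it: from here on the script owns the value
        return value


# Hypothetical stand-in for the application's real data sources.
SOURCES = {"data": [1, 2, 3], "rate": 0.5}
fetched = []  # records which names the host was asked to resolve


def fetch(name):
    value = SOURCES[name]  # raises KeyError for unknown names
    fetched.append(name)
    return value


namespace = ResolvingNamespace(fetch)

# Expression: `data` and `rate` are resolved by fetch(), `sum` by builtins.
result = eval("sum(data) * rate", {}, namespace)

# Statement: assignments made by the script land in the same namespace.
exec("total = sum(data) + len(data)", {}, namespace)
```

Note that this only covers top-level code: names used inside functions that the script itself defines are looked up in globals, which has to be a plain dict, so a fuller solution would need more work (or a language such as Lua, where metatables on the environment table serve the same purpose).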

Related

How to improve my programming language?

Hello everybody. I have been making a programming language for some time now, as a good learning experience and to improve my programming skills.
It could use a lot of improvement, and I'm sure somebody might be willing to find things that I can improve or help make it better.
My programming language has syntax, arrays, etc. very similar to PHP, but with macro features like AutoIt and the regex syntax of Perl. It is a bit of a mix and match of the features I liked most in other languages.
It includes a vast number of functions, as shown in the Function Reference, and its language features and syntax are fairly extensively documented in the Language Reference.
I'm looking to improve my language in every way possible, which includes but is not limited to finding bugs and creating test cases (to test all features and report success).
I'm looking for people willing to help out, test it, or try making things with it to see how well it goes, or who might simply find it useful and enjoy it.
If you can help, or know somebody who would be willing to help improve my language, let me know.
Project Goals:
Get everything tested and make sure it all works (I estimate that at least 90% works correctly)
Create a test script for every function where all tests get run from a single script (partially started)
Create a new GUI system (the current one works fine and can produce good applications but needs a remake)
Add another 1000 functions (Specifically all the stuff that's lacking such as Date/XML etc)
Create a series of games in Sputnik (so far I have completed one game, a full Pengo remake of the original Amiga game; it looks, works and runs exactly the same as the original)
Create a proper IDE for Sputnik (even if that means using Eclipse or something); the current IDE is made in Sputnik, and although it's not bad, it lacks a ton of stuff that more advanced IDEs have
Complete the XNA library for Sputnik or drop it in favor of SDL/OpenGL
Support all .NET types natively (currently this is only partial, yet surprisingly good; the DotNet page on the wiki shows what's done so far)
A Linux+Mac DLL needs creating to provide the additional platform-specific features to Sputnik on those platforms (Sputnik runs on Mono); currently only Windows gets a beefy DLL, which provides around 200 extra functions to Sputnik
Finalize the grammar so it never changes; to do this it will need to be perfect
Fix the wiki so all functions have correct argument names (instead of expression, expression2) and also fix all Return 1,0 to true/false (Boolean was added later to Sputnik after hundreds of functions were documented...)
Make very extensive Win32 include scripts for Sputnik (Sputnik supports calling DLL files and creating C++-style structures for use in such DLL calls) so it can use all Windows APIs directly
I want something similar to LPEG for Sputnik. Of course Sputnik's built-in parser is very powerful, but it requires an IDE to generate its grammar sheets, whereas something like LPEG could be done in user code
I want to complete LINQ in Sputnik. So far only Where() is complete, and that is just a prototype; it wants a complete LINQ implementation (I like LINQ)
Design philosophy
Sputnik supports the Perl idea of "There is an insane number of ways to do one thing" (as demonstrated by having an Unless to go with the If, and so on). That said, Sputnik code can be very clear, simple and easy to understand.
I believe in strongly shortened code and will always seek to use the least code possible to get the job done (as long as it's the fastest).
Sputnik includes the "my" keyword to make a value local in scope only, the same as Perl; this helps with good design.
Operator lgi
Test if first value is lower or higher than the second (both treated as strings, case insensitive).
Who asked for that feature?

Are there any interpreted languages in which you can dynamically modify the interpreter?

I've been thinking about this writing (apparently) by Mark Twain in which he starts off writing in English but throughout the text makes changes to the rules of spelling so that by the end he ends up with something probably best described as pseudo-German.
This made me wonder if there is an interpreter for some established language in which one has access to the interpreter itself, so that you can change the syntax and structure of the language as you go along. For example, an if clause is often a keyword; is there a language that would let you change or redefine this on the fly? Imagine beginning a console session in one language and, by the end, working in another.
Clearly one could write an interpreter and run it, and perhaps there is no concrete distinction between doing this and modifying the interpreter. I'm not sure about this. Perhaps there are limits to the modifications you can make dynamically to any given interpreter?
These more open questions aside, I would simply like to know if there are any known interpreters that allow this at all? Or, perhaps, this ability is just a matter of extent and my question is badly posed.
There are certainly languages in which this kind of self-modifying behavior at the level of the language syntax itself is possible. Lisp programs can contain macros, which allow among other things the creation of new control constructs on the fly, to the extent that two Lisp programs that depend on extensive macro programming can look almost as if they are written in two different languages. Forth is somewhat similar in that a Forth interpreter provides a core set of just a dozen or so primitive operations on which a program must be built in the language of the problem domain (frequently some kind of real-world interaction that must be done precisely and programmatically, such as industrial robotics). A Forth programmer creates an interpreter that understands a language specific to the problem he or she is trying to solve, then writes higher-level programs in that language.
In general the common idea here is that of languages or systems that treat code and data as equivalent and give the user just as much power to modify one as the other. Every Lisp program is a Lisp data structure, for example. This is in contrast to a language such as Java, in which a sharp distinction is made between the program code and the data that it manipulates.
A related subject is that of self-modifying low-level code, which was a fairly common technique among assembly-language programmers in the days of minicomputers with complex instruction sets, and which spilled over somewhat into the early 8-bit and 16-bit microcomputer worlds. In this programming idiom, for purposes of speed or memory savings, a program would be written with the "awareness" of the location where its compiled or interpreted instructions would be stored in memory, and could alter in place the actual machine-level instructions byte by byte to affect its behavior on the fly.
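
To make the "code and data as equivalent" idea from this answer concrete, here is a rough sketch in Python rather than Lisp. It is a much weaker facility than Lisp macros and is offered only as an illustration: the ast module from the standard library parses program text into an ordinary data structure, the program rewrites that structure, and the result is compiled and run. The class name SwapAddForMul is made up for this example.

```python
import ast


class SwapAddForMul(ast.NodeTransformer):
    """Rewrite every `a + b` in the tree into `a * b`."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # rewrite nested expressions first
        if isinstance(node.op, ast.Add):
            node.op = ast.Mult()
        return node


# Program text -> data structure (an abstract syntax tree).
tree = ast.parse("6 + 3", mode="eval")

# Manipulate the program exactly like any other data.
tree = ast.fix_missing_locations(SwapAddForMul().visit(tree))

# Data structure -> running code: the expression now means 6 * 3.
rewritten = eval(compile(tree, "<ast>", "eval"))
```

In Lisp the same round trip needs no separate parsing module, because the program is already written as the data structure.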
Forth is the most obvious thing I can think of. It's concatenative and stack based, with the fundamental atom being a word. So you write a stream of words and they are performed in the order in which they're written with the stack being manipulated explicitly to effect parameter passing, results, etc. So a simple Forth program might look like:
6 3 + .
That is the words 6, 3, + and . (full stop). The two numbers push their values onto the stack. The plus symbol pops the last two items from the stack, adds them and pushes the result. The full stop outputs whatever is at the top of the stack.
A fundamental part of Forth is that you define your own words. Since all words are first-class members of the runtime, in effect you build an application-specific grammar. Having defined the relevant words you might end up with code like:
red circle draw
That would draw a red circle.
Forth interprets each sequence of words when it encounters them. However it distinguishes between compile-time and ordinary words. Compile-time words do things like have a sequence of words compiled and stored as a new word. So that's the equivalent of defining subroutines in a classic procedural language. They're also the means by which control structures are implemented. But you can also define your own compile-time words.
As a net result a Forth program usually defines its entire grammar, including relevant control words.
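
The mechanics described above (a stack, a dictionary of words, and words that define new words) can be sketched in a few lines of Python. This is a toy under stated assumptions, not real Forth: make_interpreter is a hypothetical name, only a handful of words exist, output is collected in a list instead of being printed, and the definition syntax ": name ... ;" is handled directly by the loop rather than by true compile-time words.

```python
def make_interpreter():
    stack = []    # the data stack every word operates on
    output = []   # what the "." word has emitted so far
    words = {
        "+": lambda: stack.append(stack.pop() + stack.pop()),
        "*": lambda: stack.append(stack.pop() * stack.pop()),
        "dup": lambda: stack.append(stack[-1]),
        ".": lambda: output.append(stack.pop()),
    }

    def run(source):
        tokens = iter(source.split())
        for token in tokens:
            if token == ":":
                # Definition: ": name body... ;" stores a new word.
                name = next(tokens)
                body = []
                for t in tokens:
                    if t == ";":
                        break
                    body.append(t)
                words[name] = lambda body=body: run(" ".join(body))
            elif token in words:
                words[token]()            # execute a known word
            else:
                stack.append(int(token))  # anything else is a number literal
        return output

    return run


run = make_interpreter()
run("6 3 + .")                                 # the example from the text
printed = run(": square dup * ; 4 square .")   # define a word, then use it
```

After the second call the user-defined word square is indistinguishable from the built-in ones, which is the sense in which a Forth program grows its own vocabulary.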
You can read a basic introduction here.
Prolog is a homoiconic language, which allows meta-interpreters (MIs) to be written in a variety of ways. A meta-interpreter (an interpreter for the interpreter's own language) is a common and useful native construct in Prolog.
See this page for an introduction to the topic. An interesting and practical technique illustrated there is partial execution:
The overhead incurred by implementing these things using MIs can be compiled away using partial evaluation techniques.

What is the definition of modular scripting in the FileMaker world?

How do you define modular scripting in the FileMaker context? I am not providing my definition yet on purpose. I want to know what you think. Thanks!
A modular script is one that performs a useful function with no external dependencies outside that script. This is in contrast to what I'll call a 'one-shot' script, which takes few or no parameters but has dependencies specific to the file that it is being used in.
The ideal modular script takes zero inputs, performs some useful function, and requires no processing of its results. An example of this would be a script that centers the current window on the screen. Because there are no I/O hookups and nothing to be altered outside the script itself, there is no cost to use this script.
More practical examples will require input parameters and output results. However, keep in mind that as the number and complexity of the parameters increase, the benefit of modularity decreases. There is a tipping point at which the simplicity of 'one-shot', non-modular scripts that require few or no parameters is the better choice.
Modular Scripting in FileMaker embodies the spirit of object oriented programming. I.e., scripts should be modeled as a collection of interoperable functional objects/modules with a narrow focus. In FileMaker, these modules should favor values passed via parameter in lieu of being derived from the current context. Script modules should return results (e.g., success, fail, canceled, etc) as well as values that might be required in a calling script. Larger routines should rely upon many smaller modules to perform a task, allowing you to pinpoint failures easily, and allowing modules to be reused for many tasks.
Modular Scripting is a way of writing scripts so that each and every script, when copied as is to another solution, will simply work properly when performed at any time.
To "work properly" means to correctly recognize its own context and parameters and either perform the correct action or report the correct error/result code in compliance with the documentation which is included with the script as a leading comment.
Modular Scripting in FileMaker adapts the inheritance property of object-oriented programming to the particular grain of how FileMaker works. Modular Scripting aspires to be as copy-and-pastable as possible by recognizing that FileMaker is not an object-oriented platform, but a context-oriented platform.
Modular Scripts may control themselves by value-based parameters passed to them by the calling context or by identifying the operating context for themselves. Modular Scripts may depend on certain patterned structures in a FileMaker system, but may not depend on any particular schema or context beyond what the script is told via parameters or can infer (such as via Get() and Design functions).
For example, a modular "Print Report" script may need to be told what layout to print, and may even require that the found set be sorted by an OnLayoutLoad or OnModeEnter trigger, but a modular Print Report script would rather not require a specific layout named "Print Report Layout" or a specific "Table::SortThis" field unless these are common to multiple distinct applications of the script in a given solution.
So a single Modular Script can be called to perform the same task as appropriate for many different contexts.

How not to repeat yourself across projects and/or languages

I'm working on several distinct but related projects in different programming languages. Some of these projects need to parse filenames written by other projects, and expect a certain filename pattern.
This pattern is now hardcoded in several places and in several languages, making it a maintenance bomb. It is fairly easy to define this pattern exactly once in a given project, but what are the techniques for defining it once and for all for all projects and for all languages in use?
Creating a domain-specific language (DSL), then compiling it into code for each of the target languages you are using, would be the best (and most elegant) solution.
It's not difficult to make a DSL: either embed it in something (like inside Ruby, since it's the 'in' thing right now, or another language like Lisp/Haskell...), or create a grammar from scratch (use ANTLR?). If, as it seems, the project is large, this path is worth your while.
I'd store the pattern in a simple text file and, depending on a particular project:
Embed it in the source at build time (preprocessing)
If the above is not an option, treat it as a config file read at runtime
Edit: I assume the pattern is something no more complicated than a regex, otherwise I'd go with the DSL solution from another answer.
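
As a rough sketch of the runtime config-file variant, here is what one consumer might look like in Python. Everything specific is an assumption made for illustration: the file name filename_pattern.txt, the load_pattern helper and the example pattern itself are invented, and the demo writes the file to a temporary directory only so that the snippet is self-contained. The pattern uses plain numbered groups, since named-group syntax differs between regex dialects.

```python
import re
import tempfile
from pathlib import Path


def load_pattern(path):
    """Read the single shared regex from a plain-text file and compile it."""
    return re.compile(Path(path).read_text(encoding="utf-8").strip())


with tempfile.TemporaryDirectory() as tmp:
    # In a real setup this file would live in one shared place
    # (a common repository, a package, a mounted config directory...).
    pattern_file = Path(tmp) / "filename_pattern.txt"
    pattern_file.write_text(r"^([a-z]+)_(\d{8})\.csv$" + "\n", encoding="utf-8")

    pattern = load_pattern(pattern_file)
    match = pattern.match("sales_20240131.csv")
    parsed = match.groups()  # (project, date) under the assumed pattern
```

Each other language would carry an equally small loader, so the pattern text itself exists in exactly one file. This only works as long as the pattern sticks to regex features that all the languages involved share.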
You could use a common script, process or web service for generating the file names (depending on your set-up).
I don't know which languages you are talking about, but most languages can load external dynamic libraries (DLLs/shared objects), so you can export the common functionality from such a library.
For example, you could implement a get-file-name function in a simple C library and use it across the rest of the languages.
Another option would be to generate the common code as part of the build process for each language; this should not be too complex.
I would suggest the dynamic-link approach if feasible (you did not give enough information to determine this), since maintaining it will be much easier than maintaining code generation for different languages.
Put the pattern in a database; the easiest and most convenient way could be an XML database. The database would be accessible to all the projects, which would read the pattern from there.

What would be the impediments to creating a "Europanto" type universal scripting language?

After switching back and forth between several scripting languages this week, I found myself thinking how similar they all are. Yet I'm always reaching for Google (or nowadays SO) to remember details like what the local equivalents of "instanceof" and "endswith" are, or the right syntax to declare an interface, or whatever.
This reminded me of the (human) language Europanto. Just pick some vaguely English syntax and some vaguely Romance/Germanic/Slavic vocabulary, and it's all good!
So what would happen if we tried to do the same thing with a scripting language? In the mood for Python-style indented blocks today? Fine! Want to use a prototype object? Ok! Can only remember how to spell the PHP names of some library function? No problem!
Anyway, that's the wild and crazy idea. Since we need a question that admits concrete answers, let's tighten it up like this:
What would be the most significant conflicts in creating a scripting language that permitted all the native syntax and library functions of [Python, Ruby, PHP, Perl, shell, and JavaScript], such that you could freely intermix code blocks and function names between languages?
And let's say that any particular construction should be consistent at the statement level. So we'll allow:
foreach( $foo as $bar )
{
    if $foo == 2:
        print "hi"
}
but not, say,
foreach( $foo as $bar )
{
    if $foo == 2:
        print "hi"
    endif
end
Conflicts can include: parser ambiguities; name collision; conflicting semantics for objects or functions or closures; etc. I'm guessing that scope will be a ginormous issue, but you tell me.
I'll start this as "community wiki" from the get go, so if you think it's a fun question but want to make it more rigorous, feel free to edit.
I would suggest that the main problem is recognising what the syntax of each statement is supposed to be.
In any case, what is the point? Almost all scripting languages have facilities to do much the same things, which is why people tend to master one that they use consistently, and stick with it.
The main difficulty would be letting people maintain it. With a well-defined language you can only print a certain way and do sys.argv a certain way. Once you allow multiple syntaxes, there is no sane way to search for all the sys.argv uses in the code base you have.
At the syntactical level the only problem I can see would be to detect which block has which syntax, then separate them and parse them with specific parsers. Of course given very small statements there could be ambiguities as to which language it is and you could argue that it doesn't matter, but it just may be the case, that in different languages the same string of characters does different things so this could be a subtle issue.
At the API level you would have lots of different methods of doing the same thing but in a subtly different way or subset of doing it. So for example you could have no way of doing Java's string.startsWith() in let's say PHP, so you would do something different, or no way of doing PHP's strstr() (which returns a part of the string from the found needle to the end) and you would implement something different for that or even think differently about the problem. Then you would have to have all those different API methods of doing the same things and that would be huge API to implement, support and (god forbid) learn.
At the wetware level, code written by others would be totally unreadable unless you know a ton of languages and their subtle differences. I think it is difficult enough to learn a single programming language down to the smallest details, so it is not practical at all to create this kind of Frankenstein-ish beast. The one exception I can think of is use as an algorithm description language, which is how it is already used in universities all over the world: a teacher takes some language of their liking and makes the code as readable as possible for a human, without needing to implement a parser for it.
As a side note, I think this kind of system could be implemented with the least effort by somehow utilizing .NET's CLR, where you have a ton of different languages each compiling to the same bytecode and accessing the same variables and stuff. All you'd need to do is split the code into clusters of different languages, compile them separately with their respective compilers, and then merge the bytecode and somehow make sure they all point to the same variables and functions when mentioning the same names across the different languages.
I have begun to see that syntax is but one property of a language. And most of them look like C to me. The purpose of a language (object oriented, strong typing, etc) is something else again. It starts to look like syntax is not the most important aspect.
I went and read the wikipedia entry...
Europanto is a linguistic jest presented as a "constructed language" with a hodge-podge vocabulary
"Hodge-podge" sounds like the way Perl has been described to me!
I found a rather detailed discussion of closures in Ruby. It sounds like getting Ruby's behavior to coexist with JavaScript's or Python's would require some kind of ugly disambiguation.
If anybody were to add Perl to the list of languages to be covered, I think its lexical scoping rules would present a related problem?