Is it possible to create a quine in every turing-complete language? - language-design

I just wanted to know if it is 100% possible, if my language is turing-complete, to write a program in it that prints itself out (of course not using a file reading function)
So if the language just has the really necessary things in order to make it turing complete (I would prove that by translating Brainf*ck code to it), like output, variables, conditions and gotos (hell yes, gotos), can I try writing a quine in it?
I'm also asking this because I'm not sure that a quine directly fits into Turing's law that the turing machine is capable of any computational task.
I just want to know so I don't try for years without knowing that it may be impossible.

Any programming language which is
Turing complete, and which is able to
output any string (by a computable
function of the string as program —
this is a technical condition that is
satisfied in every programming
language in existence) has a quine
program (and, in fact, infinitely many
quine programs, and many similar
curiosities) as follows by the
fixed-point theorem.
See here

I ran into this issue a couple of months ago.
While writing a quine doesn't necessarily prove that a language is Turing Complete, it is a strong suggestion ;) As far as Turing Completeness goes, if you can (like you said) provide a valid translation from your language to another Turing-Complete language, then your language is Turing Complete.
That being said, any language that is Turing Complete that can output a string should be able to generate a quine. Also, from Wikipedia:
A quine is a fixed point of an execution environment, when the execution environment is viewed as a function. Quines are possible in any programming language that has the ability to output any computable string, as a direct consequence of Kleene's recursion theorem. For amusement, programmers sometimes attempt to develop the shortest possible quine in any given programming language.

It is possible to have a programming language that cannot print all the symbols in its representation. For example, the I/O may be limited to 7-bit ASCII characters with language keywords in Arabic. That's the only exception I can think of.

Well, technically, not always. According to the proof on Wikipedia, the programming language has to be an admissible numbering. Practical and sane Turing-complate programming languages are all admissible numberings. And a Turing-complate programming language is an admissible numbering if it's possible to translate between that and another admissible numbering.
An example Turing-complete programming language that is not an admissible numbering:
The source code always contains one or two doublequoted escaped strings. If the input is empty, output the first string if there are two strings, or loop forever if there is one. Otherwise, evaluate the last string in Python, using the original input as input.
It's not an admissible numbering because, given a Python program, we have to know its behavior when the input is empty, to translate it into this language. But we may never know if it is an infinite loop, as we cannot solve the halting problem. We know a translation always exists, though.
It's impossible to write quines in this language.

Related

Are there any interpreted languages in which you can dynamically modify the interpreter?

I've been thinking about this writing (apparently) by Mark Twain in which he starts off writing in English but throughout the text makes changes to the rules of spelling so that by the end he ends up with something probably best described as pseudo-German.
This made me wonder if there is interpreter for some established language in which one has access to the interpreter itself, so that you can change the syntax and structure of the language as you go along. For example, often an if clause is a keyword; is there a language that would let you change or redefine this on the fly? Imagine beginning a console session in one language, and by the end, working in another.
Clearly one could write an interpreter and run it, and perhaps there is no concrete distinction between doing this and modifying the interpreter. I'm not sure about this. Perhaps there are limits to the modifications you can make dynamically to any given interpreter?
These more open questions aside, I would simply like to know if there are any known interpreters that allow this at all? Or, perhaps, this ability is just a matter of extent and my question is badly posed.
There are certainly languages in which this kind of self-modifying behavior at the level of the language syntax itself is possible. Lisp programs can contain macros, which allow among other things the creation of new control constructs on the fly, to the extent that two Lisp programs that depend on extensive macro programming can look almost as if they are written in two different languages. Forth is somewhat similar in that a Forth interpreter provides a core set of just a dozen or so primitive operations on which a program must be built in the language of the problem domain (frequently some kind of real-world interaction that must be done precisely and programmatically, such as industrial robotics). A Forth programmer creates an interpreter that understands a language specific to the problem he or she is trying to solve, then writes higher-level programs in that language.
In general the common idea here is that of languages or systems that treat code and data as equivalent and give the user just as much power to modify one as the other. Every Lisp program is a Lisp data structure, for example. This is in contrast to a language such as Java, in which a sharp distinction is made between the program code and the data that it manipulates.
A related subject is that of self-modifying low-level code, which was a fairly common technique among assembly-language programmers in the days of minicomputers with complex instruction sets, and which spilled over somewhat into the early 8-bit and 16-bit microcomputer worlds. In this programming idiom, for purposes of speed or memory savings, a program would be written with the "awareness" of the location where its compiled or interpreted instructions would be stored in memory, and could alter in place the actual machine-level instructions byte by byte to affect its behavior on the fly.
Forth is the most obvious thing I can think of. It's concatenative and stack based, with the fundamental atom being a word. So you write a stream of words and they are performed in the order in which they're written with the stack being manipulated explicitly to effect parameter passing, results, etc. So a simple Forth program might look like:
6 3 + .
Which is the words 6, 3, + and .. The two numbers push their values onto the stack. The plus symbol pops the last two items from the stack, adds them and pushes the result. The full stop outputs whatever is at the top of the stack.
A fundamental part of Forth is that you define your own words. Since all words are first-class members of the runtime, in effect you build an application-specific grammar. Having defined the relevant words you might end up with code like:
red circle draw
That wold draw a red circle.
Forth interprets each sequence of words when it encounters them. However it distinguishes between compile-time and ordinary words. Compile-time words do things like have a sequence of words compiled and stored as a new word. So that's the equivalent of defining subroutines in a classic procedural language. They're also the means by which control structures are implemented. But you can also define your own compile-time words.
As a net result a Forth program usually defines its entire grammar, including relevant control words.
You can read a basic introduction here.
Prolog is an homoiconic language, allowing meta interpreters (MIs) to be declined in a variety of ways. A meta interpreter - interpreting the interpreter - is a common and useful native construct in Prolog.
See this page for an introduction to this argument. An interesting and practical technique illustrated is partial execution:
The overhead incurred by implementing these things using MIs can be compiled away using partial evaluation techniques.

Why is Clojure dynamically typed?

One thing I like very much is reading about different programming languages. Currently, I'm learning Scala but that doesn't mean I'm not interested in Groovy, Clojure, Python, and many others. All these languages have a unique look and feel and some characteristic features. In the case of Clojure I don't understand one of these design decisions. As far as I know, Clojure puts great emphasis on its functional paradigm and pretty much forces you to use immutable "variables" wherever possible. So if half of your values are immutable, why is the language dynamically typed?
The Clojure website says:
First and foremost, Clojure is dynamic. That means that a Clojure program is not just something you compile and run, but something with which you can interact.
Well, that sounds completely strange. If a program is compiled you can't change it anymore. Sure you can "interact" with it, that's what UIs are used for but the website certainly doesn't mean a neat "dynamic" GUI.
How does Clojure benefit from dynamical typing
I mean the special case of Clojure and not general advantages of dynamic typing.
How does the dynamic type system help improve functional programming
Again, I know the pleasure of not spilling "int a;" all over the source code but type inference can ease a lot of the pain. Therefore I would just like to know how dynamic typing supports the concepts of a functional language.
If a program is compiled you can't change it anymore.
This is wrong. In image-based systems, like Lisp (Clojure can be seen as a Lisp dialect) and Smalltalk, you can change the compiled environment. Development in such a language typically means working on a running system, adding and changing function definitions, macro definitions, parameters etc. (adding means compiling and loading into the image).
This has a lot of benefits. For one, all the tools can interact directly with the program and do not need to guess at the system's behaviour. You also do not have any long compilation pauses, because each compiled unit is very small (it is very rare to recompile everything). The NASA JPL once corrected a running Lisp system on a probe hundreds of thousands of kilometres away in space.
For such a system, it is very natural to have type information available at runtime (that is what dynamic typing means). Of course, nothing hinders you from also doing type inference and type checks at compilation time. These concepts are orthogonal. Modern Lisp implementations typically can do both.
Well first of all Clojure is a Lisp and Lisps traditionally have always been dynamically typed.
Second as the excerpt you quoted said Clojure is a dynamic language. This means, among other things, that you can define new functions at runtime, evaluate arbitrary code at runtime and so on. All of these things are hard or impossible to do in statically typed languages (without plastering casts all over the place).
Another reason is that macros might complicate debugging type errors immensely. I imagine that generating meaningful error messages for type errors produced by macro-generated code would be quite a task for the compiler.
I agree, a purely functional language can still have an interactive read-eval-print-loop, and would have an easier time with type inference. I assume Clojure wanted to attract lisp programmers by being "lisp for the jvm", and chose to be dynamic like other lisps. Another factor is that type systems need to be designed as the very first step of the language, and it's faster for language implementors to just skip that step.
(I'm rephrasing the original answer since it generated too much misunderstanding)
One of the reasons to keep Clojure (and any Lisp) dynamically typed is to simplify creation of macros. In short, macros deal with abstract syntax trees (ASTs) which can contain nodes of many, many different types (usually, any objects at all). In theory, it's possible to make full statically typed macro system, but in practice such systems are usually limited and sparsely spread. Please, see examples below and extended discussion in the thread.
EDIT 2020: Wow, 9 years passed from the time I posted this answer, and people still add comments. What a legacy we all have left!
Some people noted in comments that having a statically typed language doesn't prevent you from expressing code as data structure. And, strictly speaking, it's true - union types allow to express data structures of any complexity, including syntax of a language. However I claim that to express the syntax, you must either reduce expressiveness, or use such wide unions that you lose all advantages of static typing. To prove this claim I will use another language - Julia.
Julia is optionally typed - you can constrain any function or struct field to have a particular type, and Julia will check it. The language supports AST as a first class citizen using Expr and Symbol types. Expression definition looks something like this:
struct Expr
head::Symbol
args::Vector{Any}
end
Expression consists of a head which is always a symbol and list of arguments which may have any types. Julia also supports special Union which can constrain argument to specific types, e.g. Symbols and other Exprs:
struct Expr
head::Symbol
args::Vector{Union{Symbol, Expr}}
end
Which is sufficient to express e.g. :(x + y):
dump(:(x + y))
Expr
head: Symbol call
args: Array{Any}((3,))
1: Symbol +
2: Symbol x
3: Symbol y
But Julia also supports a number of other types in expressions. One obvious and helpful example is literals:
:(x + 1)
Moreover, you can use interpolation or construct expressions manually to put any object to AST:
obj = create_some_object()
ex1 = :(x + $objs)
ex2 = Expr(:+, :x, obj)
These examples are not just a funny experiments, they are actively used in real code, especially in macros. So you cannot constrain expression arguments to a specific union of types - expressions may contain any values.
Of course, when designing a new language you can put any restrictions on it. Perhaps, restricting Expr to contain only Symbol, Expr and some Literals would be useful in some contexts. But it goes against principles of simplicity and flexibility in both - Julia and Clojure, and would significantly reduce usefulness of macros.
Because that's what the world/market needed. No sense in building what's already built.
I hear the JVM already has a statically typed language ;)

When you are proving a language is decidable, what are you effectively doing?

When you are proving a language is decidable, what are you effectively doing?
If you asking HOW is it done, I'm unsure, but I can check.
Basically, decidable is the language for which one can construct an algorithm (i.e. Turing machine) that will halt for ANY finite input (with accepting or rejecting the input).
Undecidable is the language which is not decidable.
http://en.wikipedia.org/wiki/Recursive_language ... but more on the subject can easily be found. On this link there is only a quick mention of the term.
p.s. So, when constructing above mentioned algorithm, you are basically proving that language is decidable.

What is a good tool for automatically calculating FIRST and FOLLOW sets?

I'm currently in the middle of playing with a BNF grammar that I hope to be able to wrangle into a LL(1) form. However, I've just finished making changes and calculating the new FIRST and FOLLOW sets for the grammar by hand for the third time today and I'm getting tired of it. There has to be a better way!
Can someone suggest a tool that, given a grammar, will automatically calculate the first and follow sets for all of the non terminals?
A year ago, we had a semester project at the university I attend, where our task was to create a programming language. As a group, we decided we wanted to be able to hand-write the parser from scratch, so we had to aim for an LL(1) grammar, since it would have been completely unrealistic to write a parser otherwise.
Of course, our starting point was far from being LL(1), so we too had to wrangle it into place. For that purpose, we used the kfgEdit tool from the AtoCC package. All you do is enter your rules, and then it can check if it's an LL(1) grammar at the click of a button.
A fair word of warning: The tool is a bit finicky about what it accepts. While you'd often use EBNF for the real grammar, so you can write ? and * and + to signal how many times that token must appear there, this is not supported. Grouping is also not supported. You may very well find that it takes a very long time to do this, and you will almost surely want to do some "rearranging" after you've reached LL(1) to make the grammar even close to being readable.
Of course, depending on the type of grammar you're dealing with, this may not be much of a problem for you. We had created a sort of Pascal/C hybrid, with a fairly restricted set of constructs (procedures, functions, only built-in primitive types and arrays of them, ifs, a single loop construct we'd come up with ourselves in place of the standard 3...), and it took me at least a week to wrangle it into an LL(1) grammar - probably 2, actually. Note that this is out of a total of about 4 months, so that was a lot of time spent there.
If you absolutely MUST have an LL(1) grammar, then you obviously will need to press on if you get into a situation like this, but if you're allowed to use parser generators like yacc/bison or SableCC then you will, in the long run, most likely find it a LOT easier to go down that route. That doesn't mean you SHOULD go down that route - I found that actually writing everything by hand provided some insight I probably wouldn't have gained otherwise - but it might be better for you to gain that insight in a different situation than your current.
tl;dr version: Use kfgEdit from the AtoCC package.
For recursive descent parsing, it would be worth looking at ANTLR. However, I'm not sure it provides an exact answer for your question - find the FIRST and FOLLOW sets for a given grammar.
The DMS Software Reengineering Toolkit has a parser generator that computes FIRST and FOLLOW sets; it will also let you inspect the L(AL)R state machine it generates.
However, if you have a legitimate context-free grammar, you don't have to "wrangle it" into LL shape; the DMS parser generator produces GLR parsers from any context-free grammar.

Programming languages that define the problem instead of the solution?

Are there any programming languages designed to define the solution to a given problem instead of defining instructions to solve it? So, one would define what the solution or end result should look like and the language interpreter would determine how to arrive at that result. Looking at the list of programming languages, I'm not sure how to even begin to research this.
The best examples I can currently think of to help illustrate what I'm trying to ask are SQL and MapReduce, although those are both sort of mini-languages designed to retrieve data. But, when writing SQL or MapReduce statements, you're defining the end result, and the DB decides the best course of action to arrive at the end result set.
I could see these types of languages, if they exist, being used in crunching a lot of data or finding solutions to a set of equations. The dream language would be one that could interpret the defined problem, identify which parts are parallelizable, and execute the solution across multiple processes/cores/boxes.
What about Declarative Programming? Excerpt from wikipedia article (emphasis added):
In computer science, declarative
programming is a programming paradigm
that expresses the logic of a
computation without describing its
control flow. Many languages
applying this style attempt to
minimize or eliminate side effects by
describing what the program should
accomplish, rather than describing how
to go about accomplishing it. This
is in contrast with imperative
programming, which requires an
explicitly provided algorithm.
The closest you can get to something like this is with a logic language such as Prolog. In these languages you model the problem's logic but again it's not magic.
This sounds like a description of a declarative language (specifically a logic programming language), the most well-known example of which is Prolog. I have no idea whether Prolog is parallelizable, though.
In my experience, Prolog is great for solving constraint-satisfaction problems (ones where there's a set of conditions that must be satisfied) -- you define your input set, define the constraints (e.g., an ordering that must be imposed on the previously unordered inputs) -- but pathological cases are possible, and sometimes the logical deduction process takes a very long time to complete.
If you can define your problem in terms of a Boolean formula you could throw a SAT solver at it, but note that the 3SAT problem (Boolean variable assignment over three-variable clauses) is NP-complete, and its first-order-logic big brother, the Quantified Boolean formula problem (which uses the existential quantifier as well as the universal quantifier), is PSPACE-complete.
There are some very good theorem provers written in OCaml and other FP languages; here are a whole bunch of them.
And of course there's always linear programming via the simplex method.
These languages are commonly referred to as 5th generation programming languages. There are a few examples on the Wikipedia entry I have linked to.
Let me try to answer ... may be Prolog could answer your needs.
I would say Objective Caml (OCaml) too...
This may seem flippant but in a sense that is what stackoverflow is. You declare a problem and or intended result and the community provides the solution, usually in code.
It seems immensely difficult to model dynamic open systems down to a finite number of solutions. I think there is a reason most programming languages are imperative. Not to mention there are massive P = NP problems lurking in the dark that would make such a system difficult to engineer.
Although what would be interesting is if there was a formal framework that could leverage human input to "crunch the numbers" and provide a solution, perhaps imperative code generation. The internet and google search engines are kind of that tool but very primitive.
Large problems and software are basically just a collection of smaller problems solved in code. So any system that generated code would require fairly delimited problem sets that can be mapped to more or less atomic solutions.
Lisp. There are so many Lisp systems out there defined in terms of rules not imperative commands. Google ahoy...
There are various Java-based rules engines which allow declarative programming - Drools is one that I've played with and it seems pretty interesting.
A lot of languages define more problems than solutions (don't take this one seriously).
On a serious note: one more vote for Prolog and different kinds of DSLs designed to be declarative.
I remember reading something about computation using DNA back when I was in college. You would put segments of DNA in a solution that represented segments of the problem, and define it in such a way that if the DNA fits together, it's a valid solution. Then you let the properties of chemicals solve the problem for you and look for finished strands that represent a solution. It sounds sort of like what you are refering to.
I don't recall if it was theoretical or had been done, though.
LINQ could also be considered another declarative DSL (aschewing the argument that it's too similar to SQL). Again, you declare what your solution looks like, and LINQ decides how to find it.
The beauty of these kinds of languages is that projects like PLINQ (which I just found) can spring up around them. Check out this video with the PLINQ developers (WMV direct link) on how they parallelize solution finding without modifying the LINQ language (much).
While mathematical proofs don't constitute a programming language, they do form a formal language where you simply define solutions (as long as you allow nonconstructive proofs). Of course, it's not algorithmic, so "math" might not be an acceptable answer.
Meta Discussion
What constitutes a problem or a solution is not absolute and depends on the level of abstraction that you are taking as a reference point.
Let's compare the following 3 languages: SQL, C++, and CPU instructions.
C++ vs CPU instructions
If you choose array manipulation as the desired level of abstraction, then C++ allows you to "define the problem" instead of the solution:
array[i * 2 + 3] = 5;
array[t] = array[k - m] - 1;
Note what this C++ snippet does not state: how the memory is laid out, how many bits are used by each array element, which CPU registers hold the data, and even in which order the arithmetic operations will be performed (as long as the result is the same).
The C++ compiler, however, will translate this code to lower-level CPU instructions that will contain all of these details.
At the abstraction level of array manipulation, C++ is declarative, and CPU instructions are imperative.
SQL vs C++
If you choose a sorting algorithm as the desired level of abstraction, then SQL allows you to "define the problem" instead of the solution:
select *
from table
order by key
This snippet of code is declarative with respect to the sorting algorithm's level of abstraction because it declares that the output is sorted without using lower-level concepts (like array manipulation).
If you had to sort an array in C++ (without using a library), the program would be expressed in terms of array manipulation steps of a particular sorting algorithm.
void sort(int *array, int size) {
int key, j;
for(int i = 1; i < size; i++) {
key = array[i];
j = i;
while(j > 0 && array[j-1] > key) {
array[j] = array[j-1];
j--;
}
array[j] = key;
}
}
This snippet is not declarative with respect to the sorting algorithm's level of abstraction because it uses concepts (such as array manipulation) that are constituents of the sorting algorithm.
Summary
To summarize, whether a language defines problems or solutions depends on what problems and solutions you are referring to.
Many answers here have brought up examples: SQL, LINQ, Prolog, Lisp, OCaml. I am sure there are many useful levels of abstractions with respect to which these languages are declarative.
However, do not forget that you can build a language with an even higher level of abstraction on top of them.