I've just installed Spacemacs. Of my four IBus input methods, Spacemacs accepts only English and Russian; Ancient Greek and Church Slavonic are treated as English. Needless to say, outside Spacemacs everything works. There is a lot of know-how about Chinese and other East Asian languages, which is understandable. Unfortunately, exotic European languages command little attention.
How am I to go about troubleshooting this?
Let's say I want to have some English text spoken in an Italian accent.
Many of the engine demos I have tried on their respective sites have an Italian voice available, but when you try to get it to pronounce a few sentences in English, the result is often highly unintelligible because the voice operates on a different phoneme set.
There are phoneme tags in SSML, and I know one site that allows you to actually demo with SSML. I try putting in this common and generic Italian conversation into their Italian voice:
Mama mia! Princess Peach and my friends have been kidnapped?
Chase Bowser, so we can eat some spaghetti!
And it is fairly unintelligible. Using SSML or something else, can I keep the accent but correct the phonemes enough to make the speech intelligible?
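To make the SSML idea concrete, here is a minimal sketch in Python of wrapping selected English words in the standard SSML `<phoneme>` element so that an Italian voice is told exactly which sounds to produce. The helper name `to_ssml`, the `IPA_OVERRIDES` table and the IPA transcriptions in it are assumptions made for illustration; whether a given engine honours `<phoneme>` and which IPA symbols its Italian voice can render varies by vendor.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Hypothetical IPA transcriptions, for illustration only; tune them by ear.
IPA_OVERRIDES = {
    "kidnapped": "ˈkɪdnæpt",
    "spaghetti": "spəˈɡɛti",
}

def to_ssml(text, overrides=IPA_OVERRIDES, lang="it-IT"):
    """Return an SSML document in which overridden words carry <phoneme> tags."""
    out = []
    # Splitting on a capturing group keeps both the words and the text between them.
    for piece in re.split(r"([A-Za-z']+)", text):
        ph = overrides.get(piece.lower())
        if ph is None:
            out.append(escape(piece))  # plain text, XML-escaped
        else:
            out.append(
                f'<phoneme alphabet="ipa" ph={quoteattr(ph)}>{escape(piece)}</phoneme>'
            )
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        f'xml:lang="{lang}">' + "".join(out) + "</speak>"
    )

if __name__ == "__main__":
    print(to_ssml("Chase Bowser, so we can eat some spaghetti!"))
```

Only the words listed in the table are overridden; everything else is still pronounced by the Italian voice's own rules, which is what keeps the accent.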
You can hire a voice talent with an Italian accent and build a new TTS model, where the engine offers that option. Even with several hours of speech you can get a decent model.
The second option is speech morphing, but it requires some effort as well as knowledge of the domain.
I am wondering if there is a tool or technique which, given a BNF grammar, adjusts it randomly (but intelligently) and generates a stream of output for use in detecting inputs that a parser accepts even though the BNF says it shouldn't.
edit: Fuzz testing a parser, in other words.
Thanks
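For reference, the naive version of the idea can be sketched in a few lines of Python: expand a grammar at random to get valid sentences, then apply a small random edit to each one to get inputs that are probably invalid. The toy grammar and the names `GRAMMAR`, `generate` and `mutate` are assumptions made for illustration, and this sketch mutates the generated sentences rather than the grammar itself. It does not decide on its own whether a mutated string is still in the language, so anything the parser under test accepts has to be checked by hand or against a reference.

```python
import random

# Toy grammar, for illustration: each nonterminal maps to a list of
# alternatives, and each alternative is a list of symbols.
GRAMMAR = {
    "<expr>": [["<term>", "+", "<expr>"], ["<term>"]],
    "<term>": [["<digit>"], ["(", "<expr>", ")"]],
    "<digit>": [[d] for d in "0123456789"],
}

def generate(grammar, symbol="<expr>", rng=random, depth=0, max_depth=8):
    """Expand `symbol` into a random sentence of the grammar."""
    if symbol not in grammar:
        return symbol  # terminal symbol: emit as-is
    alternatives = grammar[symbol]
    if depth >= max_depth:
        # Past the depth limit, take the shortest alternative so that
        # the recursion is guaranteed to bottom out for this grammar.
        alternatives = [min(alternatives, key=len)]
    choice = rng.choice(alternatives)
    return "".join(
        generate(grammar, s, rng, depth + 1, max_depth) for s in choice
    )

def mutate(sentence, rng=random, alphabet="0123456789+()"):
    """Apply one random character-level edit: delete, insert or replace."""
    pos = rng.randrange(len(sentence))
    op = rng.choice(("delete", "insert", "replace"))
    if op == "delete":
        return sentence[:pos] + sentence[pos + 1:]
    ch = rng.choice(alphabet)
    if op == "insert":
        return sentence[:pos] + ch + sentence[pos:]
    return sentence[:pos] + ch + sentence[pos + 1:]

if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(5):
        valid = generate(GRAMMAR, rng=rng)
        print(valid, "->", mutate(valid, rng))
```

Some mutations leave the sentence valid (replacing one digit with another, for example), so the mutated stream is a source of suspicious inputs rather than a stream of guaranteed-invalid ones.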
Spending some tender time with Google, I found that automated grammar-based fuzz testing is hard, and a subject of current research. In particular, P. Godefroid at Microsoft Research is working on a piece of software called SAGE.
I dug up a research paper by him.
Automated Whitebox Fuzz Testing (joint work with Michael Y. Levin and David Molnar) Proceedings of NDSS'2008 (Network and Distributed Systems Security), pages 151-166, San Diego, February 2008.
I also found the XML-based Peach software, but it is unclear to me on a casual reading how I might leverage it in an afternoon of work for a non-security application.
So my conclusion is: "It's a subject of current (Apr '10) research and there's no quick-use tool out there".
Not strictly a BNF fuzzing tool, but American Fuzzy Lop (AFL) uses compile-time instrumentation and genetic algorithms to guide its mutations, and it can work around the lack of grammar knowledge quite well. It has already found bugs in many open source parsers, so it might be the right tool for yours as well.
I'd like to create a new, open-source language.
Since it's really rare to find programmers who have actually dealt with compiler theory, I need some advice.
How would you make a person interested in your open source project?
How do you bring him to a position where he wants to contribute?
Is there a special place where I can find those people (other than sourceforge.net)?
It will be very hard to get people interested in your project. History has shown that 99% (at a conservative estimate) of new programming languages are only ever used by their designer. So if you do it, do it for love and don't expect much if any outside interest.
You may want to spend some time lurking on sites like, say, Lambda The Ultimate and reading up on theory of programming languages, compiler design, etc. I've heard that Essentials of Programming Languages by Friedman et al is a good intro text for the former, while you can't go wrong with the "Dragon Book" for the latter (whose official title escapes me at the moment... by Aho et al though).
Take a look at Haskell (and its supporting community):
http://www.haskell.org/
I used Haskell to model a small OO programming language in grad school, and it seemed to be a common tool in academia for designing programming languages.
BTW, this doesn't answer your questions, but these two Microsoft / CodePlex projects both sparked my interest as possible starting points for creating a new language:
Dynamic Language Runtime
Common Compiler Infrastructure
I'm now a couple of months into my Smalltalk learning voyage. I was aware from the beginning that Smalltalk has several "dialects" (perhaps "dialect" isn't the best word), by which I mean VisualWorks, Squeak and Dolphin, to mention just three. So far I have limited my foray to VisualWorks and Squeak. But I've now discovered that Squeak seems to be metamorphosing (pun intended!) into several other variants, e.g. Tweak, Pharo, Cobalt and Croquet.
Could somebody explain:
a) Why have these initiatives (Tweak, Pharo, Croquet and Cobalt) arisen?
b) Should I take the time to keep abreast of them, bearing in mind I'm a Smalltalk neophyte?
c) How come such an unpopular language has such a vibrant set of developments happening?
d) Are there other initiatives that I should be aware of? (as a beginner not a computer researcher that is)
As far as 'How come such an unpopular language has such a vibrant set of developments happening?', I have to say that 'popularity' does not correlate with utility or productivity. A contrarian will tell you that the majority is always wrong.
When you get bitten by the Smalltalk bug, you tend to stay bitten. Many former Smalltalkers who now earn their living in other languages miss Smalltalk and would jump at the opportunity to work in it again.
This phenomenon accounts for the vibrant community.
Personally, I find that I am at my most productive working in Smalltalk. The tools and the language work together to make the gap between idea and execution very small. In Smalltalk, when I am faced with using a new library, I can use the debugger to 'parachute' into the middle of the action, viewing state and code in a single tool. You can't duplicate that experience by reading code and studying log files...
Smalltalk has its quirks and the quirks do keep Smalltalk out of the mainstream. But some of the quirks are what make Smalltalk a productive environment to work in, which may mean that it will never be mainstream.
But with a vibrant and active community supporting Smalltalk (in a variety of dialects) does it matter whether Smalltalk is mainstream or not?
A bit of background might be helpful: Tweak was a research effort trying to bring some of the great things from Etoys to the system level (i.e., the player-costume architecture, the concurrency model, "events everywhere", asynchronous notifications etc). Tweak was a "blue-plane" approach to graphics, composition and scripting and in some ways never really intended to be a production tool. That it became one was its downfall because it wasn't polished enough for wide use and by becoming a production tool it became infeasible to implement some of the radical changes that would have been required to make it ready for world dominance ;-)
Croquet had an entirely different goal. We needed Croquet because we needed a bit-identical replicated computation machinery. Croquet computes bit-identically on all platforms which required modifications to the virtual machine and some libraries (such as floating point). Cobalt is a spin-off from Croquet which takes the SDK and builds an application from it. In this sense Cobalt is not really a fork - it is the current focus of the Croquet community.
I don't know about the other initiatives you mention, but Pharo is a fork which aims at producing a version of Squeak without all the cruft (like EToys, for example), with better developer support and use of modern (??) technologies like TrueType fonts. It's well worth downloading the current image and having a look. I find it a bit slow on my ancient hardware, but I intend to keep an eye on it.
This just shows what an inspiring language Smalltalk is and how sound and cleverly designed its roots are. It inspires people from academia to industry to try to extend it and build new "dialects", which are then usually merged with one another to some extent, so that in the end we all profit.
That's why I like Smalltalk and its community/communities, even though you sometimes feel tensions there. But every bit of progress needs tension first.
Pharo is a result of such tension, for instance. It is a fork of Squeak by a group of Squeakers with strong leadership and a work-more, talk-less mentality, which is already showing results and will surely move Squeak, if not all of Smalltalk, a step further.
I think these initiatives and forks exist because the community is able to do it :) This small Smalltalk community is full of smart guys who know what they are doing. There is enough knowledge about virtual machines, language design and such.
On the other hand, it is like every other community, too. There are people with different opinions. So it is only a matter of time until a few people start "something slightly different" to test and realize their ideas. And they do it because they can.
Does anyone know where to find good online resources with examples of how to make grammars and parse trees? Preferably introductory materials.
I'm after info that is n00b-friendly; I haven't found anything good with Google myself.
Edit: I'm thinking about theory, not a specific parser software.
Not online, but maybe you should take a look at Compilers: Principles, Techniques, and Tools (2nd Edition) by Aho et al. This is a standard text that has been evolving for 30 years (if you count the 1st Dragon Book, published in 1977).
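As a small taste of the theory, here is a sketch in Python of a grammar and the parse tree a hand-written recursive-descent parser builds from it. The three-rule arithmetic grammar, the tuple-based tree layout and the names `tokenize`, `Parser` and `parse` are choices made for this illustration and are not taken from the book.

```python
import re

# Grammar (EBNF), one method per rule in the parser below:
#   expr   ::= term   ("+" term)*
#   term   ::= factor ("*" factor)*
#   factor ::= NUMBER | "(" expr ")"

TOKEN = re.compile(r"\s*(\d+|[+*()])")

def tokenize(text):
    """Split the input into number and operator tokens."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def expect(self, token):
        if self.peek() != token:
            raise SyntaxError(f"expected {token!r}, got {self.peek()!r}")
        self.pos += 1
        return token

    # Each method returns a tree node: (rule_name, child, child, ...).
    def expr(self):
        children = [self.term()]
        while self.peek() == "+":
            children += [self.expect("+"), self.term()]
        return ("expr", *children)

    def term(self):
        children = [self.factor()]
        while self.peek() == "*":
            children += [self.expect("*"), self.factor()]
        return ("term", *children)

    def factor(self):
        tok = self.peek()
        if tok == "(":
            return ("factor", self.expect("("), self.expr(), self.expect(")"))
        if tok is not None and tok.isdigit():
            self.pos += 1
            return ("factor", tok)
        raise SyntaxError(f"unexpected token {tok!r}")

def parse(text):
    """Parse `text` as an expr and return its parse tree."""
    parser = Parser(tokenize(text))
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"trailing input at token {parser.peek()!r}")
    return tree

if __name__ == "__main__":
    print(parse("1 + 2 * 3"))
```

Because `term` sits below `expr` in the grammar, the `2 * 3` part ends up as a single `term` subtree under the addition, which is how the grammar itself encodes operator precedence.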
Well, here's where I learned it...
http://www.cs.uiuc.edu/class/sp08/cs273/
Click on the lectures tag, scroll through till you find the lectures on the material you are talking about.
Love my alma mater. God bless them, they never take down their lectures in any class and you can go and read any of them anytime you want.
edit: Looks like you want lecture11
Antlr?
http://www.antlr.org/
It has quite a good IDE for designing a grammar, and code generators for many target languages.
www.goldparser.com
The tools are free and good to work with. The site has technical and theoretical tutorials, lots of info, tools and code generators for many languages.
In C or C++, use lex and bison.
In Java, use ANTLR.
This is a beautiful ANTLR video tutorial.