Lucene in Java, C#/.NET and C++. Which is the best version for long-term use on Windows Server?

I am going to add Lucene search to my project and I want to get off to the best start.
I am choosing between the three versions of Lucene (Java, C#/.NET, C++). Which is the best by these criteria?
1. Performance
2. Ease of implementation
3. Plenty of documentation
Assume the system is Windows Server and that this is for long-term use.
Thanks

I would say Java. Lucene was originally developed in Java, and I would expect the Java version to have the bigger community, more documentation and bigger deployments.
Granted, Windows is not usually considered the primary platform for deploying Java services, but it works well there. Many people use Windows for Java development and even deployment, so I don't expect any major issues.

Unless you've got a specific feature you need, I would rank the options as follows:
a) Whatever platform you are developing the program in. There are lots of advantages to not having to switch tools, contexts or platforms to muck around with the search internals.
b) Whatever platform your ops guys want to deal with. For example, I know lots of Windows ops guys hate dealing with Java because it feels like a strange foreign language to them.
c) All of the above being equal, Java is the real flagship Lucene project that everyone else is keeping up with, and it has the most tools and resources. It is the way to go if you don't have any reason not to use Java. Solr is another advantage here: you can pretty easily use a pre-wrapped, fully functional Lucene HTTP server.
In any case, keep in mind that, at least theoretically, any Lucene index written on one platform is readable by the others, so you don't necessarily have to commit fully to a single platform.

Related

Webkit vs Processing for Interactive Applications

I know this sounds a little bizarre, but there is a very simple application I want to write, a sort of unique image viewer, which requires some interactivity with the host system at the user level. Simplicity when developing is a must as this is a very small side project. The project does require some amount of graphical work and quite a bit of mouse based interactivity (as well as some keyboard shortcuts), but quite frankly, I don't want to dig my hands into OGL for something this small. I looked at the available options, and I think I've narrowed it down to two main choices: Webkit (through either QtWebkit or WebkitGtk), and the language Processing.
Since I haven't actually used Processing but I do have some HTML5 canvas and JavaScript experience, I am somewhat tempted to use a Webkit-based solution. There are, however, several concerns I have.
How good is Webkit's support for canvas, specifically for more graphically intensive work?
I've heard that bridging is handled better in QtWebkit than in WebkitGtk. Is this still true?
How much can bridging actually do? Can a Webkit-based application do everything that an application interacting with files on the system needs?
Looking at Processing, there are similarly a couple of things I'm wondering.
Processing is known for its graphical capabilities, but how capable is it for writing a general everyday desktop application?
There are many sources that link Processing to Java, both in lineage and in distributing applications over the web (i.e. JApplets). Is the "Application Export" similarly closely integrated with Java?
As for directly comparing the two, the main concern I do have is the overhead of each. I want the application to start up as snappy as possible, and I know that Java has a bit of an overhead regarding start up because it first has to start up the interpreter. How do Processing and QtWebkit/WebkitGtk compare for start up?
Note that I am targeting the Linux platform only.
Thanks!
It's difficult to give a specific answer, because you're actually asking a few different kinds of questions, and some of them could be more precise.
Processing is a subset or child of Java. It's really "just" a Java framework with a free IDE that hides the messy setup work of building an applet, so that a user can dive in and write something quickly without getting bogged down in widgets, UI and so on. So Processing can exist by itself, and the end user needs to know nothing of Java except the syntax (Processing is Java, so the user must learn Java syntax).
But a programmer who already knows Java can exploit the quick, fun nature of Processing and then leverage their normal Java experience for whatever else is needed. Everything in Java is available in Processing, just maybe slightly hidden (but only at first). It's also possible to import the Processing jars into an existing Java program and use them there. See http://processing.org/learning/eclipse/ for more information.
"how capable is it for writing a general everyday desktop application?" Not particularly on its own (it's not made to be), but some things are possible and easy (e.g. file saving and loading, non-standard GUIs), and in some ways it's similar to old-school ActionScript or Lingo. There is a library called controlP5 that makes GUI work a bit easier.
Webkit is another kettle of fish, especially if you aren't making a web-based thing (it sounds like you're thinking of using the Webkit libraries as part of a larger program). I'll admit I don't have the dev expertise with those specific libraries to give you the answer you really want, but I'm pretty certain that unless you have programming experience beyond HTML5/JavaScript, you'll probably get going much faster with Processing.
Good luck with whichever path you choose!

Best language for cross-platform logic engine?

I need to write a logic engine for an application. Essentially, this thing is going to be fed a bunch of data in an XML file, and it then crunches that data and produces an XML file as its result.
The trick is that this engine will need to run on a server (probably Windows, and probably as a background service) AND it will need to run on mobile devices - iOS and Android, primarily.
The logic isn't that awfully difficult or complex. On the mobile devices, the idea is to give researchers quick-and-dirty access to the engine for very tiny data sets. The server "version" will do exactly the same work, but do it on huge data sets.
The GUI will be abstracted from this logic engine.
I should point out that the "mobile version" should be able to work offline - meaning that whatever I choose to implement this logic engine in, it needs to run natively on the devices. THAT said, it's perfectly fine for it to run in the mobile device's local Web browser in a locally-stored file. For example, I'd originally considered JavaScript for this - except I don't think there's a way to have JavaScript running in a multi-threaded service on the server side of things.
Is there a single language that offers to do this? With a minimum of re-coding between platforms?
You can use Rhino to execute JavaScript from inside a Java server/servlet. I'm not sure how parallel/threaded the engine is. You can also look into hosting Google V8, which probably will be higher performance/more scalable.
I don't think you can do all of this (you probably can, but it wouldn't be very pretty) in one language.
Java (or another JVM language like Scala, Clojure or Groovy) is the closest you can get: it's the single platform that allows compiled code to be run unchanged on the largest range of platforms.
However I'm not sure how good Java support is on iOS - this might be the tricky one. But this is going to be a problem in any case: Apple don't seem particularly keen on encouraging anything other than their own tools.
Perhaps the best strategy is to write in Java (which will cover your servers and 95% of your client platforms) and then have a small client side portion that you can quickly port for special cases like iOS.
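To make the shape of the problem concrete, here is a minimal sketch of the engine from the question as a pure XML-in, XML-out function with no GUI or platform dependency. It is written in Python only for brevity, not as a recommendation over the Java route suggested above. The element names (`group`, `value`, `result`) and the summing logic are assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET


def run_engine(input_xml: str) -> str:
    """Hypothetical engine: sum the numeric <value> elements in each <group>."""
    root = ET.fromstring(input_xml)
    result = ET.Element("result")
    for group in root.findall("group"):
        # The "logic" is a stand-in; a real engine would do its crunching here.
        total = sum(float(v.text) for v in group.findall("value"))
        out = ET.SubElement(result, "group", name=group.get("name", ""))
        out.set("total", str(total))
    return ET.tostring(result, encoding="unicode")


SAMPLE = """
<data>
  <group name="a"><value>1</value><value>2.5</value></group>
  <group name="b"><value>4</value></group>
</data>
"""

output_xml = run_engine(SAMPLE)
```

Because the function only maps a string to a string, the same core could sit behind a server-side service or a thin mobile front end, which is the separation the question asks for.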

What are alternatives to the Java VM?

As Oracle sues Google over the Dalvik VM, it becomes clear that you cannot implement a Java VM without a license from Oracle. (EDIT: Matthew Flaschen points out that Oracle's claims may not be valid. Anyway, we currently have a situation where Oracle threatens VM implementations.) That may become the death of open-source implementations of Java (like Apache Harmony).
I don't want to discuss the impact or the legitimacy of this lawsuit, but as a Java programmer I want to take a deeper look at the alternatives, to be prepared for every case. As I see the creation of a compiler as a minor problem, my main interest is in alternative VM implementations that serve a similar purpose to the JVM.
The VM I'm looking for should meet some conditions:
free of patent-issues
an Open-Source-implementation exists
potential for optimizations/good performance
platform independent (the VM can be ported to different platforms without bigger hurdles)
Please add some recommendations for me.
LLVM is a really good optimizing, low level virtual machine. It can support languages like C and C++, and does not have built in support for high level features like garbage collection.
VMKit is an implementation of the Java and CLI virtual machines on top of LLVM. Since it uses Java bytecode, this probably wouldn't help with the patent issues.
HLVM is another interesting high level virtual machine built on top of LLVM. It is probably different enough to avoid most well known patents, but it is mainly targeted at numerical computing and functional programming.
On the dynamically typed side, there is Parrot.
I am actually working on a compiler and VM for a language of my own design, but don't count on it ever being finished. ;-)
Keep in mind that any large piece of software will infringe on numerous patents; the important thing is how well known they are (and how actively the patents' owners seek out infringers). Of course, the whole patent system is absurd, and we would be much better off getting rid of it.
I don't think there is any significant piece of software that is free from patent issues.
If you are an independent developer or working for a smaller company you probably won't get hit directly by the problems though. It's unlikely that big companies holding patents will go after lots of small claims - it's an expensive process and causes a lot of resentment. SCO tried something like that and it didn't work out too well for them.
I would concentrate on finding the best tool for the job without worrying too much about the patent issues, otherwise you will never get anything done.
GraalVM is a research project developed by Oracle Labs and already in production at Twitter. I'm surprised that no one has mentioned it. GraalVM is a very promising extension of the Java virtual machine that supports more languages and execution modes, running applications written in JavaScript, Python, Ruby, R, JVM-based languages, and LLVM-based languages such as C and C++. The GraalVM project includes a new high-performance Java compiler, itself called Graal, which can be used in a just-in-time configuration on the HotSpot VM, or in an ahead-of-time configuration on the SubstrateVM. The main goal of the project is to improve the performance of JVM-based languages to match the performance of native languages. Let's sum up the novel features the project offers and briefly explain, according to the docs, why you should adopt it.
Polyglot: All languages (even LLVM-based) share the same VM and its capabilities. Zero-overhead interoperability between programming languages allows you to write polyglot applications and select the best language for your task.
Native: Native images compiled ahead-of-time with GraalVM improve the startup time and reduce the memory footprint of JVM-based applications.
Embeddable: GraalVM can be embedded in both managed and native applications. It removes the isolation between programming languages and enables interoperability in a shared runtime. It can run either standalone or in the context of OpenJDK, Node.js, Oracle Database, or MySQL.
Performance: Graal benchmark reports show great performance improvements in almost all of its implementations thanks to the way that GraalVM performs object allocations.
If you aren't convinced by now that it is a good choice and a really awesome project, you can watch the talk by Christian Thalinger on "why Graal is a good fit for Twitter".

Limitations of XUL

I'm trying to understand if it is worth the pain to learn XUL more thoroughly.
If you have experience with a moderately complex project (like an independent application rather than a Firefox extension), can you tell me what your experience has been like?
I am particularly worried about features which are not supported natively by the XUL framework. There are two possibilities: either create more XPCOM components, or use external tools. The latter approach is not completely satisfactory, as interprocess communication seems somewhat lacking in XUL.
On the other hand, I have no knowledge of C++. How difficult would it be for a first time learner to wrap an existing library in XPCOM dressing?
I have not written any XPCOM in my three years of developing XUL applications. It does seem intimidating. So far, though, I haven't had a good reason to create any XPCOM. I do use some external tools - for reporting, working with mobile devices, etc. I eventually figured out that you can at least get the STDOUT return value from a process that runs (at least on Windows, it seems that this particular feature might not be consistent across platforms). That allowed me to have at least a single return value, which allowed me to implement error handling.
I think that you will find that you can do quite a bit without touching XPCOM. However, not everything is polished and easy, and there is no large, helpful developer community and not much developer support, so it can be a frustrating learning experience.
If this is a large application, or an application that you might be adding other developers to, you may wish to consider choosing a more supported development platform.
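The pattern described in this answer, launching an external tool and reading what it writes to STDOUT to drive error handling, looks roughly like this. The sketch uses Python's standard `subprocess` module rather than XUL's own process API, and the child command is a stand-in for a real external tool; the "ERROR" prefix convention is an assumption for illustration.

```python
import subprocess
import sys


def run_tool(args):
    """Run an external command and return (exit_code, stdout_text)."""
    completed = subprocess.run(args, capture_output=True, text=True)
    return completed.returncode, completed.stdout.strip()


# Stand-in for a real tool: a child Python process that prints one status line.
TOOL = [sys.executable, "-c", "print('ERROR: device not found')"]

code, output = run_tool(TOOL)

# A single line of output is enough to carry a status back to the caller.
if output.startswith("ERROR"):
    message = "tool reported a problem: " + output
else:
    message = "tool succeeded: " + output
```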

When is it good to use embedded script language like Lua

I've been playing WoW for about 2 years and I've been quite curious about Lua, which is used to write addons. Since all I've read so far about Lua is "fast", "light" and "this is great", I was wondering how and when to use it.
What is the typical situation where you will need to embed a scripting language like Lua in a system?
When you need end users to be able to define or change the system without requiring the system to be rewritten. It's used in games to allow extensions, or to allow the main game engine to remain unchanged while allowing content to be changed.
Embedded scripting languages work well for storing configuration information as well. Last I checked, the Mozilla family all use JavaScript for their config information.
Next up, they are great for developing plugins. You can create a custom API to expose to the plugin developers, and the plugin developers gain a lot of freedom from having an entire language to work with.
Another is when flat files aren't expressive enough. If you want to write data driven apps where behavior is parameterized, you'll get really tired of long strings of conditionals testing for config combinations. When this happens, you're better off writing the rules AND their evaluation into your config.
This topic gets some coverage in the book The Pragmatic Programmer.
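As a rough illustration of "writing the rules AND their evaluation into your config", here is a minimal Python sketch in which behavior is chosen by data rather than by a chain of conditionals. The rule format, the fact names (`level`, `class`) and the action names are all invented for this example; an embedded language such as Lua would let the rules be full scripts instead of tuples.

```python
import operator

# Comparison operators that rules are allowed to use.
OPS = {"==": operator.eq, ">": operator.gt, "<": operator.lt}

# Rules are plain data: the first rule whose conditions all hold wins.
RULES = [
    {"when": [("level", ">", 10), ("class", "==", "mage")], "then": "fireball"},
    {"when": [("level", ">", 10)], "then": "strike"},
    {"when": [], "then": "punch"},  # no conditions: acts as the default
]


def evaluate(rules, facts):
    """Return the action of the first rule whose conditions match the facts."""
    for rule in rules:
        if all(OPS[op](facts[key], value) for key, op, value in rule["when"]):
            return rule["then"]
    return None


action = evaluate(RULES, {"level": 12, "class": "mage"})
```

Adding a new behavior then means adding a rule to the data, not another branch to the program.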
Lua is:
Lightweight
Easy to integrate, even in an asynchronous environment such as a game
Easy to learn for non-programmer staff such as integrators, designers and artists
Since games usually require all those qualities, Lua is mostly used there. Other situations could be any application that needs some scripting functionality, but developers often opt for a somewhat more heavyweight solution such as .NET or Python.
In addition to the scripting and configurability cases mentioned, I would simply state that Lua+C (or Lua+C++) is a perfect match for any software development. It allows one to make an engine/usage interface where engine is done in C/C++ and the behaviour or customization done in Lua.
OS X Cocoa has Objective-C (C and Smalltalk amalgam, where language changes by the line). I find Lua+C similar, only the language changes by a source file, which to me is a better abstraction.
The reasons why you would not want to use Lua are also noteworthy: it hardly has a good debugger. Then again, people hardly seem to need one either. :)
A scripting language like Lua can also be used if you have to change code (with immediate effect) while the application is running. One may not see this in WoW because, as far as I remember, the code is loaded at the start (and not rechecked and reloaded while running).
But think of another example, a web server and a scripting language: (thankfully) you can change your PHP code without having to recompile or restart Apache.
Steve Yegge did that for his own MMORPG engine powering Wyvern, using Jython or Rhino and JavaScript (can't remember which). He wrote the core engine in Java, but the program logic in Python/JavaScript.
The effect of this is:
He doesn't have to restart the core engine when changing the scripts, because that would disconnect all the players
He can let others do the simpler programming, like defining new items and monsters, without exposing all the critical code to them
Sandboxing: if an error happens inside the script, you may be able to handle it gracefully without endangering the surrounding application
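The three effects listed above (reloading without a restart, a narrow API for script authors, and containing script errors) can be sketched with a tiny script host. This is an illustration in Python, not Wyvern's actual design: `ScriptHost`, `define_item` and the script contents are hypothetical names, and Python's `exec` with emptied builtins is not a real security sandbox.

```python
class ScriptHost:
    """Hypothetical host that runs content scripts against a narrow API."""

    def __init__(self):
        self.items = {}

    def load(self, name, source):
        """Run a script; return None on success or an error message."""
        staged = {}

        def define_item(key, **props):
            staged[key] = props

        # Only define_item is visible to the script; builtins are emptied.
        env = {"__builtins__": {}, "define_item": define_item}
        try:
            exec(source, env)
        except Exception as exc:
            # The host keeps running; nothing from the failed script is kept.
            return f"{name}: {type(exc).__name__}: {exc}"
        self.items.update(staged)
        return None


host = ScriptHost()
first = host.load("items", 'define_item("sword", damage=5)')
broken = host.load("bad", 'define_item("axe", damage=7)\n1/0')
# "Hot reload": load a changed script while the host keeps running.
second = host.load("items", 'define_item("sword", damage=9)')
```

Staging the definitions and committing them only after the script finishes is one possible design choice; it keeps a half-executed script from leaving partial content behind.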
Rapid development for application with real-time constraints. Computer games are one of these ;-)
It's a valid solution if you want to allow third parties to develop plug-ins or mods for your software.
You could implement an API in whatever language you are using, but a scripting language like Lua tends to be simpler and more accessible for casual developers.
In addition to all the excellent reasons mentioned by others, embedding Lua in C is very helpful when you need to manipulate text, work with files, or just need a higher-level language. Lua has lots of nifty features (tables, functions as first-class values, lots of other good stuff). Also, while Lua isn't as fast as C or C++, it's pretty quick for an interpreted language.