Since OptaPlanner 8.17, the code of the task assigning example project seems to have been refactored heavily. I couldn't find any comment about these changes in the release notes or on GitHub.
In particular, since this version the implementation of the problem no longer involves chained variables. Could someone from the OptaPlanner team explain why? I'm also a bit confused because the latest version of the documentation for this example project still references classes that were deleted in 8.17 (e.g.
org/optaplanner/examples/taskassigning/domain/TaskOrEmployee.java).
It's using @PlanningListVariable, a new (experimental) alternative to chained planning variables, which is far easier to understand and maintain.
Documentation for this new feature hasn't been written yet. We're finishing up the ListVariableListener interface and then the documentation will be updated to cover @PlanningListVariable too. At that time, it will be ready for announcement.
Unlike a normal feature, this big, complex feature took more than a year to bake. That's why it's been delivered in portions. One could argue the task assignment example shouldn't have escaped the feature branch, but it was proving extremely expensive not to merge the stable feature branches sooner rather than later.
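For reference, a minimal sketch of what the list-variable modeling style can look like (the field and range names here are illustrative assumptions, not the actual example code):

```java
import java.util.ArrayList;
import java.util.List;

import org.optaplanner.core.api.domain.entity.PlanningEntity;
import org.optaplanner.core.api.domain.variable.PlanningListVariable;

// Instead of each Task chaining to its predecessor (the old TaskOrEmployee design),
// the employee simply owns an ordered list of tasks.
@PlanningEntity
public class Employee {

    // "taskRange" is an assumed value range provider id, not taken from the example.
    @PlanningListVariable(valueRangeProviderRefs = "taskRange")
    private List<Task> tasks = new ArrayList<>();

    public List<Task> getTasks() {
        return tasks;
    }
}
```

The solver then changes assignment and order by moving list elements between employees, which is what removes the need for chained variables in the model.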
I saw this ReasonML vs TypeScript question here at Stack Overflow, and now I'm wondering how ReasonML and Elm compare to each other.
What are their similarities and differences? Which one should I use when?
What's the advantage of one over the other?
I'm not intimately familiar with Elm, but I've looked into it a bit and I'm pretty familiar with Reason, so I'll give it a shot. I'm sure there will be inaccuracies here though, so please don't take anything I say as fact, but use it instead as pointers for what to look into in more detail yourself if it matters to you.
Both Elm and Reason are ML-like languages with very similar programming models, so I'll focus on the differences.
Syntax:
Elm uses a Haskell-like syntax that is designed (and/or evolved) for the programming model both Elm and Reason use, so it should work very well for reading and writing idiomatic code once you're familiar with it, but it will seem very different and unfamiliar to most programmers.
Reason tries to be more approachable by emulating JavaScript's syntax as much as possible, which will be familiar to most programmers. However, it also aims to support the entire feature set of the underlying OCaml language, which makes some functional patterns quite awkward.
One example of this is the function application syntax, which in Elm emphasizes the curried nature of functions (f a b) and works very well for composing functions and building readable DSLs. Reason's parenthesized syntax (f(a, b)) hides this complexity, which makes it easier to get into (until you accidentally trip on it, since it's of course still different underneath), but makes heavy use of function composition a mess of parentheses.
Mutability:
Elm is a purely functional language, which is great in theory but challenging in practice since the surrounding world cares little about Elm's quest for purity. Elm's preferred solution to this, I think, is to isolate the impurity by writing the offending code in JavaScript instead, and to then access it in Elm through either web components or ports. This means you might have to maintain significant amounts of code in a separate and very unsafe language, quite a bit of boilerplate to connect them, as well as having to figure out how to fit the round things through the square holes of ports and such in the first place.
Reason on the other hand is... pragmatic, as I like to call it. You sacrifice some safety, ideals and long-term benefits for increased productivity and short-term benefits. Isolating impurity is still good practice in Reason, but you're inevitably going to take short-cuts just to get things done, and that is going to bite you later.
But even if you do manage to be disciplined enough to isolate all impurity, you still have to pay a price to have mutation in the language. Part of that price is what's called the value restriction, which you're going to run into sooner or later, and it's going to both confuse and infuriate you, since it will reject code that intuitively should work, just because the compiler is unable to prove that there can't at some point be a mutable reference involved.
JavaScript interoperability:
As mentioned above, Elm provides the ability to interoperate with JavaScript through ports and web components, which are deliberately quite limited. You used to be able to use native modules, which offered much more flexibility (and ability to shoot yourself in the foot), but that possibility is going away (for the plebs at least), a move that has not been uncontroversial (but also shouldn't be all that surprising given the philosophy). Read more about this change here.
Reason, or rather BuckleScript, provides a rich set of primitives for binding directly to JavaScript, and can very often produce an idiomatic Reason interface without needing to write any glue code. And while not very intuitive, it's pretty easy to do once you grok it. It's also easy to get it wrong and have it blow up in your face at some random point later, however. Whatever glue code you do have to write to provide a nice idiomatic API can be written in Reason, with all its safety guarantees, instead of having to write unsafe JavaScript.
Ecosystem:
As a consequence of Elm's limited JavaScript interoperability, the ecosystem is rather small. There aren't a whole lot of good quality third party JavaScript libraries that provide web components, and doing it yourself takes a lot of effort. So you'll instead see libraries being implemented directly in Elm itself, which takes even more effort, of course, but will often result in higher quality since they're specifically designed for Elm.
Tooling:
Elm is famous for its great error messages. Reason's, to a large degree, are not as good, though it strives to get there. This is at least partly because Reason is not itself a compiler but is instead built on top of the OCaml compiler, so the information available is limited and the surface area of possible errors is very large. But the messages are also not as well thought through.
Elm also has a great packaging tool which sets everything up for you and even checks whether the interface of a package you're publishing has changed and that the version bump corresponds to semantic versioning. Reason/BuckleScript just uses npm and requires you to manage everything Reason/BuckleScript-specific manually, like updating bsconfig.json with new dependencies.
Reason, BuckleScript, its build system, and OCaml are all blazing fast though. I've yet to experience any project taking more than 3 seconds to compile from scratch, including all dependencies, and incremental compilation usually takes only milliseconds (though this isn't entirely without cost to user-friendliness). Elm, as I understand it, is not quite as performant.
Elm and Reason both have formatting tools, but Reason-formatted code is of significantly poorer quality (though slowly improving). I think this is largely because of the vastly more complex syntax it has to deal with.
Maturity and decay:
Reason, being built on OCaml, has roots going back more than 20 years. That means it has a solid foundation that's been battle-tested and proven to work over a long period of time. Furthermore, it's a language largely developed by academics, which means a feature might take a while to get implemented, but when it does get in it's rock solid because it's grounded in theory and possibly even formally proven. On the downside, its age and experimental nature also means it's gathered a bit of cruft that's difficult to get rid of.
Elm on the other hand, being relatively new and less bureaucratically managed, can move faster and is not afraid of breaking with the past. That makes for a slimmer and more coherent language, but one with a less powerful type system.
Portability:
Elm compiles to JavaScript, which in itself is quite portable, but is currently restricted to the browser, and even more so to the Elm Architecture. This is a choice, and it wouldn't be too difficult to target node or other platforms. But the argument against it is, as I understand it, that it would divert focus, thereby making it less excellent at its niche.
Reason, being based on OCaml, actually targets native machine code and bytecode first and foremost, but also has a JavaScript compiler (or two) that enables it to target browsers, node, electron, react native, and even the ability to compile into a unikernel. Windows support is supposedly a bit sketchy though. As an ecosystem, Reason targets React first and foremost, but also has libraries allowing the Elm Architecture to be used quite naturally.
Governance:
Elm is designed and developed by a single person who's able to clearly communicate his goals and reasoning and who's being paid to work on it full-time. This makes for a coherent and well-designed end product, but development is slow, and the bus factor might make investment difficult.
Reason's story is a bit more complex, as it's more of an umbrella name for a collection of projects.
OCaml is managed, designed and developed in the open, largely by academics but also by developers sponsored by various foundations and commercial backers.
BuckleScript, a JavaScript compiler that derives from the OCaml compiler, is developed by a single developer whose goals and employment situation are unclear, and who does not bother to explain his reasoning or decisions. Development is technically more open in that PRs are accepted, but the lack of explanation and obtuse codebase makes it effectively closed development. Unfortunately this does not lead to a particularly coherent design either, and the bus factor might make investment difficult here as well.
Reason itself, and ReasonReact, is managed by Facebook. PRs are welcomed, and a significant amount of Reason development is driven by outsiders, but most decisions seem to be made in a back room somewhere. PRs to ReasonReact, beyond trivial typo fixes and such, are often rejected, probably for good reason but usually with little explanation. A better design will then typically emerge from the back room sometime later.
I'm considering adding the PH-tree to ELKI.
I couldn't find any tutorials or examples for that, and the internal architecture is not fully obvious to me at the moment.
Do you think it makes sense to add the PH-tree to ELKI?
How much effort would that be?
Could I get some help?
Does it make sense to implement only an in-memory version, as done for the kd-tree (as far as I understand)?
Some context:
The PH-tree is a spatial index that was published at SIGMOD'14 (paper); Java source code is available here.
It is a bit similar to a quadtree, but much more space efficient, doesn't require rebalancing and scales quite well with dimensionality.
What makes the PH-tree different from the R*-Tree implementations is that there is no concept of leaf/inner nodes, and nodes do not directly map to pages. It also works quite well with random inserts/deletes (no bulk loading required).
Yes.
Of course it would be nice to have a PH-tree in ELKI, to allow others to experiment with it. We want ELKI to become a comprehensive tool; it has R-trees, M-trees, k-d-trees, cover-trees, LSH, iDistance, inverted lists, space-filling-curves, PINN, ...; there are working-but-not-cleaned-up implementations of X-tree, rank-cover-trees, bond, and some more.
We want to enable researchers to study which index works best for their data easily, and of course it would be nice to have PH-tree, too. We also try to push the limits of these indexes, e.g. when supporting other distance measures than Euclidean distance.
The effort depends on how experienced you are with coding; ELKI uses some well-optimized data structures, but that means we are not using standard Java APIs in a number of places because of performance. Adding the cover tree took me about one day of work, for example (and it performed really nicely). I'd assume a more flexible (but also more memory intensive) k-d-tree would be a similar amount of work. I have not studied the PH-tree in detail, but I'd assume it is slightly more effort than that.
My gut also says that it won't be as fast as advertised. It appears to be a prefix-compressed quadtree. In my experiments, bit-interleaving approaches such as those required for Hilbert curves can be surprisingly expensive. It also probably only works for Minkowski metrics. But you are welcome to prove me wrong. ;-)
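To illustrate the bit-interleaving concern, here is a generic sketch (not ELKI or PH-tree code) of 2D Z-order/Morton key computation; done naively, it costs a loop iteration per bit for every point touched:

```java
public final class MortonCode {

    private MortonCode() {
    }

    // Interleave the bits of two 32-bit coordinates into one 64-bit key:
    // bit i of x goes to position 2*i, bit i of y goes to position 2*i + 1.
    public static long interleave(int x, int y) {
        long key = 0L;
        for (int i = 0; i < 32; i++) {
            key |= ((long) ((x >>> i) & 1)) << (2 * i);
            key |= ((long) ((y >>> i) & 1)) << (2 * i + 1);
        }
        return key;
    }
}
```

Lookup tables or bit-twiddling variants reduce the per-bit work, but some cost remains on every insert and query, and it grows with dimensionality.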
You are always welcome to ask for help on the mailing list, or here.
I would do an in-memory variant first, to fully understand the index. Then benchmark it to identify optimization potential, and debug it. Until then, you may not have figured out all the corner cases, such as duplicate point handling, degenerate data sets, etc.
Always make on-disk optional. If your data fits into memory, a memory-only implementation will be substantially faster than any on-disk version.
When contributing to ELKI, please:
avoid external dependencies. We've had bad experience with the quality of e.g. Apache Commons, and we want to have the package easy to install and maintain, so we want to keep the .jar dependencies to a minimum (also, having tons of jars with redundant functionality comes at a performance cost). I'm inclined to only accept external dependencies for optional extension modules.
do not copy code from other sources. ELKI is AGPL-3 licensed, and any contribution to ELKI itself should be AGPL-3 licensed, too. In some cases it may be possible to include e.g. public domain code, but we need to keep these to a minimum. We could probably use Apache licensed code (in an external library), but shouldn't mix them. So from a quick look, you are not allowed to copy their source code into ELKI.
If you are looking for data mining project ideas, here is a list of articles/algorithms that we would love to see contributed to ELKI (we keep this list up to date for student implementation projects):
http://elki.dbs.ifi.lmu.de/wiki/ProjectIdeas
Given that it's very hard to find anything about dependency version ranges in the official documentation (the best I could come up with is http://docs.codehaus.org/display/MAVEN/Dependency+Mediation+and+Conflict+Resolution), I wonder if they're still considered a first-class citizen of Maven POMs.
I think most people would agree that they're a bad practice anyway, but I wonder why it's so hard to find anything official about it.
They are not deprecated in the formal sense that they will be removed in a future version. However, their limitations (and the subsequent lack of wide adoption), mean that they are not as useful as originally intended, and also that they are unlikely to get improvements without a significant re-think.
This is why the documentation is only in the form of the design doc - they exist, but important use cases were never finished to the point where I'd recommend generally using them.
If you have a use case that works currently, and can accommodate the limitations, you can expect them to continue to work for the foreseeable future, but there is little beyond that in the works.
I don't know why you think that version ranges are not documented. There is a dedicated section in the Maven Complete Reference documentation.
Nevertheless, a huge problem (in my opinion) is that the documentation states that "Resolution of dependency ranges should not resolve to a snapshot (development version) unless it is included as an explicit boundary." (the link you provided), but the system behaves differently. If you use version ranges, you will get SNAPSHOT versions if they exist in your range (MNG-3092). The discussion about whether this is desired has not ended yet.
Currently, if you use version ranges, you might get SNAPSHOT dependencies. So you really have to be careful and decide whether this is wanted. It might be useful for dependencies you develop yourself, but I doubt that you should use it for 3rd party libraries.
Version ranges are the only reason that Maven is still useful. Even considering not using them is bad practice as it leads you into the disaster of multi-module builds, non-functional parent poms, builds that take 10 minutes or longer, badly structured projects like Spring, Hibernate and Wicket as we cover on our Illegal Argument podcast.
To answer your question, they are not deprecated and are actively used in many projects successfully (except when Sonatype allows corrupt metadata into Apache Maven Central).
If you want a really good example of a non-multi-module build (reactor.xml's only) where version ranges are used extensively, go look at Sticky code (http://code.google.com/p/stickycode/)
Time and again, I've seen people here and everywhere else advocating avoidance of nonportable extensions to the SQL language, this being the latest example. I recall only one article stating what I'm about to say, and I don't have that link anymore.
Have you actually benefited from writing portable SQL and dismissing your dialect's proprietary tools/syntax?
I've never seen a case of someone taking pains to build a complex application on MySQL and then saying "You know what would be just peachy? Let's switch to (PostgreSQL|Oracle|SQL Server)!"
Common libraries in, say, PHP do abstract the intricacies of SQL, but at what cost? You end up unable to use efficient constructs and functions, for a presumed glimmer of portability you most likely will never use. This sounds like textbook YAGNI to me.
EDIT: Maybe the example I mentioned is too snarky, but I think the point remains: if you are planning a move from one DBMS to another, you are likely redesigning the app anyway, or you wouldn't be doing it at all.
Software vendors who deal with large enterprises may have no choice (indeed that's my world) - their customers may have policies of using only one database vendor's products. To miss out on major customers is commercially difficult.
When you work within an enterprise you may be able to benefit from the knowledge of the platform.
Generally speaking, the DB layer should be well encapsulated, so even if you had to port to a new database the change should not be pervasive. I think it's reasonable to take a YAGNI approach to porting unless you have a specific requirement for immediate multi-vendor support. Make it work with your current target database, but structure the code carefully to enable future portability.
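As a sketch of that encapsulation (a generic illustration, not from the answer itself; the table and column names are invented): keep vendor-specific SQL behind a small interface, so a future port only touches one implementation class.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import javax.sql.DataSource;

// The rest of the application depends only on this interface, so vendor-specific
// SQL stays inside one implementation class per DBMS.
interface EmployeeHierarchyDao {
    List<String> findAllReportsOf(long managerId) throws SQLException;
}

// Oracle-specific implementation: free to use CONNECT BY. A PostgreSQL port would
// swap in an implementation based on a recursive CTE without touching callers.
class OracleEmployeeHierarchyDao implements EmployeeHierarchyDao {

    private static final String SQL =
            "SELECT name FROM employees START WITH manager_id = ? "
                    + "CONNECT BY PRIOR employee_id = manager_id";

    private final DataSource dataSource;

    OracleEmployeeHierarchyDao(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    @Override
    public List<String> findAllReportsOf(long managerId) throws SQLException {
        List<String> names = new ArrayList<>();
        try (Connection connection = dataSource.getConnection();
             PreparedStatement statement = connection.prepareStatement(SQL)) {
            statement.setLong(1, managerId);
            try (ResultSet resultSet = statement.executeQuery()) {
                while (resultSet.next()) {
                    names.add(resultSet.getString("name"));
                }
            }
        }
        return names;
    }
}
```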
The problem with extensions is that you need to update them when you're updating the database system itself. Developers often think their code will last forever, but most code will need to be rewritten within 5 to 10 years. Databases tend to survive longer than most applications since administrators are smart enough not to fix things that aren't broken, so they often don't upgrade their systems with every new version. Still, it's a real pain when you upgrade your database to a newer version yet the extensions aren't compatible with that one and thus won't work. It makes the upgrade much more complex and demands more code to be rewritten. When you pick a database system, you're often stuck with that decision for years. When you pick a database and a few extensions, you're stuck with that decision for much, much longer!
The only case where I can see it being necessary is when you are creating software the client will buy and use on their own systems. By far the majority of programming does not fall into this category. To refuse to use vendor-specific code is to ensure that you have a poorly performing database, as the vendor-specific code is usually written to improve the performance of certain tasks over ANSI standard SQL and is written to take advantage of the specific architecture of that database. I've worked with databases for over 30 years and never yet have I seen a company change their backend database without a complete application rewrite as well. Avoiding vendor-specific code in this case means that you are harming your performance for no reason whatsoever most of the time.
I have also used a lot of different commercial products with database backends through the years. Without exception, every one of them was written to support multiple backends and, without exception, every one of them was a miserable, slow dog of a program to actually use on a daily basis.
In the vast majority of applications, I would wager there is little to no benefit, and even a negative effect, from trying to write portable SQL; however, in some cases there is a real use case. Let's assume you are building a time-tracking web application, and you'd like to offer a self-hosted solution.
In this case your clients will need to have a DB server. You have some options here. You could force them into using a specific DBMS, which could limit your client base. If you can support multiple DBMSs, then you have a wider potential client base that can use your web application.
If you're corporate, then you use the platform you are given
If you're a vendor, you have to plan for multiple platforms
Longevity for corporate:
You'll probably rewrite the client code before you migrate DBMS
The DBMS will probably outlive your client code (Java or C# against an '80s mainframe)
Remember:
SQL within a platform is usually backward compatible, but client libraries are not. You are forced to migrate if the OS cannot support an old library, or security environment, or driver architecture, or 16-bit library, etc.
So, assume you had an app on SQL Server 6.5. It still runs with a few tweaks on SQL Server 2008. I bet you're not using the same client code...
There are always some benefits and some costs to using the "lowest common denominator" dialect of a language in order to safeguard portability. I think the dangers of lock-in to a particular DBMS are low, when compared to the similar dangers for programming languages, object and function libraries, report writers, and the like.
Here's what I would recommend as the primary way of safeguarding future portability. Make a logical model of the schema that includes tables, columns, constraints and domains. Make this as DBMS independent as you can, within the context of SQL databases. About the only thing that will be dialect dependent is the datatype and size for a few domains. Some older dialects lack domain support, but you should make your logical model in terms of domains anyway. The fact that two columns are drawn from the same domain, and don't just share a common datatype and size, is of crucial importance in logical modelling.
If you don't understand the distinction between logical modeling and physical modeling, learn it.
Make as much of the index structure portable as you can. While each DBMS has its own special index features, the relationship between indexes, tables, and columns is just about DBMS independent.
In terms of CRUD SQL processing within the application, use DBMS specific constructs whenever necessary, but try to keep them documented. As an example, I don't hesitate to use Oracle's "CONNECT BY" construct whenever I think it will do me some good. If your logical modeling has been DBMS independent, much of your CRUD SQL will also be DBMS independent even without much effort on your part.
When it comes time to move, expect some obstacles, but expect to overcome them in a systematic way.
(The word "you" in the above is to whom it may concern, and not to the OP in particular.)