What's the Difference Between Apache Jackrabbit and Jackrabbit Oak?

I'm sorry if this sounds stupid, but what's the difference between Jackrabbit and Oak? I'm looking into JCR (JSR 170), specifically how to migrate content between two Jackrabbit installations, and I've come across both Jackrabbit and Oak.
The Apache Jackrabbit™ content repository is a fully conforming
implementation of the Content Repository for Java Technology API (JCR,
specified in JSR 170 and JSR 283).
A content repository is a hierarchical content store with support for
structured and unstructured content, full text search, versioning,
transactions, observation, and more.
Jackrabbit Oak is a complementary implementation of the JCR
specification. It is an effort to implement a scalable and performant
hierarchical content repository for use as the foundation of modern
world-class web sites and other demanding content applications. See
the Jackrabbit Oak website for more information.
Apache Jackrabbit is a project of the Apache Software Foundation
http://jackrabbit.apache.org/jcr/index.html
Their own home page says that Jackrabbit is a content repository implementing JCR and that Oak is a complementary implementation of JCR. Why are there two implementations of JCR by the same project?

As awd mentioned in the comment, Oak is essentially the latest generation of Jackrabbit. It is not just an update, but a new implementation of the same JCR specification (JSR 170/283). So the API does not change, but the underlying inner workings are quite different (see the short sketch after the links below). You can find a lot in the documentation, as Julian mentions. Some of the major changes are:
Session handling: each session works on a snapshot of the repository taken when the session was created, so it is isolated from concurrent changes made by other sessions: http://jackrabbit.apache.org/oak/docs/architecture/transactional-model.html
MicroKernels: pluggable storage backends that define how the repository is persisted. Content can be stored, as before, in tar files (the tarMK) or in the NoSQL database MongoDB (the mongoMK).
Here is an overview of the changes: http://jackrabbit.apache.org/oak/docs/differences.html
And a short slideshow:
https://www.slideshare.net/jukka/oak-the-architecture-of-apache-jackrabbit-3
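To illustrate the point that the JCR API itself stays the same, here is a minimal sketch (not from the slides): only the repository construction is Oak-specific, everything after it is plain JCR that would also run against Jackrabbit 2.x. It uses Oak's Jcr builder with its default in-memory node store and the default admin credentials.
import javax.jcr.Node;
import javax.jcr.Repository;
import javax.jcr.Session;
import javax.jcr.SimpleCredentials;
import org.apache.jackrabbit.oak.Oak;
import org.apache.jackrabbit.oak.jcr.Jcr;

public class OakQuickstart {
    public static void main(String[] args) throws Exception {
        // Oak-specific: build a JCR Repository on top of Oak (default in-memory node store)
        Repository repository = new Jcr(new Oak()).createRepository();

        // Plain JCR from here on; the same code works against Jackrabbit 2.x
        Session session = repository.login(
                new SimpleCredentials("admin", "admin".toCharArray()));
        try {
            Node hello = session.getRootNode().addNode("hello");
            hello.setProperty("message", "Hello, Oak");
            session.save();
            System.out.println(session.getNode("/hello").getProperty("message").getString());
        } finally {
            session.logout();
        }
    }
}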

I am currently also looking at the differences between them.
Apache Oak currently does not support locking (https://jira.apache.org/jira/browse/OAK-6421) or merging, therefore we might use Jackrabbit instead.

Related

How do you perform proper backups in Apache Sling

I am planning the backup strategy for my Sling application, in which users are able to register themselves and create their own content.
To be able to recover from a crash, I tried to create a content package using the Composum package manager. This kind of backup works fine for the content but not for the users.
Any ideas on how to back up these self-registered users?
BR
Tim
From the question, I understand that you have a JCR repository as your content store and Apache Sling as a middleware layer that talks to that repository.
Since Apache Sling is middleware that has no storage or users of its own, I believe you are referring to users stored in the JCR repository.
You can then follow this article to export or back up any data as XML; content in a JCR repository can be exported to XML.
https://jackrabbit.apache.org/archive/wiki/JCR/BackupAndMigration_115513344.html
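If you only need the users, a minimal sketch of that XML export (system view) via the JCR API follows. It assumes the users live under /home/users, the default user root in Jackrabbit 2.x; adjust the path to your repository layout (for example /rep:security/rep:authorizables/rep:users on Oak).
import javax.jcr.Session;
import java.io.FileOutputStream;
import java.io.OutputStream;

public class UserBackup {
    // Export the user subtree as a system-view XML document.
    public static void exportUsers(Session session, String targetFile) throws Exception {
        try (OutputStream out = new FileOutputStream(targetFile)) {
            // skipBinary = false keeps binary properties, noRecurse = false exports the whole subtree
            session.exportSystemView("/home/users", out, false, false);
        }
    }
}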

Upload / Download BLOBs Jackrabbit

I'm looking for a way to provide an upload/download mechanism for BLOB files.
These files should be stored in a Jackrabbit datasource repository. I already discovered the Apache Sling framework for this task, but it seems that it is not applicable for large files (BLOBs), since it has its own Jackrabbit implementation and cannot be used with my Jackrabbit datasource repository. Do you have any ideas on how to solve this?
See http://markmail.org/message/rcmsahf4n3olostm where your on-list question was answered.
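Independently of that thread, here is a minimal sketch of how a BLOB can be stored and streamed back with the plain JCR 2.0 API (nt:file / nt:resource); the parent path, file name and MIME type are placeholders you would supply.
import java.io.InputStream;
import java.util.Calendar;
import javax.jcr.Binary;
import javax.jcr.Node;
import javax.jcr.Session;

public class BlobStore {
    // Upload: an nt:file node with an nt:resource child that holds the binary.
    public static void upload(Session session, String parentPath, String fileName,
                              InputStream data, String mimeType) throws Exception {
        Node parent = session.getNode(parentPath);
        Node file = parent.addNode(fileName, "nt:file");
        Node content = file.addNode("jcr:content", "nt:resource");
        Binary binary = session.getValueFactory().createBinary(data); // streamed into the repository
        content.setProperty("jcr:data", binary);
        content.setProperty("jcr:mimeType", mimeType);
        content.setProperty("jcr:lastModified", Calendar.getInstance());
        session.save();
    }

    // Download: stream the stored binary back out.
    public static InputStream download(Session session, String filePath) throws Exception {
        Node content = session.getNode(filePath + "/jcr:content");
        return content.getProperty("jcr:data").getBinary().getStream();
    }
}
With a data store configured (for example the FileDataStore), Jackrabbit spools such binaries to the data store instead of keeping them in memory, which is what makes this workable for large files.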

Apache Ivy: Where do I put all these JARs?

I'm trying to convince the higher-ups at my work place to migrate to Apache Ivy. I've managed to get a few sandbox projects working using Ivy to power the build, and now I have a greenlight to put together a migration proposal.
We all agree on one thing: we don't want to trust JARs that are located in public directories! I know, I know, a bit paranoid, yes. But we'd like to have a setup where we pull a JAR from a trusted source (either downloading it from the open source project itself, or most likely, gulp, a public repo), and use it for some time before we "certify" it (give it our blessing as a safe artifact to use).
Then we want to have a common repository for all JARs used by our many projects.
My original thinking was to put this repository in version control (we have an SVN server). But I wasn't sure what best practices dictate. It might make more sense to put our JARs on a file server and FTP to them in the Ivy script.
Either way, SVN (HTTPS) or FTP, all of our servers are authenticated. So, a small number of questions:
Where should we be publishing all of our "certified" JARs (everything from `log4j` to any homegrown JARs we produce)? What do best practices dictate?
The "ivyrep" resolver-type does not take username or passwd atrributes. If our "JAR server" (FTP, SVN, etc.) is authenticated, how do I configure the Ivy scripts to login?
I must echo Brian's recommendation to use a repository manager like Nexus. It's a lot less work in the long run. You'll also discover that the professional version of Nexus enables you to create approval processes around repositories which you plan to use in your build. See the procurement suite functionality.
If, on the other hand, you are determined to build your own repository, then ivy has the tools for the job. You need to become very familiar with the ivy settings file and how it declares and uses resolvers.
If the repository is accessible via HTTPS, the url resolver should be able to access it. The resolver will assume that each version of an artifact is in a different directory, and you'll need to specify the URL pattern that Ivy will use when accessing the repository:
<url name="two-patterns-example">
    <ivy pattern="http://ivyrep.mycompany.com/[module]/[revision]/ivy-[revision].xml" />
    <artifact pattern="http://ivyrep.mycompany.com/[module]/[revision]/[artifact]-[revision].[ext]" />
</url>
The patterns are fully flexible with respect to how you store the artifacts.
Authentication is also handled in the settings file using the credentials tag.
Finally, the FTP protocol is also supported. It's hard to find in the doco, but it's supported by the vfs resolver.
I think that's enough information on an option I don't recommend :-) Having said that I once created an FTP based repository for managing releases to clients. It's useful to have a tool this powerful :-)
Why not use something like Sonatype's Nexus? I've seen it used for Maven, and I believe it'll work for Ivy.
You can set it up to download from remote repositories into (say) a 'test' repository. You can then evaluate those .jars, and if they're good, upload them into an 'approved' repository for general consumption. There's some authentication surrounding this, but you'd have to evaluate that in greater depth. Certainly you can restrict the uploading into repositories via a username/password pair.

How do you backup an apache Jackrabbit repository without shutting Jackrabbit down?

When running Apache Jackrabbit JCR as an embedded service in your app, is there a quick way to get a sound and consistent backup of the contents of the Jackrabbit repository without shutting Jackrabbit down? If so, how?
See BackupAndMigration on the Jackrabbit Wiki for a list of options.
I would recommend using the XML export (system view), as it is the simplest solution. It is also part of the JCR standard, so it should work on other JCR implementations as well.
Note that this approach has one drawback: it is currently not possible to re-import a full export, i.e. one taken from the root node and including the jcr:system subtree that contains the version storage, since the jcr:system part and especially the version storage are not writable (this is mainly because JCR does not specify how to import versions). There is some explanation of this on the Jackrabbit mailing list.
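If you go the system-view route, here is a minimal sketch of the export and re-import side. It exports a content subtree rather than the root (so jcr:system and the version storage are left out), and the paths and file names are placeholders.
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import javax.jcr.ImportUUIDBehavior;
import javax.jcr.Session;

public class SystemViewBackup {
    // Export a content subtree (not the root, so jcr:system is skipped).
    public static void backup(Session session, String absPath, String targetFile) throws Exception {
        try (OutputStream out = new FileOutputStream(targetFile)) {
            session.exportSystemView(absPath, out, false, false); // keep binaries, recurse
        }
    }

    // Re-import the subtree below a parent node in the target repository.
    public static void restore(Session session, String parentAbsPath, String sourceFile) throws Exception {
        try (InputStream in = new FileInputStream(sourceFile)) {
            session.importXML(parentAbsPath, in,
                    ImportUUIDBehavior.IMPORT_UUID_COLLISION_REPLACE_EXISTING);
            session.save();
        }
    }
}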

Migration To Trac

We are managing our development with Subversion over HTTPS, Bugzilla, and Mediawiki. Some of our developers have expressed an interest in migrating to Trac, so I have to evaluate what the cost of doing so would be.
For both the wiki and bugzilla, we would need to either migrate the existing data into Trac or a way to integrate with trac. Having two apps to create wiki pages or log bugs would not be acceptable. Also, currently each of these applications requires a separate sign on so we would need to map each of these accounts into Trac.
So know of any easy methods of importing or integrating these systems with Trac and/or a tutorial for doing so?
For Bugzilla, Trac has a script bugzilla2trac.py that will automate the process of importing Bugzilla bugs to Trac tickets for you. Of course, Trac doesn't have support for blocking/blockedby tickets out of the box, so if you want to import this data too, you'll have to use the MasterTicketsPlugin and then modify the script yourself (which is what we did when we migrated).
Wiki pages
If you can export your pages to text files, you can import them using the trac-admin wiki import command: http://trac.edgewall.org/wiki/TracAdmin. Some formatting clean-up might be in order after the migration.
Tickets/Bugs
This script by Tom Lazar gives you the ability to synchronize the ticket system with a CSV file.
That way you can migrate from Bugzilla: http://bitten.edgewall.org/wiki/TracImport
You could also check out this resource: http://trac.edgewall.org/wiki/TracSynchronize
For MediaWiki there is a script as well: http://trac.edgewall.org/ticket/5241
It has some bugs, but it imports all the important information (pages, revisions, images, users). Together with the other scripts mentioned, you should be able to migrate to Trac.
One thing that is not covered (yet) by the import script is the resolution of bugzilla links of the kind bug X or bug X comment Y.
One solution for this is to use the RegexLinkPlugin (http://trac-hacks.org/wiki/RegexLinkPlugin) with the following configuration in the trac.ini file:
[regexlink]
regex1=\bbug (?P<bug_id_comment>\d+) comment #(?P<commentid>\d+)\b
url1=http://your.trac.instance.com/ticket/\g<bug_id_comment>#comment:\g<commentid>
regex2=\bbug (?P<bug_id>\d+)\b
url2=http://your.trac.instance.com/ticket/\g<bug_id>