Is it possible to migrate a Kettle file based repository (including history of all files) into a Kettle database repository?
Yes. Just open all the files (all together or one by one) and save them (all together or one by one) in the repository.
However, you will lose all the versioning (maybe that is what you call file history), unless you have the enterprise edition.
If you are on the community version, I suggest you use the database repository for the current version and, once a week or so, export it to a file repository.
We need to migrate our VSTS team project. I already saw that this is an eagerly awaited feature on Visual Studio User Voice.
However, in our case the new team project is to be in the same VSTS account. Is there a way to do this while also keeping version control change history? Keeping the change history available as part of the old team project is unfortunately not an option as we will lose access to the old team project soon after migration.
If somebody has done this before with the help of any of the tools below, it would be great if they could share their experience:
VSTS copy project
VSTS sync migrator
OpsHub
It's a bit unclear what you're about to migrate from where, and why you'd lose access to the existing project. You also have different options depending on the source control type currently in use.
One option which you could try is to create 2 new accounts and leave the whole old account in read-only state. That should leave the history available to everyone. You can then spin up as many new accounts as you want, using just the latest version of the sources.
Git
If it's a Git repository it's as simple as making a local clone of the whole repo, creating a new team project in VSTS and pushing the clone into its second home.
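A minimal sketch of that flow, assuming hypothetical account and project names (adjust the URLs to your own):

    # clone everything, including all branches and tags
    git clone --mirror https://myaccount.visualstudio.com/DefaultCollection/_git/OldProject
    cd OldProject.git
    git remote add neworigin https://myaccount.visualstudio.com/DefaultCollection/_git/NewProject
    # --mirror pushes every ref, so the full history travels with the clone
    git push neworigin --mirror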
TFVC
If it's TFVC, it's much harder. I've used OpsHub in the past which works reasonably well, but in our case completely got stuck in a couple of strange merge situations. Those were probably created as part of work done back when that team project was hosted in TFS 2008, so you may be luckier than we were.
You could decide to move to Git as part of your migration. Use git-tfs to create a local git repository with all your TFVC history and then push that into a bare Git repository in your new team project. Or use the TFVC import tool. There's quite a bit of documentation on this subject.
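A rough sketch of the git-tfs route, with a made-up collection URL and project names:

    # replay the full TFVC changeset history into a local git repository
    git tfs clone https://myaccount.visualstudio.com/DefaultCollection '$/OldProject' oldproject-git
    cd oldproject-git
    # push the converted history into the Git repository of the new team project
    git remote add origin https://myaccount.visualstudio.com/DefaultCollection/_git/NewProject
    git push origin --all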
As far as I can tell, the VSTS Sync Migrator only supports a snapshot without history, which would not suit you.
VSTS Copy Project doesn't support TFVC, so it is not an option in this case.
An option that's missing from your list is Timely Migration, it supports TFVC to TFVC migrations among other options. I've used them a long time ago to copy data between TFS servers. Back then they were working exactly as advertised.
I want to use Pentaho for my work. After a bit of research I found that to store the ktr/kjb files I can use either a database or the file system as a repository. However, I don't see any benefit of using a database repository over a file-system one. The basic purpose of the repository here is to create a common location where I can keep all the developed ktr/kjb files in the production environment.

If I consider the database repository, it will hold all the developed ktr/kjb files in production, and every time I need to run a job/transformation I will connect to the database to get the respective ktr/kjb file (similar to how Informatica stores transformations); a file-based repository, on the other hand, is simply a folder holding all the developed files.
Could somebody here explain the pros and cons of both types of repository?
Please let me know if you need any other information.
Thanks in advance.
When several people develop the same jobs/transformations, the database repository will hold the changes and ensure everyone gets the latest versions.
The pros of a filesystem are of course ease of backup, no database connection to trouble you, and the possibility of using other, more modern and mature version control systems for the files than the database repositories use.
If you are using the free community edition, I would definitely go with the file repository, along with external file-based version control and migration systems. If you are using the enterprise edition, then you might want to consider the database repository, since you can then use Pentaho's built-in version control and migration systems.
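As a sketch of what that external, file-based version control could look like (the path is hypothetical and git is just one choice; any VCS will do):

    cd /srv/pentaho/file-repo          # hypothetical folder holding the .ktr/.kjb files
    git init
    git add .
    git commit -m "Snapshot of jobs and transformations"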
I'm looking for a "repository" to store derived information (build artifacts).
We have a repository (currently Mercurial) to store our source code. When something is pushed to the source repository, the code goes through a continuous integration server and we do an incremental build, and as a result some dlls will change. These should be added to some "repository" so that everybody can use that version without needing to do the build again.
I'm looking for the following features:
It should be easy to update the source code and get the corresponding binaries (we could probably make a script for that)
You should easily get all binaries at once (not only those that changed during the last incremental build).
Binaries that weren't changed should only be stored once in the repository.
When updating the source code and the binaries only the changed binaries should be transferred (and not all binaries). This is similar to what happens for source code.
When updating to some version, only that version should be stored locally, not the complete history.
We should be able to remove certain versions from the binary "repository" after a while. However, if the dlls are still necessary for subsequent incremental builds, these dlls should of course not be completely removed from the "repository".
What would fit these requirements?
I agree with Manfred, what you are looking for is a binary repository manager. Besides the Nexus repository manager you should consider Artifactory.
As for the feature list you asked about:
As you have mentioned, the CI server should be responsible for identifying a change in version control and starting a build process which creates the binaries. The CI server/build tool should also deploy the generated binaries to the repository manager if the build was successful. Artifactory offers a build integration feature which takes care of deploying the binaries together with the build metadata.
Using the build integration feature of Artifactory, you can get a list of all the binaries generated by a specific build and download them as an archive. Artifactory provides a REST API for those actions.
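To give an idea of what that looks like in practice, here is a hedged sketch of deploying and retrieving a single binary over the REST API; the host, repository name, path and credentials are all made up:

    # deploy a binary produced by the build (HTTP PUT)
    curl -u user:password -T target/app-1.0.jar \
      "https://artifactory.example.com/artifactory/libs-release-local/com/example/app/1.0/app-1.0.jar"

    # fetch the same binary on another machine (HTTP GET)
    curl -u user:password -O \
      "https://artifactory.example.com/artifactory/libs-release-local/com/example/app/1.0/app-1.0.jar"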
There are different approaches to storing artifacts in a repository manager. Some tools store multiple copies of the same binary. Others, for example Artifactory, use checksum-based storage which keeps only one copy per binary (based on its checksum). This pays off if you keep multiple copies of the same binary in different repositories, especially if you are dealing with large binaries (war files, docker images, ISOs etc.). Another benefit is cheap copies/moves between repositories, which is a common practice for promotion workflows.
The Artifactory build integration uses checksum-based deployment, which deploys only binaries that do not already exist in Artifactory. For binaries which do exist and have not changed, it only creates a new reference to the existing binary, saving the need to send the actual bytes.
Artifactory provides multiple options for cleaning up binaries, including built-in cleanup policies and the option to develop your own custom logic using user plugins and the Artifactory Query Language (AQL).
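For example, a cleanup job could first query candidates with AQL; this is only a sketch, with an assumed host, repository name and age threshold:

    # list artifacts in a snapshot repository that are older than 90 days (candidates for deletion)
    curl -u user:password -X POST "https://artifactory.example.com/artifactory/api/search/aql" \
      -H "Content-Type: text/plain" \
      -d 'items.find({"repo":"libs-snapshot-local","created":{"$before":"90d"}})'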
In addition, I highly recommend taking a look at the binary repository comparison matrix.
Disclaimer: I work for JFrog, the company behind Artifactory.
You are basically asking for a repository manager like the Nexus Repository Manager as you have correctly identified with the tags.
In terms of the specific requirements from your question, here are a couple of ideas.
Binary components are typically identified via coordinates that most of the time include some sort of name and version. A release and build process changes those and deploys the components to the repository. This allows you to match source code with binaries. You can also embed information like git refs in the produced binaries.
Accessing the binaries is typically done via HTTP, so it's easy. You then just have to determine what it means to get "all binaries".
Not duplicating binaries that are essentially the same can be supported by the underlying file system or by the build tool. I have seen both approaches work. Often, however, it is not worth the effort since storage is cheap.
There are various ways to automatically clean up repositories, including scheduled tasks that do it regularly. Worst case, you have to implement your own logic in an extension.
Disclaimer: I work as community advocate and trainer for the Nexus Repository Manager with Sonatype.
Recently I started to incorporate good practices into my development workflow, so I split the development server from the production one. I also introduced versioning using Subversion (TortoiseSVN).
Now I have the problem of synchronizing the production server (Apache shared hosting) with the files of the latest development version on my local machine.
Before, I didn't have this problem because I worked directly with the server files through FileZilla. But now I don't know how to transfer the files efficiently, or what the good practices are in this respect.
I read something about Ant and Phing, but I'm not sure whether they are appropriate for my case or just unnecessary complexity.
Rsync is a cross-platform tool designed to help in situations like this; I've used it for similar purposes on multiple occasions. This DevShed tutorial may be of some help.
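A minimal sketch, assuming you have SSH access to the shared host (the paths and host name are placeholders):

    # push the local working copy to the production web root, skipping Subversion metadata
    rsync -avz --delete --exclude '.svn' \
      /path/to/local/working-copy/ \
      user@example.com:/home/user/public_html/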
I don't think you want to "automate" it so much as establish control over your deployment and integration process. I generally like SVN, but it has some bugs, and one problem I have with it is that it doesn't support baselining; instead you need to make a physical branch of your repository if you want a stable version to promote to higher environments while continuing to advance the trunk.
Anyway, you should look at continuous integration and Jenkins. This is a rather wide topic to which no single answer can be given. There are many ins, outs, and what-have-yous. It depends on your application platform and components, whether you have database changes, whether you are dealing with external web services or third-party APIs, etc.
Maybe there are more structured solutions out there, but with TortoiseSVN you can export only the files changed between revisions, in a folder tree structure, and then upload them as always with FileZilla.
Take a look at:
http://verysimple.com/2007/09/06/using-tortoisesvn-to-export-only-newmodified-files/
Using TortoiseSVN, right-click on your working folder and select “Show Log” from the TortoiseSVN menu.
Click the revision that was last published.
Ctrl+Click the HEAD revision (or whatever revision you want to release) so that both the old and the new revisions are highlighted.
Right-click on either of the highlighted revisions and select “Compare revisions.” This will open a dialog window that lists all new/modified files.
Select all files from this list (Ctrl+A), then right-click on the highlighted files and select “Export selection to…”
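If you prefer the command line, roughly the same result can be had with the svn client; the revision number and URL below are just examples:

    # list everything added/modified/deleted between the last published revision and HEAD
    svn diff --summarize -r 1200:HEAD https://svn.example.com/repo/trunk
    # each listed path can then be exported (svn export) into a clean folder and uploaded via FTP as usual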
Side note:
You have to give more details about your workflow and configuration; the applicable solutions depend on them. I see 4 main nodes in the game: Workplace, Repo Server, DEV, PROD. Some nodes may be combined (1+2, 2+3) and may have different sets of tools (do you have SSH, rsync, NFS, Subversion clients on DEV|PROD?). All details matter.
In any case, Subversion repositories have a feature called hooks; in your case a post-commit hook (executed on the repository server side after each commit) may be used.
In this hook (any code which can be executed in unattended mode) you can define and implement any rules for deploying to any target under any conditions. You only need to know:
Which transport will be used for transferring files
What your webspaces on the servers are (working copies or just clean unversioned files; both solutions have their pros and cons), as this determines which deployment policy ("export" or "update") you have to implement in the hook
There are also scripts around which export the files affected by a revision (or a range of revisions) into an unversioned tree.
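To make the idea concrete, here is a minimal post-commit hook sketch following the "export" policy; the transport (rsync over SSH), target host and paths are assumptions you would replace with your own:

    #!/bin/sh
    # hooks/post-commit on the repository server; Subversion passes the repo path and revision
    REPOS="$1"
    REV="$2"

    EXPORT_DIR="/tmp/deploy-$REV"
    # dump a clean, unversioned tree of trunk at the committed revision
    svn export -q -r "$REV" "file://$REPOS/trunk" "$EXPORT_DIR"
    # transfer only the changes to the production web root
    rsync -az --delete "$EXPORT_DIR"/ deploy@prod.example.com:/var/www/html/
    rm -rf "$EXPORT_DIR"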
I want to enable user customization at runtime with Oracle ADF. With JDeveloper, if I deploy my application directly to the server, a window lets me choose a repository previously registered in my Enterprise Manager, as depicted in the following image:
Deployed this way, my application works great. Now I would like to deploy to an EAR file (in order to keep a copy of my release), but I don't know how to target the MDS repository. On the web I've found that maybe the adf-config.xml file has to be modified, but in which way?
Have a look at this blog entry, which explains how to do it. Apparently this information needs to be provided upon EAR file deployment:
http://andrejusb.blogspot.de/2011/05/target-mds-repository-for-adf.html