Can't svn2git public repos - git-svn

I need to migrate a customer from SVN to Git, so I wanted first to try svn2git on a public SVN repository.
I have found several public repos, e.g., https://svn.alfresco.com/repos/alfresco-open-mirror/alfresco and http://svn.apache.org/repos/asf/spamassassin. Running svn co on them works without problems, but when I try svn2git, I get the following output:
D:\Documents\work\svn2git\apache>svn2git http://svn.apache.org/repos/asf/spamassassin
Initialized empty Git repository in D:/Documents/work/svn2git/apache/.git/
Using higher level of URL: http://svn.apache.org/repos/asf/spamassassin => http://svn.apache.org/repos/asf
W: Ignoring error from SVN, path probably does not exist: (160013): Filesystem has no item: REPORT request failed on '/repos/asf/!svn/bc/100': File not found: revision 100, path '/spamassassin'
W: Do not be alarmed at the above message git-svn is just searching aggressively for old history.
This may take a while on large repositories
Checked through r100
Checked through r200
Checked through r300
It ran the whole night, and ended with:
Checked through r22000
Checked through r22100
W: Ignoring error from SVN, path probably does not exist: (175002): RA layer request failed: PROPFIND request failed on '/repos/asf': PROPFIND of '/repos/asf': could not connect to server (http://svn.apache.org)
W: Do not be alarmed at the above message git-svn is just searching aggressively for old history.
This may take a while on large repositories
Checked through r477700
Path 'spamassassin' was probably deleted:
RA layer request failed: PROPFIND request failed on '/repos/asf': PROPFIND of '/repos/asf': could not connect to server (http://svn.apache.org)
Will attempt to follow revisions r477601 .. r477700 committed before the deletion
r477601 .. r477679 OK
Checked through r748600
Path 'spamassassin' was probably deleted:
RA layer request failed: PROPFIND request failed on '/repos/asf': PROPFIND of '/repos/asf': could not connect to server (http://svn.apache.org)
Will attempt to follow revisions r748501 .. r748600 committed before the deletion
Checked through r748700
Checked through r748800
Checked through r748900
Checked through r749000
Checked through r749100
Checked through r749200
Checked through r749300
Checked through r749400
W: Ignoring error from SVN, path probably does not exist: (175002): RA layer request failed: PROPFIND request failed on '/repos/asf/!svn/vcc/default': PROPFIND of '/repos/asf/!svn/vcc/default': could not connect to server (http://svn.apache.org)
W: Do not be alarmed at the above message git-svn is just searching aggressively for old history.
This may take a while on large repositories
Checked through r805700
Path 'spamassassin' was probably deleted:
RA layer request failed: PROPFIND request failed on '/repos/asf/!svn/vcc/default': PROPFIND of '/repos/asf/!svn/vcc/default': could not connect to server (http://svn.apache.org)
Will attempt to follow revisions r805601 .. r805700 committed before the deletion
Checked through r805800
Checked through r805900
Checked through r806000
Checked through r806100
Checked through r806200
Checked through r806300
Checked through r806400
Checked through r806500
Checked through r806600
Checked through r806700
command failed:
git checkout -f master
Why does it happen? Is it a permission problem?

Well, so far everything looks fine. If it fails, you should specify with what message it fails.
But besides that, git-svn (the svn2git tool you are using is based on git-svn) is not the right tool for one-time conversions of repositories or parts of repositories. It is a great tool if you want to use Git as a frontend for an existing SVN server, but for a one-time conversion you should not use git-svn or tools based on it; use the KDE svn2git instead, which is much better suited to this use case.
There are plenty of tools called svn2git; probably the best one is the KDE one from https://github.com/svn-all-fast-export/svn2git. I strongly recommend using that svn2git tool. It is the best I know of, and it is very flexible in what you can do with its rules files. The svn2git that is based on git-svn suffers from most of the same drawbacks as pure git-svn; it just works around some of the issues.
You will easily be able to configure svn2git's rules file to produce the result you want from your current SVN layout, including any complex histories that might exist, and including producing several Git repos out of one SVN repo or combining different SVN repos into one Git repo cleanly in one run if you like.
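For illustration, here is a minimal sketch of what such a rules file could look like, wrapped in a shell function that writes it to disk. It assumes the standard trunk/branches/tags layout below /spamassassin/ (an assumption, not something checked against the real Apache repository); write_rules and the repository name are made-up names for this example.

```shell
# Hypothetical helper: writes a minimal rules file for the KDE svn2git
# (svn-all-fast-export) to the path given as $1.
# Assumed SVN layout: /spamassassin/{trunk,branches,tags}/
write_rules() {
    cat > "$1" <<'EOF'
create repository spamassassin
end repository

# trunk becomes master
match /spamassassin/trunk/
  repository spamassassin
  branch master
end match

# each directory below branches/ becomes a Git branch of the same name
match /spamassassin/branches/([^/]+)/
  repository spamassassin
  branch \1
end match

# each directory below tags/ becomes a ref below refs/tags/
match /spamassassin/tags/([^/]+)/
  repository spamassassin
  branch refs/tags/\1
end match
EOF
}
```

You would call write_rules rules.txt and then hand the file to the tool through its --rules option. As far as I know, the KDE svn2git reads the repository from a local path rather than from a URL, so a remote repository would have to be mirrored locally first (for example with svnsync).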
If you are not 100% sure about the history of your repository, svneverever from http://blog.hartwork.org/?p=763 is a great tool for investigating the history of an SVN repository when migrating it to Git.
Even though git-svn is easier to start with, here are some further reasons why using the KDE svn2git instead of git-svn is superior, besides its flexibility:
the history is rebuilt much better and cleaner by svn2git (if the correct one is used), this is especially the case for more complex histories with branches and merges and so on
the tags are real tags and not branches in Git
with git-svn, the tags contain an extra empty commit, which also means they are not part of the branches, so a normal fetch will not get them until you pass --tags to the command, because by default only tags pointing to fetched branches are fetched. With the proper svn2git, tags are where they belong
if you changed the layout in SVN, you can easily configure this with svn2git; with git-svn you will eventually lose history
with svn2git you can also split one SVN repository into multiple Git repositories easily
or combine multiple SVN repositories in the same SVN root into one Git repository easily
the conversion is a gazillion times faster with the correct svn2git than with git-svn
You see, there are many reasons why git-svn is worse and the KDE svn2git is superior. :-)

How to tag a git repo in a bamboo build

I'm trying to tag the git repo of a ruby gem in a Bamboo build. I thought doing something like this in ruby would do the job
`git tag v#{current_version}`
`git push --tags`
But the problem is that the repo has no origin remote. Somehow Bamboo is getting rid of the origin.
Any clue?
Yes, if you navigate to the job workspace, you will find that Bamboo does not do a straightforward git clone "under the hood", and that the remote is set to an internal file path.
Fortunately, Bamboo does store the original repository URL in a variable, ${bamboo.planRepository.repositoryUrl}, so all you need to do is set a remote pointing back at the original and push to there. This is what I've been using with both basic Git repositories and Stash, creating a tag based on the build number.
git tag -f -a ${bamboo.buildNumber} -m "${bamboo.planName} build number ${bamboo.buildNumber} passed automated acceptance testing." ${bamboo.planRepository.revision}
git remote add central ${bamboo.planRepository.repositoryUrl}
git push central ${bamboo.buildNumber}
git ls-remote --exit-code --tags central ${bamboo.buildNumber}
The final line is simply to cause the task to fail if the newly created tag cannot be read back.
EDIT: Do not be tempted to use the variable ${bamboo.repository.git.repositoryUrl}, as this will not necessarily point to the repo checked out in your job.
Also bear in mind that if you're checking out from multiple sources, ${bamboo.planRepository.repositoryUrl} points to the first repo in your "Source Code Checkout" task. The more specific URLs are referenced via:
${bamboo.planRepository.1.repositoryUrl}
${bamboo.planRepository.2.repositoryUrl}
...
and so on.
I know this is an old thread, however, I thought of adding this info.
From Bamboo version 6.7 onwards, there is a built-in Git repository tagging feature, Repository Tag.
You can add a repository tagging task to the job and use a Bamboo variable as the tag name.
You must have Bamboo and Bitbucket integrated via an application link.
It seems that after a checkout by the bamboo agent, the remote repository url for origin is set as file://nothing
[remote "origin"]
url = file://nothing
fetch = +refs/heads/*:refs/remotes/origin/*
So we can either update the url using git remote set-url or, as in my case, create a new alias so that the existing behavior is not broken. There must be a good reason why it is set this way.
[remote "build-origin"]
url = <remote url>
fetch = +refs/heads/*:refs/remotes/build-origin/*
I also noticed that using ${bamboo.planRepository.<position>.repositoryUrl} did not work for me since it was defined in my plan as https. Switching to ssh worked.
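The two options mentioned in this answer (updating an existing remote or adding a new alias) can be folded into one small helper so the script task can be re-run safely. This is only a sketch; ensure_remote is a made-up name, and it relies on git remote get-url, which needs Git 2.7 or newer.

```shell
# Hypothetical helper: make sure remote $1 points at URL $2.
# If the remote already exists its URL is updated, otherwise it is added.
ensure_remote() {
    name=$1
    url=$2
    if git remote get-url "$name" >/dev/null 2>&1; then
        git remote set-url "$name" "$url"
    else
        git remote add "$name" "$url"
    fi
}

# Example call inside a Bamboo script task (not executed here):
#   ensure_remote build-origin "${bamboo.planRepository.repositoryUrl}"
```

Using a separate alias such as build-origin leaves the origin remote that Bamboo set up untouched.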

How can I recover after a checksum mismatch with 'git svn clone'?

I'm cloning an SVN repository to git as part of our migration plan. I've hit various snags along the way, forcing me to continue the clone with a git svn fetch command. The most recent failure I can't figure out how to solve:
$ git svn fetch
Checksum mismatch: dc/trunk-4632-jh/dc-smtpd/lib/Qpsmtpd/Address.pm.t 8ce3aea3f47dc115e8fe53bd62d0f074cfe93ec6
expected: 59de969022e46135fa6dc7599fc2f3b4
got: 4334926a01c905cdb7fce71265e370c1
I found this related answer, however that solution doesn't work because git svn log is not yet functional, as the repo is not fully in place:
$ git svn log dc/trunk-4632-jh/dc-smtpd/lib/Qpsmtpd/Address.pm.t
fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions
log --no-color --first-parent --pretty=medium HEAD: command returned error: 128
How can I proceed?
Another answer to an old question, but straightforward solutions are tough to find for this problem, so hopefully this helps others.
I think this issue occurs due to a file being corrupted during transfer. Not sure how or why it happens, but in my case, I get the same error at different revisions every time I do a new clone, and sometimes not at all.
Using the questioner's error message:
$ git svn fetch
Checksum mismatch: dc/trunk-4632-jh/dc-smtpd/lib/Qpsmtpd/Address.pm.t
8ce3aea3f47dc115e8fe53bd62d0f074cfe93ec6
expected: 59de969022e46135fa6dc7599fc2f3b4
got: 4334926a01c905cdb7fce71265e370c1
The following steps allowed me to resume and make progress:
1) View all branches (these will all be remote branches): git branch -a
2) Check out the affected branch (this will take some time to complete): git checkout remotes/origin/trunk-4632-jh
3) Find the last revision in which the problematic file was changed, and note the highest revision number: git svn log dc-smtpd/lib/Qpsmtpd/Address.pm.t
4) Reset back to this revision: git svn reset -r (rev #) -p
5) Carry on: git svn fetch
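Picking the highest revision number out of the log by eye is easy to get wrong on a long history, so here is a small sketch that does it for you. It assumes git svn log prints svn-style header lines of the form "r1234 | author | date | N lines"; highest_rev is a made-up helper name.

```shell
# Hypothetical helper: read svn-style log output on stdin and print the
# highest revision number found in the "rNNN | author | date | ..." headers.
highest_rev() {
    awk -F' [|] ' '
        /^r[0-9]+ [|] / {
            rev = substr($1, 2) + 0      # strip the leading "r"
            if (rev > max) max = rev
        }
        END { if (max) print max }
    '
}

# Possible use with the steps above (not executed here):
#   rev=$(git svn log dc-smtpd/lib/Qpsmtpd/Address.pm.t | highest_rev)
#   git svn reset -r "$rev" -p
#   git svn fetch
```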
Good luck.
I know this is old, but maybe it will be helpful for future reference, as none of the search results on this are helpful.
I've hit a similar issue on our huge repository, which takes days to clone, and unfortunately at one point I had to restart my machine. I am currently working out how to resolve the problem, so please keep in mind this is more a suggestion than a tested solution.
I think you need to try creating a branch and checking out the commits you currently have from the previous fetch:
git checkout -b master git-svn
After that is done, you should have a working tree up to that commit. Further fetches will probably fail due to object mismatch, but at that point it should at least be possible to use "git svn reset" to revert faulty svn fetches (see the related answer linked in the question). If that's true, find the offending commit, reset to before it, and then continue fetching.
You might want to rebase and revert to the state before that broken commit on your master branch, or convert back to a bare repository, if that's what you're after (in my case it is).
Hope this works. I'll post an update when my checkout is done (will take at least a few hours... sigh).
Edit: That seemed to work. I successfully discarded some git-svn commits and am able to re-fetch them again. :)
Edit2: Make sure to reset until you don't get any object mismatch warnings on git svn fetch (otherwise you will run into the same issue soon).
Cheers,
Henryk
See also: Git svn rebase : checksum mismatch
In our case the additional treatment of the files (server-side includes in Apache) caused the checksum problem.
Disabling SSI in Apache's /etc/httpd.conf file for the period of migration by commenting out the
AddType text/html .shtml
AddOutputFilter INCLUDES .shtml
directives solved the problem. It was caused by the front-end Apache server interpreting .shtml files, which produced new content (and thus a new hash) that differed from the original file.
That means some files in the repository got corrupted. This can have various causes, such as software bugs, bit rot on drives, etc. I was recently migrating a very old ~10GB svn repository to git, so some corruption was expected.
To fix the corruption, you basically need to dump the entire repository and import it while filtering the errors out. Note that our goal is to complete the import process no matter why or how the repository got corrupted. You cannot simply fix the corruption without having a backup and diffing through the revision files.
The first, basic one-off command you could use is:
svnadmin create repo2
svnadmin dump repo | sed '/^Text-content-md5/d' | svnadmin load repo2
This strips the stored MD5 checksums from the dump stream, so the new repo will have freshly computed checksums.
If you encounter more errors during the dump and load (which is to be expected), try an incremental approach so you can continue from the point where you left off. The command below dumps revisions 101 through 150 (inclusive).
svnadmin dump --incremental -r101:150 repo | sed '/^Text-content-md5/d' | svnadmin load repo2
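If you end up repeating that command for many chunks, a small loop saves typing. The sketch below only generates the revision ranges; rev_ranges is a made-up helper name, and the chunk size of 50 simply mirrors the example above.

```shell
# Hypothetical helper: print "first:last" revision ranges covering
# $1..$2 (inclusive) in chunks of $3 revisions.
rev_ranges() {
    start=$1
    end=$2
    step=$3
    while [ "$start" -le "$end" ]; do
        stop=$((start + step - 1))
        if [ "$stop" -gt "$end" ]; then
            stop=$end
        fi
        echo "$start:$stop"
        start=$((stop + 1))
    done
}

# Possible use (not executed here); stops at the first chunk that fails
# so you can inspect it and resume from that range:
#   rev_ranges 101 1000 50 | while read -r range; do
#       svnadmin dump --incremental -r"$range" repo \
#           | sed '/^Text-content-md5/d' \
#           | svnadmin load repo2 || { echo "failed at $range"; break; }
#   done
```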
Some common errors and solutions:
'Premature end of content data in dumpstream': This means the Content-length of some file does not match the repository version, so some data in that file is lost. We must skip it. Add a | svndumpfilter exclude path/to/file.jar stage, like this:
svnadmin dump --incremental -r101:150 repo | svndumpfilter exclude path/to/file.jar | sed '/^Text-content-md5/d' | svnadmin load repo2
Property errors: Add --bypass-prop-validation to the svnadmin load command.
After populating your second repo, simply run svnserve -d -r repo2 and try git svn fetch again.
Good luck!

how to troubleshoot AOSP make failure src not found error

I have read some of the content on Android source, and my goal is to build an image for the Note III from Sprint. My make fails with the following message at the top: "find: `src': No such file or directory". I started a script session of my make process, so if there is any other information I can provide, please let me know. I tried to download jb-mr1-dev-plus-aosp branch for my device, and tried to build the full_toroplus-eng image. I think I have all the source from this branch. But I know very little about what I have. How do I validate my repo sync went smoothly? I get the following error at the end of my repo sync session:
9/platforms/android-12/arch-arm/usr/include/asm-generic/emergency-restart.h
9/platforms/android-12/arch-arm/usr/include/asm-generic/errno-
Aborting
Syncing work tree: 100% (348/348), done.
prebuilts/ndk/: discarding 93 commits
error: prebuilts/ndk/: platform/prebuilts/ndk checkout 9283a93c7b03896d32a8e88c9322c827d4303652
root@ubuntu:~/WORKING_DIRECTORY#
How do I find out more about this? How do I troubleshoot it?
I think I have all the source from this branch. But I know very little about what I have. How do I validate my repo sync went smoothly?
If Repo terminated with a zero exit code you're good to go.
I get the following error at the end of my repo sync session:
prebuilts/ndk/: discarding 93 commits
error: prebuilts/ndk/: platform/prebuilts/ndk checkout 9283a93c7b03896d32a8e88c9322c827d4303652
If memory serves me right, this means that the prebuilts/ndk directory is dirty (i.e. contains modified files) and thus prevents checking out a new commit. Running git status in that directory will tell you what's up.

Ignore packagist.org on composer install | update

I'm using composer internally for managing internal software dependencies. Our repository server is on our private network and we aren't using any other package from any other repository than ours.
Every time you run
composer.phar [install | update]
it checks the packagist.org repositories after checking our own repository. Besides being unnecessary, this takes longer when packagist is slow (or even down) or our internet connection is having a bad day.
Is there any way to tell composer to ignore checking for packagist repositories?
Yes, and it is even documented on https://getcomposer.org/doc/05-repositories.md#disabling-packagist-org
You may try to use this command:
$ composer config repositories.packagist false
You probably want to have a look at Satis: http://getcomposer.org/doc/articles/handling-private-packages-with-satis.md
It will make your life easier if you deal with more than a few local/private packages, because otherwise you'd have to list EVERY repository that might host required code. You can also use Satis to grab a copy of the versions into ZIP files that can be hosted locally as well. See http://www.naderman.de/slippy/src/?file=2012-11-22-You-Thought-Composer-Couldnt-Do-That.html#13 for some hints on how to do it (press the left/right cursor keys to step through the presentation).
For extra bonus points, you'd add packagist.org as a Composer repository to Satis, require some needed packages, and set { "require-dependencies": true } to grab their dependencies as well. In your own code, you'd only set your Satis repository and disable Packagist.

Is git svn dcommit atomic?

In my company we have a subversion server and everyone is using subversion on their machines.
However I'd like to use git, committing changes locally and then "push" them when I'm ready.
However, I can't understand what happens in the following situation.
Let's say that I made 3 git commits locally and now I'm ready to "push" everything on the subversion server. If I understand correctly, git svn dcommit should basically make 3 commits sequentially on the server, right? But what happens if in the meantime (let's say between the second and the third commit) another colleague of mine issues a commit?
The scenarios I can think of are:
1) git kind of "locks" (is that even possible?) the subversion server during commits, so that my commits are done atomically and my colleague's one is done after mine
2) The commit history on the server becomes mine1-mine2-other-mine3 (even if 'other' should fail since my colleague doesn't have an updated working copy at that point).
I think it's #2, but perhaps the committing speed is so high that this seldom becomes an issue. So which one is it, #1 or #2?
No, locks are not supported in Git; it's not the Git way (the Git way is branching and merging).
With git-svn you'll get the mine1-mine2-other-mine3 history. If you need atomicity, have a look at the SubGit project (it is installed on the SVN server and creates a pure Git interface for the SVN repository).
There was a similar question recently that might be interesting for you.
If you are lucky, then number 2, but most of the time you aren't that lucky. In my experience, when I dcommit a lot of commits and someone else commits while that is running, usually 2 things happen:
It stops with dcommitting your other changes.
You lose the commits not-yet dcommitted.
Number 2 is really really annoying. The main problem is that you need to be totally up-to date to use git svn dcommit. This is because git-svn doesn't let the server merge revisions on the fly. (Because it would require both committers to have a working tree with both changes).
The only way to solve this is to follow these steps, which I found here:
1) Open .git/logs/HEAD
2) Look for your most recent commit (note that these entries are sorted by "unix time", although you can also find the latest one by reading the shortlog there)
3) Confirm that the commit you found is the right one: git show <hash from log>
4) git reset --hard <hash from log>
5) git svn rebase
6) git svn dcommit
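Finding the right entry in .git/logs/HEAD by hand is fiddly, so here is a rough sketch that pulls out the hash of the most recent entry recorded as a commit. It assumes the usual reflog line layout (old hash, new hash, identity, timestamp, then a tab and the message); last_commit_hash is a made-up helper name.

```shell
# Hypothetical helper: print the new hash of the last line in reflog file
# $1 whose message starts with "commit" (e.g. "commit: ..." entries).
# Assumed line layout: "<old> <new> Name <email> <time> <tz>\t<message>"
last_commit_hash() {
    awk -F'\t' '
        $2 ~ /^commit/ { split($1, f, " "); hash = f[2] }
        END { if (hash != "") print hash }
    ' "$1"
}

# Possible use with the procedure above (not executed here):
#   hash=$(last_commit_hash .git/logs/HEAD)
#   git show "$hash"            # confirm it is the right commit first
#   git reset --hard "$hash"
#   git svn rebase
#   git svn dcommit
```

The same entries can also be browsed with git reflog, which is easier to read; either way, check the commit with git show before running the hard reset.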
Following this procedure allows you to pick up from where it failed. I hope they fix this soon, but they said this isn't a priority for them yet.
Of course, if you commit in small groups and have a fast connection to the server, it shouldn't happen that often. (I only hit it 2-3 times while actively working and committing every day for 6 months.)