git svn clone without cloning tags - git-svn

I would like to skip the import of all tags. Is this the correct syntax?
git svn clone "http://svn/svn/IT_Udvikling" git-DataLicense --revision 37000:HEAD --trunk="/FID/DataLicense/trunk" --branches="/FID/DataLicense/branches" --no-minimize-url --authors-file=../authors-transform.txt
I am trying to skip the import of tags because the git svn clone process takes forever. It has been running for 3 days now.

The best way to speed up git-svn for a one-time migration is to not use git-svn at all. Skipping the tags will probably not save you much time, as there are almost no commits that are unique to the tags in a typical SVN repository, usually just one per tag. But yes, if you don't specify -t or --tags directly or indirectly (e.g. by using -s or --stdlayout), then no tags should be fetched. But as I said, it will not speed up the process much.
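As a rough illustration of the -t/--tags point, here is a small shell sketch that takes the clone command from the question and checks that none of the tag-importing options are present before running it. The has_tags_option helper and the DRY_RUN switch are my own names, not anything git-svn provides, and by default the script only prints the command.

```shell
# Sketch: the clone from the question, guarded against tag-importing options.
# has_tags_option and DRY_RUN are illustrative names, not part of git-svn.

# Succeeds (returns 0) if any argument would make git-svn map tags.
has_tags_option() {
  for arg in "$@"; do
    case "$arg" in
      -t|--tags|--tags=*|-s|--stdlayout) return 0 ;;
    esac
  done
  return 1
}

# Same options as in the question: trunk and branches only, no tags.
set -- svn clone "http://svn/svn/IT_Udvikling" git-DataLicense \
  --revision 37000:HEAD \
  --trunk="/FID/DataLicense/trunk" \
  --branches="/FID/DataLicense/branches" \
  --no-minimize-url \
  --authors-file=../authors-transform.txt

if has_tags_option "$@"; then
  echo "refusing to run: tags would be imported" >&2
elif [ "${DRY_RUN:-1}" = 1 ]; then
  echo "git $*"   # default: only print the command
else
  git "$@"        # DRY_RUN=0 runs the real clone
fi
```

Run it once as is to see the command, then with DRY_RUN=0 to actually start the clone.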
For a one-time migration of a repository, or of parts of one, git-svn is not the right tool. It is a great tool if you want to use Git as a frontend for an existing SVN server, but for one-time conversions you should not use git-svn, but svn2git, which is much better suited to this use case.
There are plenty of tools called svn2git; probably the best one is the KDE one from https://github.com/svn-all-fast-export/svn2git. I strongly recommend using that svn2git tool. It is the best one I know of, and its rules files make it very flexible.
You will easily be able to configure svn2git's rules file to produce the result you want from your current SVN layout, including any complex histories that might exist.
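To give an idea of what such a rules file might look like for the layout in the question, here is a sketch that writes one from the shell. The repository name DataLicense, the tags path /FID/DataLicense/tags, and the file names are assumptions on my part; check them against the real layout and against the sample rules shipped with svn2git before relying on them.

```shell
# Sketch of a rules file for the KDE svn2git (svn-all-fast-export).
# Assumed for illustration: repository name, tags path, and file names.
rules_file="${TMPDIR:-/tmp}/datalicense.rules"

cat > "$rules_file" <<'EOF'
create repository DataLicense
end repository

match /FID/DataLicense/trunk/
  repository DataLicense
  branch master
end match

match /FID/DataLicense/branches/([^/]+)/
  repository DataLicense
  branch \1
end match

# Drop this block to skip tags entirely.
match /FID/DataLicense/tags/([^/]+)/
  repository DataLicense
  branch refs/tags/\1
end match

# Ignore everything else in the SVN repository.
match /
end match
EOF

echo "wrote $rules_file"

# Then, with hypothetical paths for the authors map and the SVN repository:
#   svn-all-fast-export --identity-map authors.txt --rules "$rules_file" /path/to/svn/repo
```

As far as I know, the tool reads the SVN repository from the local filesystem rather than over HTTP, so you would typically run it on the server or against a local copy of the repository.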
If you are not 100% sure about the history of your repository, svneverever from http://blog.hartwork.org/?p=763 is a great tool to investigate the history of an SVN repository when migrating it to Git.
Even though git-svn is easier to start with, here are some further reasons why using the KDE svn2git instead of git-svn is superior, besides its flexibility:
the history is rebuilt much better and cleaner by svn2git (if the correct one is used), this is especially the case for more complex histories with branches and merges and so on
the tags are real tags and not branches in Git
with git-svn the tags contain an extra empty commit, which also makes them not part of the branches, so a normal fetch will not get them until you give --tags to the command, as by default only tags pointing to fetched branches are fetched. With the proper svn2git, tags are where they belong
if you changed the layout in SVN, you can easily configure this with svn2git; with git-svn you will eventually lose history
with svn2git you can also split one SVN repository into multiple Git repositories easily
or combine multiple SVN repositories in the same SVN root into one Git repository easily
the conversion is a gazillion times faster with the correct svn2git than with git-svn
You see, there are many reasons why git-svn is worse and the KDE svn2git is superior. :-)

Related

I'd like to move over a branch from an svn location and use it as the master in the github location

I'd like to move over a branch from an svn location and use it as the master in the github location. Can anyone tell how to do this?
You can follow this process by Tiago Rodrigues (trodrigues)
If you want to clone an svn repository with git-svn but don't want it to fetch all the existing branches, here's what you should do.
Clone with git-svn using the -T parameter to define your trunk path inside the svnrepo, at the same time instructing it to clone only the trunk:
git svn clone -T trunk http://example.com/PROJECT
If instead of cloning trunk you just want to clone a certain branch, do the same thing but change the path given to -T:
git svn clone -T branches/somefeature http://example.com/PROJECT
This way, git svn will think that branch is the trunk and generate the following config on your .git/config file:
[svn-remote "svn"]
url = https://example.com/
fetch = PROJECT/branches/somefeature:refs/remotes/trunk
If at any point after this you want to checkout additional branches, you first need to add it on your configuration file:
[svn-remote "svn"]
url = https://example.com/
fetch = PROJECT/branches/somefeature:refs/remotes/trunk
branches = PROJECT/branches/{anotherfeature}:refs/remotes/*
The branches config always needs a glob. In this case, we're specifying just one branch, but we could specify more by comma-separating them, or all of them with a *.
After this, issue the following command:
git svn fetch
Sit back. It's gonna take a while, and on large repos it might even fail. Sometimes just hitting CTRL+C and starting over solves it. Some dark magic here.
After this, if you issue a git branch -r you can see your remote branch definitions:
git branch -r
anotherfeature
From there you can define a master branch, and push it to a GitHub repo:
git checkout -b master anotherfeature
git remote add origin https://github.com/user/arepo.git
git push -u origin master
If you insist on using git-svn, VonC already provided a good answer.
But for a one-time migration of a repository, or of parts of one, git-svn is not the right tool. It is a great tool if you want to use Git as a frontend for an existing SVN server, but for one-time conversions you should not use git-svn, but svn2git, which is much better suited to this use case.
There are plenty of tools called svn2git; probably the best one is the KDE one from https://github.com/svn-all-fast-export/svn2git. I strongly recommend using that svn2git tool. It is the best one I know of, and its rules files make it very flexible.
You will easily be able to configure svn2git's rules file to produce the result you want from your current SVN layout, including any complex histories like yours that might exist, and including producing several Git repos out of one SVN repo or cleanly combining different SVN repos into one Git repo in one run if you like.
If you are not 100% sure about the history of your repository, svneverever from http://blog.hartwork.org/?p=763 is a great tool to investigate the history of an SVN repository when migrating it to Git.
Even though git-svn or the nirvdrum svn2git is easier to start with, here are some further reasons why using the KDE svn2git instead of git-svn is superior, besides its flexibility:
the history is rebuilt much better and cleaner by svn2git (if the correct one is used), this is especially the case for more complex histories with branches and merges and so on
the tags are real tags and not branches in Git
with git-svn the tags contain an extra empty commit, which also makes them not part of the branches, so a normal fetch will not get them until you give --tags to the command, as by default only tags pointing to fetched branches are fetched. With the proper svn2git, tags are where they belong
if you changed the layout in SVN, you can easily configure this with svn2git; with git-svn you will eventually lose history
with svn2git you can also split one SVN repository into multiple Git repositories easily
or combine multiple SVN repositories in the same SVN root into one Git repository easily
the conversion is a gazillion times faster with the correct svn2git than with git-svn
You see, there are many reasons why git-svn is worse and the KDE svn2git is superior. :-)

Getting error while migrating code from svn to git repository: Malformed network data: The XML response contains invalid XML: svn2git

I ran the command git svn clone "SVN URL".
It works fine for the first 4568 commits, but after that commit it fails with the error stated in the title.
This seems to be because the default log-window-size is too small.
When you get the error, try running this from the new Git repo:
git svn fetch --log-window-size=4000
You can experiment with the actual number, but 4000 was the magic number for me.
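If you'd rather not find the magic number by hand, a loop along these lines can retry the fetch with a few window sizes until one gets through. This is only a sketch: the function name and the FETCH_CMD override are made up for illustration, and it assumes git svn fetch resumes from the last revision it imported, so a retry continues where the failed run stopped.

```shell
# Sketch: retry `git svn fetch` with several --log-window-size values.
# FETCH_CMD is an illustrative override so the loop can be tried without a repo.
fetch_with_window_sizes() {
  for size in "$@"; do
    echo "trying --log-window-size=$size"
    # Unquoted on purpose so "git svn fetch" splits into command and arguments.
    if ${FETCH_CMD:-git svn fetch} --log-window-size="$size"; then
      return 0
    fi
  done
  return 1
}

# Usage, from inside the partially fetched clone:
#   fetch_with_window_sizes 4000 1000 100
```

The function returns non-zero if every size fails, so it can be chained with other steps of a migration script.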
git-svn is not the right tool for one-time conversions of repositories. It is a great tool if you want to use Git as a frontend for an existing SVN server, but for one-time conversions you should not use git-svn, but svn2git, which is much better suited to this use case.
There are plenty of tools called svn2git; probably the best one is the KDE one from https://github.com/svn-all-fast-export/svn2git. I strongly recommend using that svn2git tool. It is the best one I know of, and its rules files make it very flexible.
If you are not 100% sure about the history of your repository, svneverever from http://blog.hartwork.org/?p=763 is a great tool to investigate the history of an SVN repository when migrating it to Git.

Use git-svn with multiple svn repos and branches

I have an svn branch that I had been working on and decided to start using git-svn to work locally. Now I have two problems. I want to move my work into another svn repository (on the same host), but I'd first like to merge the latest work from trunk. How would I do this with git-svn? Also, how would I continue my work in a separate svn repo while continually merging work from the original repo? I also don't want to check out the entire history from the original trunk, because the project is rather huge. I am new to git and to git-svn, though I've taken a crash course in git branching and I feel confident enough to use advanced commands like rebase and cherry-pick. I mainly need to know how to apply these concepts through git-svn. Do the svn repos get set up as a git remote somehow? Are there good resources on the net explaining how it works? Any guidance is much appreciated.
Create your Git repo with git svn init -s <url>.
Run git config --edit and add an svn-remote for each of your Subversion repos. Later you'll pass the -R option to all git-svn commands to select which svn-remote to use.
Tweak svn-remote branch mappings as needed. Keep in mind that the default refs/remotes/* namespace specifies remote branches — not Git remotes. (You'll have just a single git remote named . which I don't recommend pushing/pulling to/from).
You can easily design your remote branches namespace to keep branches from different Subversion repos separated (e.g. refs/remotes/repoA/*, refs/remotes/repoB/* etc).
git svn fetch. This has options to scan history only partially, e.g. starting from a specific revision. Please read the manpage for instructions on how to do this.
You can also ignore specific paths and/or branches here.
Work with Git as usual, trying to keep your commits as linear as possible. Rebase often. Merge commits are fine (git-svn will even set svn:mergeinfo property), but holy cow be careful (and read the manpage for caveats). Understand that Git commits with git-svn-id tags are immutable, and push -f won't save you. For example, it's forbidden to amend or rebase already dcommit'ed changes.
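The svn-remote setup described above could be sketched like this. The remote names repoA and repoB, the URLs, and the revision number are placeholders of mine; the sketch uses a scratch repository and only writes config, so the actual git svn fetch calls are left as comments.

```shell
# Sketch: two svn-remotes in one Git repository, each with its own ref namespace.
# Remote names and URLs are placeholders; nothing is fetched here.
repo=$(mktemp -d)
git init -q "$repo"

git -C "$repo" config svn-remote.repoA.url "http://svn.example.invalid/repoA"
git -C "$repo" config svn-remote.repoA.fetch "trunk:refs/remotes/repoA/trunk"
git -C "$repo" config svn-remote.repoA.branches "branches/*:refs/remotes/repoA/*"

git -C "$repo" config svn-remote.repoB.url "http://svn.example.invalid/repoB"
git -C "$repo" config svn-remote.repoB.fetch "trunk:refs/remotes/repoB/trunk"
git -C "$repo" config svn-remote.repoB.branches "branches/*:refs/remotes/repoB/*"

# Print the svn-remote sections that were just written.
git -C "$repo" config --get-regexp '^svn-remote\.'

# Against real servers, -R selects the svn-remote and -r limits how much
# history is scanned (the revision number here is made up):
#   git svn fetch -R repoA -r 12000:HEAD
#   git svn fetch -R repoB
```

Keeping each repo under its own refs/remotes/<name>/ prefix is what stops branches with the same name in both repos from colliding.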
Are there good resources on the net explaining how it works?
By far the best resource is the manpage. The next after it is git-svn source.

Working around unneeded subdirs with git-svn in order to save space

I've started using git-svn for an SVN-based project, so that I can make local commits.
However, the SVN repository contains a lot of directories that I don't need to work with. When I solely used SVN, I was able to partly check-out stuff with:
svn co <repos-url> --depth empty
and then update the needed directories:
svn up <repos-dir>/<subdir>
As far as I've understood, partly checking out a project isn't an option with Git, so I'm looking for alternative way of saving some space. Any suggestions?
Edit: what I am thinking of myself is something along the lines of creating a branch that only contains the files I need. I'd then want to be able to push the changes to these files without pushing any removal of the files I don't need. But I am not deeply enough into the way Git works to figure out whether this is possible.
Are the extra directories really that big? One advantage of Git is that you do most of your work from your local hard drive (you commit to your own branch, not to the server), so it's fast even when there are many files.

git svn as a crutch for svn merging woes

Consider an svn repository that has branches that are not necessarily located in the usual trunk/tags/branches layout. I want to persuade git-svn to take two of those branches on board, plus whatever else it needs, so that I can use git as a merge tool to avoid various levels of heck that have been plaguing us with svn merging. If the branches are all in one place, there's --branches, but is there a way if they are not?
Do your git svn init ROOT_URL, then go in and edit your .git/config to add additional fetch lines:
[svn-remote "svn"]
url = https://svn.mcs.anl.gov/repos/mpi
fetch = mpich2/trunk:refs/remotes/trunk
fetch = mpich2/branches/dev/threads:refs/remotes/threads
fetch = mpich2/branches/dev/knem:refs/remotes/knem
fetch = mpich2/branches/release/MPICH2_1_0_8:refs/remotes/mpich2-1.0.8
Then run git svn fetch and you should get all of the branches explicitly listed in your config file.
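If you prefer not to hand-edit .git/config, the same fetch lines can be added with git config. A sketch, run against a scratch repository so it does not touch a real clone; in practice you would run the config commands inside the repository created by git svn init.

```shell
# Sketch: add the extra fetch lines with `git config --add` in a scratch repo.
repo=$(mktemp -d)
git init -q "$repo"

git -C "$repo" config svn-remote.svn.url "https://svn.mcs.anl.gov/repos/mpi"

# --add appends another value for the key instead of overwriting the existing one.
git -C "$repo" config --add svn-remote.svn.fetch \
  "mpich2/trunk:refs/remotes/trunk"
git -C "$repo" config --add svn-remote.svn.fetch \
  "mpich2/branches/dev/threads:refs/remotes/threads"
git -C "$repo" config --add svn-remote.svn.fetch \
  "mpich2/branches/dev/knem:refs/remotes/knem"
git -C "$repo" config --add svn-remote.svn.fetch \
  "mpich2/branches/release/MPICH2_1_0_8:refs/remotes/mpich2-1.0.8"

# List every fetch mapping, one per line.
git -C "$repo" config --get-all svn-remote.svn.fetch
```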
Be aware, however, that you may not want to actually pursue merging externally via git-svn. Git won't maintain the svn:mergeinfo properties for you, which will make going back to a native SVN-based merge workflow nearly impossible. You might also confuse git-svn quite a bit by cherry-picking or merging code that has already been committed to the actual SVN repo, since it searches for git-svn-id: "breadcrumbs" in the commit messages in order to figure out which SVN path should be used for git svn dcommit. See the CAVEATS section of the git-svn man page for additional info about this.
FWIW, I've also posted about this in longer form on my own site.