CVS to git

The most important features of git that I needed:

  • Support for file renames
  • Local history
  • Fast update of local repository
  • Awesome merge and rebase features

Our CVS repository history contained more than 20,000 commits made over more than 7 years.
When converting the repository I wanted to preserve all of the history, because I check it quite often and really need it.
To convert our CVS repository into git I tried cvs2git first.
But that utility created a lot of "fix-up" commits, which got in the way of preserving the history as exactly as possible.
So I started experimenting with git cvsimport.
It uses the output of the cvsps utility (the CVS history split into patchsets) to create
git commits. git cvsimport also accepts a regexp that matches merge commit messages. That was very helpful for preserving the exact history.
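
To illustrate the merge-message idea, here is a minimal sketch in Python (git cvsimport itself is a Perl script). The message convention "Merged from <branch>", the pattern, and the name merge_source are assumptions made up for this example; they are not the regexp used in the conversion described here. The point is only that the first capture group names the source branch of the merge.

```python
import re

# Hypothetical convention: merge commits are assumed to say
# "Merged from BRANCH" (or "Merge from BRANCH") somewhere in the log.
# Group 1 captures the name of the branch that was merged in.
MERGE_RE = re.compile(r"\bmerged? from ([-\w]+)", re.IGNORECASE)


def merge_source(log_message):
    """Return the source branch named in a merge message, or None."""
    match = MERGE_RE.search(log_message)
    return match.group(1) if match else None


if __name__ == "__main__":
    for message in ("Merged from release-1_2 into HEAD", "Fix typo in README"):
        print(repr(message), "->", merge_source(message))
```

A commit whose log matches such a pattern can then be recorded as a merge with two parents instead of an ordinary commit, which keeps the branch structure visible in the converted history.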

I did the following:

  1. Created a local copy of our CVS repository
  2. Fixed the issues I found in it (redundant 1.1.1.1 revisions, redundant branches and tags, and some others; there was a lot of junk)
  3. Ran cvsps against the cleaned repository and obtained an initial list of patchsets
  4. Fixed the list of patchsets to be as close as possible to the actual sequence of commits. That was required because of the "CVS surgery"
    we had done in the past.
  5. Exported all revisions of all files from the repository. We used cvsnt, and exporting all the files with cvsnt itself was easier
    than struggling with <CR><LF> line endings and trying to handle them the same way cvsnt does.
  6. Modified the git-cvsimport Perl script to use the files already exported from the CVS repository
  7. Converted the CVS repository into git and checked that all tags and branch heads were the same as in the original CVS repository
  8. Wrote a couple of bat files to easily retrieve new revisions of files from the remote CVS repository
  9. Wrote a bash script to append the newly retrieved revisions to the git repository
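
Steps 3 and 4 amount to treating the cvsps output as data that can be inspected and corrected before the import. The following is a rough Python sketch of reading that output into patchsets. The sample text reflects my understanding of the cvsps layout (a dashed separator, a "PatchSet N" line, header fields, "Log:", then "Members:") and its contents are invented; parse_patchsets and the dictionary keys are hypothetical names, not part of cvsps or git cvsimport.

```python
import re

# A member line is assumed to look like "<tab>path:OLD->NEW", optionally
# followed by "(DEAD)" when the file was removed in that patchset.
MEMBER_RE = re.compile(r"^\s+(.+):([\w.]+)->([\w.]+)(\(DEAD\))?\s*$")

# Invented sample in the layout described above.
SAMPLE = """\
---------------------
PatchSet 1
Date: 2003/05/12 10:15:00
Author: alice
Branch: HEAD
Tag: (none)
Log:
Initial import

Members:
\tsrc/main.c:INITIAL->1.1

---------------------
PatchSet 2
Date: 2003/05/13 09:00:00
Author: bob
Branch: HEAD
Tag: (none)
Log:
Fix build

Members:
\tsrc/main.c:1.1->1.2
\tMakefile:INITIAL->1.1

"""


def parse_patchsets(text):
    """Split cvsps-style output into a list of patchset dictionaries."""
    patchsets = []
    current = None
    state = "header"
    after_separator = False
    for line in text.splitlines():
        stripped = line.strip()
        if after_separator and stripped.startswith("PatchSet "):
            # A new patchset only starts right after a dashed separator.
            current = {"id": int(stripped.split()[1]), "log": [], "members": []}
            patchsets.append(current)
            state = "header"
        elif current is not None:
            if state == "header":
                if stripped == "Log:":
                    state = "log"
                elif ":" in line:
                    key, value = line.split(":", 1)
                    current[key.strip().lower()] = value.strip()
            elif state == "log":
                if stripped == "Members:":
                    state = "members"
                else:
                    current["log"].append(line)
            elif state == "members":
                member = MEMBER_RE.match(line)
                if member:
                    current["members"].append(member.group(1, 2, 3))
        # Remember whether this line consisted only of dashes.
        after_separator = bool(stripped) and set(stripped) == {"-"}
    for patchset in patchsets:
        patchset["log"] = "\n".join(patchset["log"]).strip()
    return patchsets


if __name__ == "__main__":
    for ps in parse_patchsets(SAMPLE):
        print(ps["id"], ps["author"], ps["log"], ps["members"])
```

Once the patchsets are in a structure like this, they can be checked, reordered, or corrected and written back out in the same layout. As far as I know, git cvsimport can read a prepared cvsps output file through its -P option instead of running cvsps itself.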

Some lessons from this conversion:

  • I had never realized how much junk there was in our CVS repository
  • cvsnt doesn't work under WINE
  • git-cvsimport under Cygwin works hundreds of times slower than under Linux.

I ran cvsnt and the bat files that retrieve recent revisions from the remote CVS repository under Windows. cvsps was run under Cygwin
because its speed there was acceptable and I wanted to parse the CVS logs immediately after retrieving them. git-cvsimport was run under
Linux because it is so much faster there.