Saturday, 16 September 2017

Migrating (and Open-Sourcing) an Historical Codebase: SVN-to-Git

I have an SVN repo on my local machine that I have been shoving stuff into since before I knew how to use revision control systems properly (I still don't). Each of the directories in my repo represents a separate project. Some of my directories have a proper trunk/branches/tags structure, but most don't. I want to migrate each of my SVN repo's directories into its own new git repo, retaining all historical commits.

[My dodgy SVN repo]

The best tool I have found to do this so far is svn2git:

This tool is a Ruby gem that essentially wraps git-svn, the native git feature for managing svn-to-git migration. If you use svn2git's "--verbose" flag, you can see what commands it is issuing to the wrapped tool.

Migrating a dodgy SVN repo to multiple git repos

The reason I decided to blog about my experience is that I had a bit of trouble with the documentation and hit a bug in the svn2git tool. Mostly the tool is great, but for my particular purpose I needed the "--rootistrunk" flag, which, as it turns out, doesn't work. Instead (as a Stack Overflow issue reports), you can work around this by using the "--trunk", "--nobranches" and "--notags" flags.

[Excerpt from StackOverflow]

So, with reference to the images I have provided (above) of my haphazard SVN repo/directory structure, the following command (which I used Bash on Ubuntu on Windows to run) worked fine in the end (although it certainly took me a bit of messing about to get to this point):

$ svn2git file:///mnt/d/Work/SVN/SimpleList.V2 --verbose --trunk / --nobranches --notags --no-minimize-url

[Git log following successful application of svn2git]
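
Since each directory in my SVN repo is a separate project, the same command can be repeated per directory. The following is only a sketch: it prints the commands rather than running them, "AnotherProject" is a made-up directory name, and it assumes svn2git is run from inside an empty directory that then becomes the new git repo.

```shell
#!/bin/sh
# Dry run: build and print one svn2git invocation per project directory.
# SVN_ROOT matches the path used above; AnotherProject is hypothetical.
SVN_ROOT="file:///mnt/d/Work/SVN"
plan=""
for project in SimpleList.V2 AnotherProject; do
  cmd="svn2git $SVN_ROOT/$project --trunk / --nobranches --notags --no-minimize-url"
  line="mkdir -p $project && (cd $project && $cmd)"
  plan="$plan$line
"
done
printf '%s' "$plan"
```

Once the printed commands look right, they can be pasted into the shell one at a time, which makes it easier to spot the project that goes wrong.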

Managing sensitive data

The following article provides some information on how to remove checkins of historical sensitive data (such as passwords) from your (new or old) git repo -

This is especially useful if you want to open-source an historical codebase. I used the "BFG Repo-Cleaner" tool, which is written in Scala and run using Java. Exceptionally useful and highly effective. After I had downloaded the tool, I used the following command on my new repo:

$ java -jar ../../tools/BFG/bfg.jar --replace-text sensitive.txt

The following image shows what the historical commits of a file containing sensitive data will look like on GitHub once you have run BFG across it.

[After using BFG tool]

The native git approach I looked at for this purpose works at the file level (deleting files), whereas BFG enables you to search/replace specific text in files across the entire history of your git repo, which is fantastic.
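
For reference, here is a sketch of what a sensitive.txt file for "--replace-text" might contain, based on my reading of the BFG documentation: one expression per line, with matches replaced by ***REMOVED*** unless a replacement is given after "==>". The passwords and the regex below are made-up examples, not values from my repo.

```shell
# Write an example replacement file for BFG (all values are made up).
# Line 1: literal text, replaced with the default ***REMOVED***.
# Line 2: literal text with an explicit replacement.
# Line 3: a regex match with an explicit replacement.
cat > sensitive.txt <<'EOF'
hunter2
OldDbPassword==>examplePass
regex:password=\w+==>password=
EOF
cat sensitive.txt
```

The BFG documentation also suggests following the run with "git reflog expire --expire=now --all && git gc --prune=now --aggressive", so that the old objects are actually dropped from the repo.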

Mapping SVN authors to git authors

I think you can avoid having the Author come up as "bernard@87bffff6-b7f0-bf49-a188-06524d5e88c0" (as shown in the example above) by specifying a mapping from your SVN authors to your git authors. You can extract that information using an approach described in a(nother) helpful blog post, by using this relatively gnarly command:

$ svn log -q | awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2">"}' | sort -u > authors-transform.txt
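
To see what that command produces without needing an SVN working copy, here is a small sketch that runs the same awk transform over some made-up "svn log -q" output (the revisions and dates are invented for illustration).

```shell
#!/bin/sh
# Made-up sample of what `svn log -q` prints: separator lines plus
# one "rN | author | date" line per revision.
sample_log='------------------------------------------------------------------------
r2 | bernard | 2017-09-16 10:00:00 +1200 (Sat, 16 Sep 2017)
------------------------------------------------------------------------
r1 | bernard | 2017-09-15 09:00:00 +1200 (Fri, 15 Sep 2017)
------------------------------------------------------------------------'

# Same pipeline as above: take the author field, trim the padding,
# emit "author = author <author>", and de-duplicate.
printf '%s\n' "$sample_log" \
  | awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2">"}' \
  | sort -u > authors-transform.txt

cat authors-transform.txt
# prints: bernard = bernard <bernard>
```

The right-hand side of each line is then edited by hand to the name and email you want in git, for example "bernard = Bernard Example <bernard@example.com>" (a made-up identity).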

Once you have this file, you can use the svn2git "--authors" option to specify the file that holds your mappings (that would be "authors-transform.txt" if you used the above command to get the information). Clearly I forgot to apply this step before kicking off the migration for my SimpleList.V2 codebase. Bugger. If, like me, you forget to do this, never mind: you can still modify the author of historical commits afterwards (if you can be bothered).

Get the new git repo on to GitHub

Anyway, the next step is to get my new git repo hosted on GitHub. The best advice I can offer here is to pick up section (4) of Troy Hunt's useful and succinct blog post on the same subject (his case is different from mine until this point in that he has a properly formed SVN repo):

Essentially once you have your git repo, complete with history brought across from SVN, you do the following to get your code onto GitHub:

  1. Create a new GitHub repo to push your code into.
  2. Add your GitHub repo as your new (migrated) repo's remote (GitHub provides good documentation for managing this process).
  3. Push to GitHub.
Note that if you have elected to add a ReadMe and/or a .gitignore file on GitHub, that will need to be pulled and merged with your local repo before you can push it all to GitHub.
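
The steps above can be sketched end to end as follows. To keep it runnable without a GitHub account, a local bare repo stands in for the GitHub repo; on GitHub you would use the repo URL GitHub gives you in place of "$work/remote.git". The branch name "master", the file names and the demo identity are all assumptions for illustration, and "--allow-unrelated-histories" needs a reasonably recent git.

```shell
#!/bin/sh
set -e

# Throwaway identity and workspace for the demo (hypothetical values).
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"
work=$(mktemp -d)

# Step 1 stand-in: a bare repo plays the part of the new GitHub repo,
# seeded with a README as if that option had been chosen on GitHub.
git init -q --bare "$work/remote.git"
git init -q "$work/seed"
cd "$work/seed"
git symbolic-ref HEAD refs/heads/master
echo "# SimpleList.V2" > README.md
git add README.md
git commit -q -m "Initial commit"
git push -q "$work/remote.git" master

# Stand-in for the repo produced by svn2git.
git init -q "$work/migrated"
cd "$work/migrated"
git symbolic-ref HEAD refs/heads/master
echo "migrated code" > list.txt
git add list.txt
git commit -q -m "History brought across from SVN"

# Step 2: add the remote to the migrated repo.
git remote add origin "$work/remote.git"

# Pull and merge the README first; the two histories share no commits,
# so git has to be told that merging them is intentional.
git pull -q --no-rebase --no-edit --allow-unrelated-histories origin master

# Step 3: push everything, and track the remote branch from now on.
git push -q -u origin master

git log --oneline
```

If you created the GitHub repo empty (no ReadMe, no .gitignore), the pull step can be skipped and the push works straight away.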

You're done!

I've retrospectively changed this blog post to make it into what are hopefully independently applicable sections. I've built this procedure up over a little while; it's essentially a piecemeal template for open-sourcing historical codebases on to GitHub (or any public git repo) using a few well-established tools. Have fun!

Tuesday, 5 September 2017

Should architects write code?

Should architects write code? Much conventional wisdom I'm aware of would suggest that architecture is different from programming, but that architects should definitely write code. For example:

Ward Cunningham's (an original signatory of the Agile Manifesto) wiki: "Architects Dont Code" (as-an Anti Pattern) -

Simon Brown's blog: "Most software developers are not architects" -

I did Toastmasters with an interesting character a few years ago. This person had recently left a very successful career in sports management in the USA; they decided they wanted a change in direction. They had completed an MBA with a reputable university and picked up an Enterprise Architect role (responsible for IT integration) with a large public-sector organisation. This person - who I really liked - had no background whatsoever in programming (or IT for that matter). My friend is now with a global IT consultancy, is called a "Business Architect" (as opposed to an EA) and has developed a successful international career.

[For lack of a more interesting picture - here is my skateboard]

Is the above an example of an architect who doesn't/didn't code? Perhaps. I think that the person in my example at least is better suited to (and probably more comfortable with) Business Architecture, in any case.

Wednesday, 17 May 2017

The art of the professional apology

In my line of business it's often assumed that it's unwise to apologise. The same probably goes for most businesses. To apologise is to admit liability for at least part of a mistake that has been made. As a provider of a professional service, to apologise is to reveal a vulnerability in the expertise that your customers have come to you for.

Yet a sincere, unlaboured apology can be an important part of the ongoing development of a business relationship. It can also test a relationship. Is the receiving party going to take advantage of the situation and press for discount, raising a volunteered admission of mistake as justification? And in that case, is this the type of customer you really want to be dealing with (if you have a choice)?

[Topless, laboured, insincere apology -]

A sincere apology usually takes courage. It can in fact be viewed as a sign of good judgement. An apology is often an acknowledgement that you wish to surface an issue for discussion and seek to improve performance as a key outcome - and can signal that you wish to move on quickly and efficiently. Isn't that a desirable and valuable trait in a professional?

On the other hand, an unscrupulous apology can simply be a way to get gullible/naive people off your back when you are in the process of trying to swindle them. For this reason, careful attention to any apology volunteered in a business situation is prudent.

As a provider of a specialised service, it's of course also important for a customer to have confidence in your ability to deliver a quality outcome. In that regard, an apology shouldn't be volunteered frivolously. It takes practice to judge when an apology should be offered, certainly.

[Cheesy apology -]

The art of the professional apology is not easy to master. In my opinion though, steering clear of it altogether can stunt the potential for development of deep mutual understanding - and productivity - in a business relationship. Like any powerful tool, the apology should be handled with care. With practice, and used judiciously, it can enhance outcomes and enrich success.
