Saturday, 16 September 2017

Migrating (and Open-Sourcing) an Historical Codebase: SVN-to-Git

I have an SVN repo on my local machine that I have been shoving stuff into since before I knew how to use revision control systems properly (I still don't). Each of the directories in my repo represents a separate project. Some of the directories have a proper trunk/branches/tags structure, but most don't. I want to migrate each of my SVN repo's directories into its own new respective git repo, retaining all historical commits.


[My dodgy SVN repo]

The best tool I have found to do this so far is svn2git: https://github.com/nirvdrum/svn2git

This tool is a Ruby gem that essentially wraps git's native feature for managing SVN-to-git migration - git-svn. If you use svn2git's "--verbose" flag, you can see what commands it is issuing to the wrapped tool.
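
If you don't already have svn2git, installation is the standard Ruby gem affair (this assumes you already have Ruby and git-svn on your machine - the svn2git README covers the prerequisites):

$ sudo gem install svn2git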

Migrating a dodgy SVN repo to multiple git repos


The reason I decided to blog about my experience was that I had a bit of trouble with the documentation and a bug in the svn2git tool. Mostly the tool is great, but for my particular purpose I needed the "--rootistrunk" flag, which, as it turns out, doesn't work. Instead (as the following GitHub issue reports - https://github.com/nirvdrum/svn2git/issues/144), you can work around this by using the "--trunk", "--nobranches" and "--notags" flags.


[Excerpt from StackOverflow]

So, with reference to the images I have provided (above) of my haphazard SVN repo/directory structure, the following command (which I ran using Bash on Ubuntu on Windows) worked fine in the end (although it certainly took a bit of messing about to get to this point):

$ svn2git file:///mnt/d/Work/SVN/SimpleList.V2 --verbose --trunk / --nobranches --notags --no-minimize-url


[Git log following successful application of svn2git]

Managing sensitive data


The following article provides some information on how to remove historical check-ins of sensitive data (such as passwords) from your (new or old) git repo - https://help.github.com/articles/removing-sensitive-data-from-a-repository/

This is especially useful if you want to open-source an historical codebase. I used the "BFG Repo-Cleaner" tool, which is written in Scala and runs on Java. Exceptionally useful and highly effective. After I had downloaded the tool, I ran the following command on my new repo:

$ java -jar ../../tools/BFG/bfg.jar --replace-text sensitive.txt
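
For reference, "sensitive.txt" is just a plain-text file listing the things you want scrubbed. Here's a minimal sketch based on the format described in the BFG documentation (the secrets are obviously made up): each line is treated as a literal match and replaced with ***REMOVED*** by default, you can supply your own replacement after "==>", and a "regex:" prefix switches a line to regular-expression matching.

MySuperSecretPassword123
hunter2==>[REDACTED]
regex:password\s*=\s*\S+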

The following image shows what the historical commits of a file containing sensitive data will look like on GitHub once you have run BFG across the repo.

[After using BFG tool]

Git's native functionality for this purpose only allows you to get down to the file level (delete files), whereas BFG enables you to search/replace specific text in files across the entire history of your git repo, which is fantastic.
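
For comparison, the file-level approach using git's built-in filter-branch (as described in the GitHub article linked above) looks something like this - the file path here is hypothetical:

$ git filter-branch --force --index-filter 'git rm --cached --ignore-unmatch config/passwords.txt' --prune-empty --tag-name-filter cat -- --all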

Mapping SVN authors to git authors


I think you can avoid having the author come up as "bernard@87bffff6-b7f0-bf49-a188-06524d5e88c0" (as shown in the example above) by specifying a mapping from your SVN authors to your git authors. You can extract that information using an approach described in a(nother) helpful blog post - https://john.albin.net/git/convert-subversion-to-git - namely, this relatively gnarly command:

$ svn log -q | awk -F '|' '/^r/ {sub("^ ", "", $2); sub(" $", "", $2); print $2" = "$2" <"$2">"}' | sort -u > authors-transform.txt

Once you have this file, you can use svn2git's "--authors" option to specify the file you've put the mappings in ("authors-transform.txt" if you used the above command). Clearly I forgot to apply this step before kicking off the migration for my SimpleList.V2 codebase. Bugger. If, like me, you forget to do this, never mind - you can use the following approach to modify the author of historical commits (if you can be bothered) - https://help.github.com/articles/changing-author-info/.
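
For what it's worth, the migration command from earlier, with the author mapping applied, would have looked something like this (same repo path as before):

$ svn2git file:///mnt/d/Work/SVN/SimpleList.V2 --verbose --trunk / --nobranches --notags --no-minimize-url --authors authors-transform.txt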

Get the new git repo on to GitHub


Anyway, the next step is to get my new git repo hosted on GitHub. The best advice I can offer here is to pick up at section (4) of Troy Hunt's useful and succinct blog post on the same subject (his case differs from mine up to this point in that he has a properly formed SVN repo): https://www.troyhunt.com/migrating-from-subversion-to-git-with/

Essentially once you have your git repo, complete with history brought across from SVN, you do the following to get your code onto GitHub:

  1. Create a new GitHub repo to push your code into.
  2. Add the GitHub repo as your new (migrated) repo's remote (GitHub provides good documentation for managing this process - https://help.github.com/articles/adding-a-remote/) - see the example commands after this list.
  3. Push to GitHub.
Note that if you have elected to add a ReadMe and/or a .gitignore file on GitHub, that will need to be pulled and merged with your local repo before you can push it all to GitHub.
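
By way of illustration, steps 2 and 3 boil down to something like the following (the repo URL here is hypothetical - substitute your own):

$ git remote add origin https://github.com/<your-username>/<your-new-repo>.git
$ git pull origin master --allow-unrelated-histories   # only needed if you added a ReadMe/.gitignore on GitHub
$ git push -u origin master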

You're done!


I've retrospectively changed this blog post to split it into what are hopefully independently applicable sections. I've built this procedure up over a little while; it's essentially a piecemeal template for open-sourcing historical codebases onto GitHub (or any public git host) using a few well-established tools. Have fun!


Tuesday, 5 September 2017

Should architects write code?

Should architects write code? Much conventional wisdom I'm aware of would suggest that architecture is different from programming, but that architects should definitely write code. For example:

Ward Cunningham's wiki (Ward being an original signatory of the Agile Manifesto): "Architects Dont Code" (as an anti-pattern) - http://wiki.c2.com/?ArchitectsDontCode

Simon Brown's blog: "Most software developers are not architects" - http://www.codingthearchitecture.com/2014/02/21/most_software_developers_are_not_architects.html

I did Toastmasters with an interesting character a few years ago. This person had recently left a very successful career in sports management in the USA; they decided they wanted a change in direction. They had completed an MBA with a reputable university and picked up an Enterprise Architect role (responsible for IT integration) with a large public-sector organisation. This person - who I really liked - had no background whatsoever in programming (or IT for that matter). My friend is now with a global IT consultancy, is called a "Business Architect" (as opposed to an EA) and has developed a successful international career.

Is the above an example of an architect who doesn't/didn't code? Perhaps. I think that the person in my example, at least, is better suited to (and probably more comfortable with) Business Architecture, in any case.


Wednesday, 17 May 2017

The art of the professional apology

In my line of business it's often assumed that it's unwise to apologise. The same probably goes for most businesses. To apologise is to admit liability for at least part of a mistake that has been made. As a provider of a professional service, to apologise is to reveal a vulnerability in the expertise that your customers have come to you for.

Yet a sincere, unlaboured apology can be an important part of the ongoing development of a business relationship. It can also test a relationship. Is the receiving party going to take advantage of the situation and press for discount, raising a volunteered admission of mistake as justification? And in that case, is this the type of customer you really want to be dealing with (if you have a choice)?



[This joker doesn't look particularly sincere - https://medium.com/@laurenholliday_/how-to-write-a-damn-good-apology-8b554513f8eb]

A sincere apology usually takes courage. It can in fact be viewed as a sign of good judgement. An apology is often an acknowledgement that you wish to surface an issue for discussion and seek improved performance as a key outcome - and it can signal that you wish to move on quickly and efficiently. Isn't that a desirable and valuable trait in a professional?

On the other hand, an unscrupulous apology can simply be a way to get gullible/naive people off your back when you are in the process of trying to swindle them. For this reason, careful attention to any apology volunteered in a business situation is prudent.

As a provider of a specialised service, it's of course also important for a customer to have confidence in your ability to deliver a quality outcome. In that regard, an apology shouldn't be volunteered frivolously. It takes practice to judge when an apology should be offered, certainly.



[Learn how to apologise proper - https://www.parentmap.com/article/the-power-of-real-apologies-in-a-fake-apology-world]

The art of the professional apology is not easy to master. In my opinion, to steer clear of using it altogether though can stunt the potential for development of deep mutual understanding - and productivity - in a business relationship. Like any powerful tool, the apology should be handled with care - with practice though, and used judiciously, it can enhance customer outcomes and enrich business success.


Tuesday, 1 November 2016

Microsoft is Going GNUts

Discussion

I cut my teeth as a commercial software developer building line-of-business systems for SMEs with Microsoft Access in the early 2000s. Although I have subsequently enjoyed working with a range of tools and platforms (I wouldn't quite call myself a "polyglot" yet), I generally take a keen interest in how Microsoft's software development surface-area emerges and evolves over time.

The past year or two (since Satya Nadella took the lead) has been fascinating - very exciting, from my geeky perspective. The Microsoft software development toolkit is basically gone/going open - right down to nuts-and-bolts like "Bash on Ubuntu on Windows". This is an about-face in attitude - from implicitly exclusive to explicitly inclusive.

Strategically this makes great sense, since for Microsoft it's really mostly about the cloud land-grab (pun intended) now; in effect Microsoft are saying - "Bring all of your systems and platforms to Azure - all shapes, sizes and flavours are welcome, the more-the-merrier!".



[The cloud services landgrab - http://cloudcomputing.sys-con.com/node/2192464]

There is a wider discussion here - this post more-or-less focuses on what Microsoft's paradigm shift means technically for web and application development though. It was written in January 2016 - it's taken me a while to get around to publishing it - so although I have checked the links and updated some terminology, some of the material may be outdated.

Microsoft going GNUts?

Microsoft are taking us in an interesting direction with the introduction of DNX (.NET Execution Environment - now .NET Core CLI), ASP.NET 5.0 (now ASP.NET Core 1.0) and MVC 6 (now Core MVC). These cross-platform tools are creating opportunities for developers and IT pros on Mac and Linux to explore options in “Microsoft land” and for .NET developers to host applications on non-Microsoft platforms.

The move toward cross-platform capability has been on the cards with Microsoft for a while, but it feels a bit like we're shifting gears recently.

For example - there are the beginnings of a Linux/Mac-friendly development environment in the mix (Visual Studio "Code"). The structure of an MVC 6 (ASP.NET 5.0) project is significantly different from how it was previously - for example, project.json replaces {project_name}.csproj, the JSON format being open and cross-platform ready. And MVC 6 uses DNX as its execution environment, running by default on the Kestrel web server, which is built entirely on OSS.

The following few links provide an introduction to some of this terminology:
Aside from ASP.NET 5.0, we’re also seeing some interesting developments with cross-platform capability for mobile development in VS2015, including the ability to debug native Android applications using the GNU debugger (GDB) from RC1.

What about Mono?

Mono is an open-source clone of the .NET framework that was enabled, and more-or-less foretold, by Microsoft making C# and the CLI (Common Language Infrastructure) ECMA and ISO standards (ECMA-334/ISO-IEC 23270 for C#, and ECMA-335/ISO-IEC 23271 for the CLI), way back in 2003.

Microsoft’s DNX (.NET Execution Environment) is essentially a host for the .NET framework that runs at OS (“native”) level and operates in a way similar to a VM, thus literally providing an execution environment for an instance of the .NET framework. That could be a plain-old Microsoft .NET framework instance, or a more exotic Mono .NET framework instance. DNX is cross-platform capable out-of-the-box though, meaning that you can now run Microsoft’s version of the .NET framework on Linux or Mac. The introduction of the DNX doesn’t make Mono redundant, because Mono is inherently cross-platform and can run directly at the OS level. However, it does herald a sea change in the way that we think about Microsoft applications development.

The following diagram provides a nice visualisation of how everything fits together with the introduction of DNX:


[“Layering” diagram – taken from https://github.com/aspnet/Home/wiki/DNX-structure]

These couple of links may also help explain how all of this fits together:

State-of-play

In summary, DNX is an emerging initiative from Microsoft (now emerged in the form of .NET Core CLI) that provides a cross-platform execution environment that is capable of hosting and enabling applications built on Mono and/or .NET 4.5+, across the Mac, Linux and Windows platforms.
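
To give a feel for it, here's a sketch of the basic workflow using the 1.0-era .NET Core CLI tooling (the same three commands work on Windows, Mac and Linux, although the exact verbs may shift as the tooling matures):

$ dotnet new       # scaffold a minimal Hello World console app in the current directory
$ dotnet restore   # restore the NuGet package dependencies
$ dotnet run       # compile and run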

To date it seems that this emerging ecosystem is still a bit sketchy (at the time of writing, the current version of DNX was 1.0.0-rc1 - DNX has since morphed into the .NET Core CLI, now at release version 1.0.1) - however, it also seems to be coming along in leaps and bounds. There seems to be a good deal of interest from the OSS community and there's lots of support in the form of blogs, articles, etc. Here are a few goodies:

Migration path

The process of migrating an ASP.NET 4 (Web API 2) project to MVC 6 on Code/Linux (Ubuntu) still seems to be riddled with dependency issues, so certainly not plain sailing just yet. But it is early days and the tooling is maturing rapidly. This opens up some interesting questions (and opportunities) for the future of Microsoft applications development.


Closing thoughts

The essence of software development is not brands, tools and platforms - it is craft, design and skillful execution. I feel like developers increasingly understand that nowadays, which is cool. Microsoft's new strategy lends itself to this, which is also cool.

Looking forward to seeing how Microsoft's increasingly open/inclusive strategy plays out over the remainder of 2016 and into 2017.

Friday, 8 July 2016

Thoughts on official version(s) of history

It's a sign of a healthy society that the validity of an official version of history may be openly queried.

It's also healthy that an official version of history (of course everyone is entitled to maintain their own personal version) be managed by accepted academia (i.e. peer-reviewed, published in established independent media; journals, conferences, etc), whose members are constrained, for the sake of their reputation, to back up the statements they make with academically verifiable evidence.


The way to change history in this case is to invest the time required to do it through academic channels. That would usually involve getting a Ph.D, and because of the rigour necessitated by this approach, it is usually only feasible to do a bit at a time.

The other way to change history is by war - revolutionary change to an official version of history can be achieved this way. A version of history imposed by war works for the winner, for the group of people they win rule over, and for as long as they choose to operate a dictatorship.

Perhaps all official histories started out by being imposed by war. As peace ensues though and as academia becomes free to explore, official history becomes decreasingly biased.

Friday, 28 August 2015

Using TestDriven.NET and NCover to benchmark and monitor unit test code coverage

I wrote this blog post about a year ago - have finally got around to putting it online. It uses a "genericised" system called UpdateManager to demonstrate how to set up code coverage analysis for .NET. Hopefully it's not too dated just yet...

What is unit test code coverage benchmarking/analysis?

Unit test code coverage analysis essentially answers the question: "how much of my codebase is covered (or "exercised") by my unit test suite?"

When we are writing unit tests, it is necessary to visually scan the codebase to get a feel for what degree of coverage we have and what parts of the codebase are important/relevant to unit test.

Setting a benchmark such as "a standard .NET codebase must have at least 70% unit test coverage" can be useful; although this is not a fail-safe way to make sure that we are testing the right things, we can at least be reasonably confident that if our metrics meet the benchmark, most of the important stuff is being tested.

There are several tools available that will provide this sort of metric on a .NET codebase - including NDepend, NCover, etc.

TestDriven.NET vs ReSharper

TestDriven.NET is a tool that is primarily intended to support unit test development, but it has a lot of overlap with ReSharper.

If you're already using ReSharper (as many .NET development teams do) - and given that ReSharper generally has superior functionality where the two tools overlap - there is not a lot of value in having an additional tool for that purpose.

TestDriven.NET does however support one function that ReSharper lacks - unit test code coverage analysis. TestDriven.NET has a built-in instance of NCover, which can be accessed like this:

[Getting to NCover from TestDriven.NET]

When NCover is run across a unit test library, it will run all your tests and provide a code coverage report (percentage coverage of the codebase by unit tests) that looks something like this (see the Output window of VS2010):

------ Test started: Assembly: UpdateManagerUnitTests.dll ------ 

2014-06-19 16:18:29,366 [TestRunnerThread] ERROR Logger XxxMetadata.LoadFile - ERROR:
System.ArgumentException: Empty path name is not legal.
   at System.IO.FileStream.Init(String path, FileMode mode, FileAccess access, Int32 rights, Boolean useRights, FileShare share, Int32 bufferSize, FileOptions options, SECURITY_ATTRIBUTES secAttrs, String msgPath, Boolean bFromProxy, Boolean useLongPath, Boolean checkHost)
   at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share)
   at UpdateManager.Logic.XxxMetadata.LoadFile() in E:\UpdateManager\Logic\XxxMetadata.cs:line 56
2014-06-19 16:18:29,423 [TestRunnerThread] ERROR Logger XxxMetadata.XxxMetadata - LoadFile FAILED

[etc...]

19 passed, 0 failed, 0 skipped, took 10.13 seconds (NUnit 2.6.1).

Following that, NCover will open an interactive explorer app/window called "NCoverExplorer" - like this:

[NCoverExplorer window]
Using NCoverExplorer, you can browse the codebase under test (by namespace) that has been analysed by NCover and establish the percentage of each assembly, namespace, class and even method/property that is covered by unit tests.

[NCoverExplorer window - browse to assembly]

[NCoverExplorer window - browse to namespace, etc]

Summary

Setting and adhering to a code coverage benchmark (and tooling up accordingly) is a step along the way to establishing a mature, modern software development practice.

Adopting unit testing as a practice is by itself extremely important. However, code coverage benchmarking and analysis provides the ongoing measurement and reporting that is required to sustain the practice.

TestDriven.NET and NCover provide a way to get started with adopting and learning this practice relatively cheaply. TestDriven.NET is free for personal use.

Gotcha!

When TestDriven.NET is installed over the top of ReSharper, it can sometimes "replace" some of ReSharper's very nice unit testing automation functions (such as those neat little green dots on the left side of the text editor that will launch a test for you and indicate the most recent result).

[ReSharper - neat little green dots for running tests]
Never fear - to get ReSharper's functionality back, just re-install ReSharper over the top of TestDriven.NET. It's a bit of a "sledgehammer" approach, but you only need to do it once, and it works.

References

TestDriven.NET: http://testdriven.net/


Friday, 2 January 2015

The importance of improvisation

Preamble, background, etc

Scrum is a process improvement framework that has grown up within the world of software development and is increasingly used outside of it as well. Scrum in fact lends itself well to product development in general. This post focuses on Scrum (and frameworks like it); in particular how it copes (or doesn't cope) with significant organisational change, and where/when improvisation becomes more important than a framework.

The following links provide some information about Scrum:

An acquaintance of mine recently asked an interesting question about Scrum

...which was essentially: 

"What do you do when an established Scrum team encounters a serious problem?" 





["What do you do...what do you do...?" - source: http://chaffeenguyen.com/win-win-situations/]

As a process improvement framework, Scrum is able to cope effectively with most of the day-to-day problems that an organisation may face. Occasionally though, the problem is simply a mismatch for the framework - just too big. For example: the business has changed hands, management has changed and/or the business' strategic direction has changed.


Scrum works great when business is relatively smooth; it surfaces problems to be resolved by the team/business, etc. Beyond a certain size/volume of problem though, Scrum seems to become impractical. What emerges then is a need for the team to lean more heavily on its ability to self-organise and improvise - outside of any framework - in order to recover effectively, or perhaps survive. For this reason, I have come to think that drilling a Scrum regime across an organisation is not necessarily always best for business.

Strategy and the importance of improvisation

To reach their true potential, teams need the freedom to explore, develop and optimise their own capabilities within themselves and outside of any set framework. Rigorous adherence to a framework can mar the natural and relatively fragile team-gelling process. Another trade-off of rigorous adherence to a framework is that a team's improvisation skills become rusty. And the ability to improvise in particular is key not only in emergency situations but also (arguably more so) in day-to-day business.




[Miles Davis; exemplary improviser - source: http://www.fastcompany.com/3000340/if-miles-davis-taught-your-office-improvise]

Furthermore, in reality, business is not about trying to get things to go as smoothly as possible - to survive and be prosperous, a business must seek out and overcome difficult challenges. If we focus on "getting to smooth" then we invariably begin to steer clear of challenges (AKA opportunities). 


A successful business (or one that at least plans to succeed) will have a strategic direction, and in alignment with that strategy some challenges will need to be avoided. More importantly though, a business' strategy ideally will generally - and openly - identify the type of challenge that it wants to line up and engage. That way at least everyone knows what the intention and direction of the business is, and can have a feel for what's right and what's not.

Of course, engaging and avoiding challenge is a balancing act that needs to take into account resourcing, scheduling, cashflow, etc. Shying away from challenge purely in the interest of maintaining stability though is essentially laziness or myopia, and that is deadly in business - remember Kodak? MySpace? Etc.

Where am I going with this in terms of the framework...? Well, from a strategic perspective, the framework is simply a tool that's there to help a business reach its strategic objective. Interestingly, working within the framework it can be difficult to identify if/when the framework itself has become a problem. It can happen though, especially in times of significant organisational change - a framework can in fact be used as a shelter; a means to avoid or delay dealing with organisational change openly. In this situation things can become fuzzy and political - so I won't take this line of thought any further. I wanted to at least identify this point as part of this post though.

Is the framework still relevant?

So, I'm certainly not suggesting that Scrum and frameworks like it are rubbish - far from it - they can help new teams get a great kick-start and can provide a clear ramp-up for organisations new to Agile.




[Bamboo scaffolding, Hong Kong - source: http://www.archdaily.com/tag/architectural-photography/]

What I am suggesting is that beyond a certain point - whether it's due to an organisation reaching a certain level of maturity or an unavoidable and significant organisational change - the framework may become irrelevant and perhaps even cumbersome. It's worth being aware of this as a possible scenario, and if possible maintaining a general "feel" for a framework's relevance within the organisation.

Agility/stability

In my opinion, Agile essentially still holds the key in times of challenge and uncertainty. Outside of any framework, the basic principles of the Agile Manifesto (http://agilemanifesto.org/) seem to hold true; providing a simple, value-oriented sounding board when things get awkward/ambiguous, and a clear guide back to stability - until the next time!