Monday, November 29, 2010

Making TDD Work

The key to making TDD work is to follow three principles:
1. Write SOLID style, clean code
2. Focus on behavior
3. Always start with the tests

Learning TDD (Test Driven Development) has been a surprisingly difficult process for me over the last few years.  Once I figured out how to think like a TDDer I was pretty well on my way, but it didn't really end there.  The main issue is that bad tests can be worse than no tests at all.  The hard part is figuring out what makes a test bad and what makes a test good.  This post is my attempt to convey what I currently think are some good guidelines for writing good tests.

SOLID and Clean
All the code you write has to be good clean code following the SOLID principles.  Whether the code is a test or "real" code, it has to be good.  If the code isn't clean, well designed, and easy to understand you don't stand a chance.

Some of the things SOLID and Clean Code encourage seem counterintuitive at first.  For example, the concept of creating small single purpose methods, especially when they aren't intended to be reused, seems like it might make it harder to find things, name things, and understand things.  Of course, the truth is the exact opposite.  Creating small, single purpose, well named methods dramatically simplifies your code.  The same is true for creating small, well defined classes.  If you are at all like me you'll have to see it in action to really believe it.  Even today there are times when I have the mental debate over whether to extract a method, and every time I do, the code ends up better.

Following these principles is absolutely essential to making TDD work.  It is impossible to test big ball of mud code, it's difficult to test code that isn't single purposed and well defined, and it's enormously difficult to understand tests written against a code base that isn't SOLID and Clean.  The more understandable the code under test is, the more understandable the tests will be.  In fact, I firmly believe that if you're doing it right, the code under test should end up being so simple you will find yourself questioning if the tests are worth having at all!

Behavior
The tests should focus on the behavior of the system (Behavior Driven Development or BDD).  In fact, I would go so far as to say that if you're thinking about code coverage when you're writing tests, you're doing it wrong.  Thinking about code coverage leads to tests with names that can't be understood without an intimate knowledge of the implementation and which are a nightmare to maintain.  Code coverage can be a useful metric, but it shouldn't guide your tests.

If you instead focus on the behavior of the object you are testing, you should still discover that you approach 100% code coverage, but you will do so in a way that results in meaningful, descriptive, and understandable tests.  This is critically important, because tests are not something you write once and never deal with again!  Naming and organization are crucial here.  Typically I follow the When, With, Should naming pattern in my unit tests.  For example, I may have tests like:
[Test]
public void WhenTransferringMoneyBetweenAccounts()
{
  var srcAccount = WithAccountHavingBalance( 100 );
  var destAccount = WithAccountHavingBalance( 100 );

  accountTransferService.Transfer( 50, srcAccount, destAccount );

  Assert.AreEqual( 50, srcAccount.Balance, "Source account should have balance of 50" );
  Assert.AreEqual( 150, destAccount.Balance, "Destination account should have balance of 150" );
}

[Test]
public void WhenTransferringAmountLargerThanSourceBalance() {...}
Each test represents a certain scenario.  The first block of code sets up the scenario.  The second block performs the action we're testing.  The third block verifies everything worked as expected.  This style, inspired in part by mspec and rspec, makes each test's responsibility obvious and helps describe all the scenarios and their expected behavior.

You know you're done testing when you've covered all the scenarios.  And if you find a bug, it should be fairly straightforward to decide what scenario or test is missing.

Tests First
Finally, you should always start with the tests.  When you're writing something new, start with the tests.  When you're fixing a bug, start with the tests.  When you're adding new behavior, start with the tests.  When you're doing a code review, start with the tests.  When you're trying to understand how some code works or what it's for, start with the tests.

There are times when you won't be able to fully implement the tests first.  For example, if you're writing code that uploads a file in ASP.NET MVC and you don't know how MVC provides the file data, you won't know what to mock or what to expect until you dive into the code.  That's totally OK.  But you should still frame out the tests first, then come back and implement them (just make sure that the tests fail when expected).
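
For example, a framed-out test can be nothing more than a descriptive name and a guaranteed failure.  Here's a rough sketch (the scenario is hypothetical; NUnit's Assert.Fail keeps the test red until you implement it):
[Test]
public void WhenUploadingAFile()
{
  // TODO: implement once we know how MVC provides the file data
  Assert.Fail( "Not implemented yet" );
}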

Writing the tests first forces you to think through the behavior you're attempting to build and often helps you see flaws in your object design and algorithms.  Writing the tests first will also encourage you to write single purpose, clean, and well defined code, helping you arrive at that point where the code under test is so simple the tests seem almost pointless.  It's possible to end up with code that is simple without tests, but it's much harder.

When modifying existing code, starting with the tests will help you notice when the design has changed enough to warrant refactoring the code and redefining the roles of your objects.

Tuesday, October 12, 2010

Vim: Escape

UPDATE: Scroll all the way to the bottom for the best VIM ESC mapping in the world!

Vim is a modal editor, meaning it has modes. Its modes are command mode and edit mode (AKA normal mode and insert mode). Edit mode allows you to add and delete text and move your cursor around. Command mode allows you to execute various commands that can operate on the text.

Most editors only have edit mode. To execute commands you use shortcut keys that depend on non-printable keys like CTRL, ALT, and SHIFT. You quickly run out of simple combinations (ex: CTRL+S), so you have to introduce more complicated key combinations (ex: CTRL+K, CTRL+C). Visual Studio calls these "chords." I call it keyboard gymnastics.

Some people like the "chord" approach, and other people prefer the mode approach. I like the mode approach because if I'm going to execute non-trivial commands, I'd like to type a command.  The main downside to the mode approach is you have to get back and forth between the modes. When you launch Vim and open a file you start out in command mode. There are a bunch of useful different ways to get into edit mode, all of which are just a single keystroke: i, a, o, O, I, A.

Once in edit mode, getting back to command mode is a bit harder because now typing a key will print the character, so Vim uses ESC to leave edit mode and go back to command mode.  Sadly, on modern keyboards this goes against everything Vim stands for because the ESC key is so far away from the home row.  And if you're using Vim properly, you'll be going back and forth between these modes A LOT. So we need a better way.

Vim has an alternative built right in: CTRL+[

(You've mapped your Caps Lock key to Control, right?)

This is kind of an uncomfortable keystroke, but with practice you will get used to it.  If you'd rather not, you CAN map your own keyboard combination to Esc.  For example, you could try CTRL+space:

inoremap <C-space> <Esc>

Whatever you use, I strongly suggest you try to stop using the Escape key.

UPDATE:
A number of people in the comments suggested using various combinations of the j and k keys.  I didn't even realize you could remap printable characters, but you can!  I've been using this for quite a while now and I'm completely addicted!  I call it the "smash":

Add this mapping to your vimrc:
inoremap jk <esc>
inoremap kj <esc>

Now all you have to do is smash the j and k keys.  It doesn't matter which you type first!  Just smack them both as if they were a single key!  Absolutely brilliant!

Tuesday, September 28, 2010

Questioning ORM Assumptions

I come from a stored procedure background in data access, with output parameters and DataTables strewn throughout C# code.  I have "recently" been learning ORMs (specifically NHibernate and Entity Framework).  I've done some prototyping with both, and used NHibernate on a few projects.

In 1-1 situations I have found it is incredibly nice to not have to write SQL, or virtually any data access code at all.  In situations that require some form of mapping (components, inheritance, etc) it's also very nice, though things become more brittle and error prone.  In fact, even in 1-1 situations, I've been surprised by how brittle NH mappings are.  Change just about anything on your entity and you're likely to break your mapping somehow.  But that seems to be a price worth paying to avoid writing SQL and manual mapping code.

However, I've recently been questioning some of the features ORMs bring.  I think most people would consider these features absolute requirements of an ORM.  However, I'm beginning to doubt how valuable they really are.  Perhaps some of this is in reality more trouble than it's worth?

Unit of Work

The first pattern I have some issues with is the Unit of Work pattern.  This is the pattern used by ORMs to allow you to get a bunch of objects from the ORM, make any changes you want, and then just tell the ORM to save.  The ORM figures out what you changed, and takes care of it.  There are two major benefits to this pattern:
1. You don't have to manually keep track of all the objects you changed in order to save them.  The ORM will just know what you changed, and make sure it gets persisted.
2. You don't have to concern yourself with the order things get saved in.  The ORM will automatically figure it out for you.

My first issue with this pattern is that it is not very intuitive.  You have to tell the ORM about new objects, and you have to tell it to delete objects, but you don't have to tell it to update objects.  And, in fact, you don't have to tell it about ALL new objects, as it will automatically insert some of them depending on how your mappings and objects are set up (Parent/Child relationships, for example).  It tends to be further confused by the APIs frequently used.  For example, a lot of people use a Repository/Unit of Work pattern to hide NHibernate's session object.
var crypto = BookRepo.GetByTitle( "Cryptonomicon" );
crypto.Rating = 5;

var ender = new Book { Title = "Ender's Game", Author = "Orson Scott Card" };
BookRepo.Add( ender );

uow.Save();
What happens at BookRepo.Add( ender )?  Does that issue an Insert to the database?  Is the crypto.Rating update saved?  And where the heck did this uow object come from and what relationship does it have with the BookRepo?!  If you know this pattern, you're probably so used to it that it doesn't seem strange.  But when you step back from it, I think you'll agree this is a pretty bizarre API.

Truth be told, some of this confusion is actually due to the Repository pattern.  You are supposed to think of a Repository as an in-memory collection of objects; the persistence is under-the-covers magic.  If you're writing an application where persistence is one of the primary concerns, I always thought it was kind of stupid to adopt a pattern which tries to pretend that persistence isn't happening...

But back to Unit of Work: the second issue I have is a certain loss of control.  It is very easy to write code using a unit of work and then have no idea what is actually being saved to the database when you issue the Save command.  To me, that's a really scary thing.  Now, to be fair, if you find yourself with code like that, it's probably really bad code.  But that doesn't change the fact that this pattern almost encourages it.  There is something nice about ActiveRecord's approach of calling Save on each entity you want to save to the database.  You certainly gain back some control.

My last issue isn't really that big of a deal, but it still bothers me a bit: the Unit of Work pattern couples the way you make changes to the transactions that are used to save them.  In other words, you can't change object A and object B, then save A in one transaction and B in another.  Instead, you'd have to change A, save it, change B, save it.  Like I said, this is a minor quibble, but it demonstrates again how the assumptions made by the UoW pattern steal some of your control.
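
To see what I mean, here's a rough sketch of the Active Record style (the book entities and their Save method are hypothetical):
// Change both books, but control exactly what gets saved and when,
// including which transaction each save participates in.
bookA.Rating = 5;
bookB.Rating = 3;

bookA.Save();  // saved in its own transaction
bookB.Save();  // saved in another
With a Unit of Work, both changes would be flushed together when you call Save on the unit of work; here each save is explicit, so splitting them across transactions is trivial.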

None of these issues are all that serious.  But I still believe that Unit of Work is a very awkward way of dealing with your objects and persistence.

Lazy Loading

ORMs use Lazy Loading to combat the "Object Web" problem.  The object web problem arises when you have entities that reference other entities that reference other entities that reference other entities that...  How do you load a single object in that web without loading the ENTIRE web?  Lazy Loading solves the problem by not loading all the references up front.  It instead loads them only when you ask for them.

NHibernate and Entity Framework use some pretty advanced and somewhat scary "dynamic proxy" techniques to accomplish this.  Basically they inherit from your class at run time and change the implementation of your reference properties so they can intercept when they are accessed.  There are some scenarios where this dynamic inheritance can cause you problems, but by and large it works and you can pretend it's not even happening.

Lazy loading as a technique is very valuable.  But I think ORMs depend on it too heavily.  The problem with Lazy Loading is performance.  It's easy to write code that looks like it executes a single query against the database, but in reality ends up executing 10 or more.  At the extreme you have the N+1 select problem.  Once again, it boils down to trying to pretend the data access isn't happening.
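
Here's a minimal sketch of how the N+1 problem sneaks in (hypothetical Book/Author entities; the query syntax assumes NHibernate's LINQ provider):
// This looks like a single query...
var books = session.Query<Book>().ToList();

// ...but each Author access can lazily trigger another SELECT:
// one query for the books plus N queries for the authors.
foreach ( var book in books )
  Console.WriteLine( book.Author.Name );

// Declaring your intent up front collapses it back to one query:
var booksWithAuthors = session.Query<Book>()
  .Fetch( b => b.Author )
  .ToList();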

DDD's solution to the Object Web problem is Aggregates.  An Aggregate is a group of entities.  The assumption is that when you load an Entity all its members will be loaded.  If you want to access another aggregate, then you have to query for it.  This cleanly defines when you can use an object traversal, and when you need to execute a query.  Basically, it forces you to remove some of the links in your object web.
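
A rough sketch of the idea (a hypothetical Order aggregate):
// An Order and its lines form one aggregate: load the Order and the
// lines come with it.  Customer is a different aggregate, so it is
// referenced by id; crossing that boundary means an explicit query,
// not a lazy traversal.
public class Order
{
  public int Id { get; set; }
  public int CustomerId { get; set; }          // not a Customer reference
  public IList<OrderLine> Lines { get; set; }  // loaded with the Order
}

public class OrderLine
{
  public string Product { get; set; }
  public int Quantity { get; set; }
}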

By making Lazy Loading so easy, ORMs kind of encourage you to build large object webs.  Entity Framework in particular, because its designer will automatically make your objects mimic the database if you use the db-first approach and drag and drop your tables into the designer.  Meaning you will have every association, in both directions, included in your model.

While I don't have a problem with Lazy Loading, I do have a problem with using it too much.  This is the main reason why you read so much about people "profiling" their ORM applications and discovering crazy performance problems.  Personally, I'd rather put some thought into how I'm going to get my data from the persistence store up front than have to come back after the fact and waste tons of time trying to find all the areas where my app is executing a crazy number of queries needlessly.

Object Caching

NHibernate and Entity Framework keep a cache of the objects they load.  So if you ask for the same object twice, they'll be sure to give you the same instance of the object both times.  This prevents you from having two different versions of the same object in memory at the same time.  If you think about that for a while, I'm sure you'll come up with all kinds of horror scenarios you could get into if you had two representations of the same object.

But I think this is an example of the ORM protecting me from myself too much; it's just not that important a feature.  Instead it adds more magic that makes the data access of my application even harder to understand.  One time when I call GetById( 1 ), it issues a select.  But the next time it doesn't.  So if I actually wanted it to (to get the latest data, for example), I now have to call Refresh()...
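
For example, with NHibernate's ISession (a sketch, assuming a mapped Book entity):
var first = session.Get<Book>( 1 );   // issues a SELECT
var second = session.Get<Book>( 1 );  // no SQL; served from the session cache

// Both variables point at the exact same instance...
Debug.Assert( ReferenceEquals( first, second ) );

// ...so to actually get the latest data you have to ask explicitly:
session.Refresh( first );             // hits the database again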

Wrap Up

I got into all this because I didn't want to write SQL and I didn't want to write manual mappings.  I certainly got that.  But I also got Unit of Work, Lazy Loading, and Implicit Caching, none of which I actually NEED and certainly never wanted.  And some of them actually create more problems than I had before!

Some Active Record implementations manage to fix these issues.  But I have concerns with using Active Record on DDD-style code.  The main concern is that I want to model my domain, not my database.  The other big concern is I prefer keeping query definitions out of the entities, as it doesn't feel like their responsibility.

Now I'm not claiming any of these issues are a deal breaker to using NHibernate or Entity Framework or other ORMs.  But on the other hand, it doesn't feel like these patterns are the best possible approach.  I suspect there are alternative ways of thinking about Object Relational Mapping which may have some subtle effects on how we code data access and lead to better applications, developed more efficiently.  For now though, I'm settling for NHibernate.

Tuesday, September 21, 2010

Decoupling tests with .NET 4

Recently, I was struggling with an annoying smell in some tests I was writing and found a way to use optional and default parameters to decouple my tests from the object under test's constructor.  Not too long ago, Rob Conery wrote about using C#'s dynamic keyword to do all kinds of weird stuff.  When I was running into these issues with those tests, I took a look at what he'd been playing with.  Nothing there jumped out, but it led to the optional parameters.

Specifically, I was TDDing an object that had many dependencies injected through the constructor.  Basically what happened was each new test introduced a new dependency, which forced the previous tests to be updated.  For example:
[Test]
public void Test1()
{
  var testobj = new Testobj( new Mock<ISomething>().Object );
}
I'm using Moq here, creating a default mock that takes care of itself.

Then I write the next test:
[Test]
public void Test2()
{
  var setupMock = new Mock<ISomething>();
  // setup the mock
  var testobj = new Testobj( setupMock.Object, new Mock<ISomethingElse>().Object );
}
Notice this test has introduced a new dependency, ISomethingElse.  Now the first test won't compile; we have to go update it and add the mock for ISomethingElse.  This will continue with each test that introduces a new dependency, causing every previous test to be updated.

You could simply refactor the constructor call into a helper method so you only have to change it in one place.  But this doesn't work so well when the tests are passing in their own mocks.  You'd need lots of helper methods with lots of different overloads.  Enter optional and default parameters!
public Testobj BuildTestobj(Mock<ISomething> something = null, Mock<ISomethingElse> somethingElse = null )
{
  return new Testobj(
    ( something ?? new Mock<ISomething>() ).Object,
    ( somethingElse ?? new Mock<ISomethingElse>() ).Object );
}
Now we can update the tests:
[Test]
public void Test()
{
  var testobj = BuildTestobj();
}

[Test]
public void Test2()
{
  var setupMock = new Mock<ISomething>();
  // setup the mock
  var testobj = BuildTestobj( something: setupMock );
}
Simple, clean, refactor friendly, and your tests are now nicely decoupled from the constructor's method signature!

Friday, August 6, 2010

Testing C# with RSpec and Ruby

Why would you want to do this?

Simple: More readability, less ceremony. This means you can write your tests faster, update them faster, understand them faster, and generally just be happier!

Why would you NOT want to do this?

Simple: Microsoft just dropped support for it...

I just spent the last week working on this stuff, so I'm just a little bit pissed at my timing. For example, this whole blog post was already written! Whether we should allow this to prevent us from considering using IronRuby is a whole different issue. I think I'll just have to wait and see what happens before making that decision.

That said...

How do you do it?

I have some sample code you can browse to help you get started at http://bitbucket.org/kberridge/irspec.

I'm working with IronRuby 1.1.0.0 on .NET 4.0 and rspec 1.3.0.  Know that if your versions are different, stuff may work differently.

Setting up the environment:
  1. Install IronRuby (install it in C: instead of Program Files if you don't want to mess with your path to get gems working later on)
  2. igem install rspec
  3. optionally: igem install rake
  4. optionally: igem install caricature
  5. optionally: igem install flexmock
Starting to test your .NET code:
  1. Add a Specs folder (call it whatever you want) in your project's main folder (or wherever)
  2. Create spec_helper.rb in the Specs folder, more on this in a bit
  3. Create an examples folder (call it whatever you want)
  4. Add your test files in the examples folder
  5. All of your tests should require 'spec_helper'
Executing your tests:

You can execute one test at a time by executing this command in the Specs folder:

ir -S spec examples\first_test.rb

To execute all the tests you can write a rakefile like this:

require 'rake'
require 'spec/rake/spectask'

desc "Runs all examples"
Spec::Rake::SpecTask.new('examples') do |t|
  t.spec_files = FileList['examples/**/*.rb']
end

That last bit works great, unless you're executing IronRuby with the PrivateBinding flag turned on... More on that later.

More on spec_helper.rb

If you've never played with ruby before, this may be slightly weird.  All of your tests will require 'spec_helper'.  This effectively executes the code in spec_helper.rb (but only one time), allowing us to centralize configuration required by our tests, or even define helpful helper methods.

There are two things you should definitely do here: tell Ruby where to find your .NET assemblies and require common dependencies all tests will need.

Telling Ruby where to find your .NET assemblies is similar to telling cmd what directories to search by adding to your PATH variable.  You do this in ruby by appending to the $: magic variable (aliased $LOAD_PATH):
$: << '../Src/Model/Model/bin/Debug'
There are other ways to find your .NET assembly without modifying the load path, but I kind of like this approach.

Common stuff you'll want to require includes rubygems, spec, and your .NET dll you're trying to test:
require 'rubygems'
require 'spec'
require 'Model.dll'

More on PrivateBinding
If you're writing DDD style code in .NET, you probably have internal constructors and things which you need to be able to execute with your tests. If you were testing with NUnit, you'd set up your test assembly as a friend assembly of the assembly under test.

You can't do this when you're testing with IronRuby because there IS no assembly. So instead, you have to invoke IronRuby with the PrivateBinding flag as follows:
ir -X:PrivateBinding ...

This works great for accessing your internals or privates in .NET. But sadly, there is currently a bug somewhere that causes rake to break. I posted to StackOverflow here but haven't found a solution yet. So be aware of that.

Monday, July 12, 2010

The Analyst's Dilemma

What's the hardest part of software development?

Too vague?  Let's make it multiple choice:
A. Architecture
B. Code design
C. Algorithms
D. Business Analysis
E. Data Structures

If you answered anything other than D then you're an idiot.  Seriously, look at the title of the post!  How could you NOT know that D was the right answer?  This isn't some open debate, this is more like high school, and I'm the teacher on this blog, and whatever I say is the right answer is the right answer.  It doesn't matter what you think!  Much like the relationship an analyst has with the customer.

This is what makes analysis the hardest part of software development.  You really really want everything to be lined up in nice neat logical rows so that you can build the software in nice neat modules.  But those damn users just refuse to do things logically and neatly!  And despite how much you try to have it your way, you just keep getting Cs and Ds.  Ultimately you have to give in and just give the users what they want.  Embrace the wrinkles, the complexity, and the real world.

This is the picture of the world usually painted by Agile and DDD, and it's almost correct.  Because it is true:
  1. The real world is complicated
  2. You can't dramatically simplify how your users work
  3. You have to make your users happy
Don Norman, author of the great book The Design of Everyday Things, talks about this in his Business of Software talk.  As he says there, the real world is NOT logical.  But then he goes on to talk about what makes Analysis really hard: you can't trust what your users tell you.

That may sound harsh, but no matter how you cut it, it's true.  Don Norman doesn't come out and say that, but he tells a story which I've seen happen first hand many times.  If you ask people how they do their office work, and you write down everything they say, and then you read it back to them, they will completely agree with its accuracy.  But when you go and watch them actually doing the work, you'll see that what they told you isn't what they're doing.  If you ask why, the usual answer is because they are dealing with a special case.  "We usually do it that way, but in this case I have to..."

So not only are we stuck analyzing something seemingly illogical that we can't force into a logical mold, we also can't rely on being given fully accurate information from the only people we can get information from!  We.  Are. Screwed.

And believe it or not, I can make it even more difficult for us!  Because frequently the introduction of software doesn't just automate the manual process people have always performed, it actually changes the process.  Meaning that as you go, you're making things that were once true, false.  It has a quantum mechanics flavor to it: observing the system changes the system.

So what are we supposed to do?  The first thought is to try to get more accurate information up front, but this will never succeed.  There will always be an edge case someone didn't think of.  And trying to drill into nitty gritty details without anything solid to build on leads you to become focused on things that don't matter, and to over-design.  Ultimately you waste time and make it harder to respond when things inevitably do change.

Instead you have to do one, or both, of the following:
  1. Teach your [users, customers, product owners, domain experts, etc] about the software side of things and get them intimately involved in the design and development of every aspect of the software.  From architecture to process to UIs.
  2. Aggressively shorten the feedback loop in any way possible.  Get your designs, prototypes, early implementations, betas, and releases in the hands of users and make them work with them as quickly as you possibly can.
This is why it is so so so important to write agile code!  "Agile" code is code that is easy to change.  Once we embrace the fact that the hardest part of software development is analysis, and that truth be told analysis is basically impossible, we realize the most important thing for our code is to be able to respond to change.  This has some dramatic implications on how we approach code: understanding becomes more important than execution speed or writing speed.  This is why DDD, BDD, and SOLID are so important!

In the end, we have to stop thinking of analysis as something that happens once at the beginning of a project.  Instead we have to minimize how much time we spend up front, and actually use our code as a tool to help figure out what the customer actually needs the software to do.  We have to get to the point where learning something new from a [customer, user, product owner, domain expert, etc] doesn't cause us to grumble and complain about how no one ever tells us the right stuff.  It's time we owned up to the fact that this is how the real world works, stopped moaning about it, and started expecting it and finding ways to turn it to our advantage.

Wednesday, June 30, 2010

Powershell: Add an extension to every file in a directory

It's been a while since I've posted up a Powershell script.  Powershell really is great, and I really don't use it enough.  I should start using it all the time for any ridiculous thing I can think of, just so I can polish up my skills so that when that once in a blue moon "holy crap! do something complicated quick!" situation comes up, I'll be ready...

This time I downloaded a bunch of mp3s that were on Google Docs.  Google Docs is awesome: it lets you select a bunch of files, zips them up, and downloads them all together.  Unfortunately, when I unzipped them, they didn't have a file extension...
dir | % { mv $_.FullName ( $_.Name + ".mp3" ) }
That command gets every file in the directory ( dir ), sends them to the next command ( | ), which loops over them one at a time ( % is shorthand for foreach-object ), then executes the move command ( mv ) with the file's full name as the first argument ( $_.FullName; $_ is magically set to the current loop item ), and the file's name with .mp3 tacked onto the end as the second argument ( $_.Name + ".mp3" ).  The parentheses tell it to evaluate the expression and pass the result as the second argument.

Beautifully simple.

Need to figure out what properties are available on the objects you get from the dir command?
dir | get-member
Need to figure out what the value of one of those properties will actually be?
dir | % { $_.FullName }
Gotta love Powershell!

Monday, June 28, 2010

Simplicity vs. Adaptability


When dealing with code, and code architecture and design, there are lots of factors which have to be weighed to determine what the best way to go is.  The popularity of Ruby on Rails in the blogosphere and the conference circuit has begun to shift the conversation about these factors a bit.  When we were focused on Java and .NET we spent lots of time talking about The Gang of Four, Fowler, SOLID, and SOA.  These days we seem to be talking more about BDD, "simplicity", terseness, and productivity.

I think this is a good thing, but I also think it's because we have finally realized that we are writing a whole new class of applications now.  Back in the day, people were focused on BIG and COMPLICATED applications for banking and shipping and other complicated industries.  We are still doing that kind of work today of course, but we've added a whole new class of application that didn't exist before: small and simple web applications.  These websites are actual applications, not just brochure sites, so they have logic and models and all the rest.  But their domain tends to be small, and the rules tend to be much simpler.

It seems like what we've learned as an industry is that all the patterns and practices that have been developed for dealing with large and complicated systems aren't necessarily needed for smaller web applications.  But many of these things have become so ingrained in the way that we think and the way we approach problems that it can be a rather jarring shift to throw them out.

Ultimately this comes down to a question of Simplicity vs. Adaptability.
Simplicity: straightforward, few layers
Adaptability: guards against change, includes more abstraction

Are These Really Opposing Characteristics?
The best possible design would be one that is both simple and adaptable, but usually simplicity and adaptability are opposing characteristics.  This is because to make something adaptable, you tend to have to build in more layers and more abstraction, and that necessarily makes it more complicated.

For example, the Active Record pattern is simpler than the Data Mapper pattern.  But the Data Mapper pattern isolates your models from changes in the database and vice versa, as well as removing all persistence knowledge from the models themselves.
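
As a rough sketch of the difference (the APIs on both sides are hypothetical):
// Active Record: the entity persists itself.  Direct and simple,
// but the model knows about the database.
var arPost = Post.Find( 42 );
arPost.Title = "New title";
arPost.Save();

// Data Mapper: a separate mapper shuttles data between the entity
// and the database, so the model stays persistence-ignorant.
var dmPost = postMapper.FindById( 42 );
dmPost.Title = "New title";
postMapper.Update( dmPost );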

There is a fun catch-22 here though.  If your abstractions can serve as effective metaphors, you can begin to ignore the complexity the abstraction hides from you.  This allows you to think about the system in a much simpler way, even though the details of it are very complex.  But it's debatable whether we would call a system like this "simple."  For example, websites written with Ruby on Rails tend to be simple, but I would not describe Rails itself as simple.

Simplicity is exemplified by DHH, creator of Ruby on Rails
Adaptability is exemplified by Jeremy D Miller, creator of FubuMVC

Just look at the difference in their tag lines:
"Ruby on Rails is an open-source web framework that's optimized for programmer happiness and sustainable productivity."
FubuMVC: "Compositional, compile safe, convention-based configuration for complex web applications."

The focus of these two projects is clearly different, and it's hard to argue with either one.  Both can be used to create "simple" web applications.  But Rails is very focused on a certain subset of simple web apps, where Fubu is more interested in being adaptable in order to allow you to tailor it to your needs.

There is a trade-off here.  And which trade is right for your project is one of the most important decisions you have to make.  As I wrote recently, the factors you have to consider in this trade-off frequently aren't even technical!  So we really need to understand how diverse our industry has become, and we need to understand the context in which these things are being discussed.

Wednesday, June 23, 2010

More Engagement

Are you engaged at work?  I don't mean like, engaged to be married.  I mean are you ENGAGED?

If not, why not?  Are you not paid well?  Do you not like the work?  Do you not like your coworkers?  Do you not like the management?  Do you not like your boss?  Are you bored?  Is the organization keeping you from doing your best work?  Do you feel unproductive?  Do you feel unable to contribute?  Do you feel micromanaged?  Why?

Maybe this guy can help explain it:

[embedded video: Dan Pink on motivation]

Maybe your organization's attempts to motivate you are actually, though accidentally, de-motivating you!  Isn't it ironic?  So what should they be doing?  What do you need to be engaged at work?  To be at your best, to be your most productive, to be happy?

According to Dan Pink, it's as simple as three words: Autonomy, Mastery, and Purpose.  This observation lines up so nicely with the observations in First, Break All The Rules too!  But the problem with these observations is that they are only observations.  They provide really useful information, but don't help your boss figure out how he should act day to day.

That's actually one of the main reasons why I think First, Break All The Rules is such an excellent book, and so much better than any other book on "management" I've ever read.  There simply is no cookie cutter, one size fits all, set of steps you can follow to create an environment in which everyone is engaged.

Now, if you're the boss you can start putting your understanding of Autonomy, Mastery, and Purpose to work right now.  But if you are not the boss, what good is this to you?  If you have some "direct reports," you can apply these ideas with them within your own projects.  At least that's a start.  But if you don't have "direct reports," then it looks like knowing this stuff isn't going to help you at all!

I remember the first time I read Peopleware...  On the one hand I absolutely loved it.  But on the other, it was just depressing.  And I believe that my company would actually rank very highly compared to other companies on these factors.  But it didn't matter, it was depressing!  That's because this kind of stuff seems too far out of the control or influence of a lowly programmer.  What impact can one person have on things like culture?  Or the autonomy granted to employees?  Or the purpose behind the work?  Or big-M vs. little-m methodologies?

I think the answer to these questions is, quite frankly, that on your own you can have very little impact.  BUT!  I believe a group of like minded people, with common goals, patience, and a dash of determination can get together within any organization and make a huge difference.  Those people can become engaged in the struggle to be engaged at work!  And, as cheesy as it may be, any group of people starts with just one person.  To be successful at this you have to have a broad idea of where you're heading. If all you want to do is be like 37 Signals, you're out of luck.  But if you can embrace the real kernel of truth in all the observations found in so many places these days (including what the guys from 37s are saying), and tailor that to your organization's unique goals, strategy, and personality...  Then you will be able to make your company, or your division, or even just your team one that all your friends will be jealous of.  And everyone will end up doing better work because of it.

So if you're not engaged at work, stop looking up!  Stop waiting for someone else to change!  Get out there and get started making a difference today, even if just a small one.  Be prepared to lose a lot of battles, but don't let one setback prevent you from continuing to work at it.  Because it is worth working at.  And you don't have to get 100% of the way there.  You don't have to end up with a ROWE to be engaged at work.  That's because lots of little improvements will add up, and could maybe even start a steamrolling effect.

The key is to recognize that it's worth striving for, that you don't have to keep looking up waiting for someone else to make the changes, and that it's ultimately a community effort.

Thursday, June 3, 2010

Software Craftsmanship


Growing and Fostering Software Craftsmanship from Cory Foy on Vimeo.

"Software Craftsmanship" is kinda sorta like the next "Agile."  As a "movement" its not really very interesting.  But if you ignore the proselytism you discover that the message is both simple and appealing.  Cory Foy does an excellent job of communicating that in this presentation.  So much so that I wanted to share it.

Movements tend to fall short because they fail to convey the context of the problems they have been created to solve.  Cory does a great job of providing the background of what the typical problems in software engineering are and follows through with clear, but broad, ideas to fix them.  This isn't a talk about unit testing, TDD, pair programming, or any other specific techniques.  Instead it's a talk about the nature of the problem, what the solutions should look like, and how you could begin to move in the right direction.

In short this is an inspirational talk.  Watch it!  I think you'll enjoy it.

Monday, May 24, 2010

Rails has no place at the office

This is a milestone post for me! My first ever purposefully incendiary title!

I should probably run with it and try to get everybody super offended, to the point where you have no idea what my point is because all you can see is red. I guess I'll have to leave that for a future milestone...

Because, yeah, I'm not really serious. Rails has a place at the office. And no, this isn't going to be one of those "Is Rails ready for the Enterprise?" posts. Rails is perfectly ready for use in the Enterprise, but that's the wrong question.  As usual, the right question is much more complicated.

To start with, let me point out that this conversation has nothing to do with Ruby vs. C#. It doesn't really have anything to do with Rails vs ASP.NET MVC either. Instead I'm going to be talking about Active Record vs. Data Mapper, and View-Models vs. no View-Models, and this general concept of "the straight and narrow" vs. explicit abstraction and control. These are design patterns which apply to any language and appear in many different frameworks.

Rob Conery recently wrote a blog post in which he said,
For a lot of .NET/Java devs this will look "messy" - you shouldn't elevate "data concerns" into your model. This argument makes good sense for a large, complex site - that you're building in C# or Java. Typically Ruby focuses on the straight, narrow path and with that comes a dramatic turn towards "doing what you need to do... and no more". This resonates with me...
The part about Ruby/Rails focusing on "the straight and narrow path" really struck a chord with me.  Ruby, being a dynamic language, is very much on the "straight and narrow."  It dispenses with all kinds of things found in strongly typed languages like private, internal, protected, interfaces, etc.  These are things that are usually considered very important in a strongly typed language, and in practices like DDD, but Ruby doesn't really bother with them.  Ruby favors documentation and convention over strict control.

Rails has a similar story.  It uses the Active Record pattern for its data access, which requires a 1-1 correspondence with your database.  Further, the models don't even really exist!  They're built dynamically from the schema of the database tables.

If you compare ASP.NET MVC to Rails one of the differences you'll quickly discover is this concept of a "View-Model". ASP.NET peeps seem to like these, whereas I haven't found a Rails sample anywhere that uses these.  Both Active Record and this lack of a View-Model are accomplishing the same thing: removing abstraction in favor of directness and simplicity.

Now let's step back from this for a second and ask a question.  Who in their right mind would want to have to deal with things like class and method visibility and extra layers of abstraction, which more often than not appear to be just duplication?  No one!  No one would want to deal with these things!  It's extra work!  I _hate_ extra work!

So why do we do it?  Why does DDD make a big deal out of private constructors and Factories?  Why does Fowler recommend the Data Mapper pattern over Active Record?  Why do we create View-Models to separate our Views from our Models?  Why do we do all these things that seem to just make life more complicated?  Why don't we all take the straight and narrow path on all of our projects all of the time?

Certainly it's not as simple as the language we're using.  Just because you're writing in C# and Java doesn't mean you can't use Active Record.  And it doesn't mean you can't pass your Model straight to your View.  There is also nothing about C# or Java that forces you to use interfaces, or follow the Dependency Inversion Principle.  That said, there's also no reason why you couldn't use the Data Mapper pattern in Ruby, or create View-Models.  The language certainly HELPS with some of these issues, but it's not the real difference.  These are just patterns, and they apply equally well to any language.

The reason why we introduce this complexity and divert from the straight and narrow path in our technical approach is actually due primarily to non-technical reasons.  Here are some of the reasons I think lead us to adopt these "enterprise" patterns:

  • There are more than two or three developers on the project
  • You have more than 6 entities in the domain
  • The project has a timeline longer than 3 months
  • The developers aren't intimately familiar with the domain
  • The project is likely to grow in fits and starts
  • The team members are more likely to come and go

These are not technical issues but they have technical IMPLICATIONS!

The practices prescribed by DDD are a big deal if you're working with a large complicated domain with lots of potential for change.  If you're not, then you don't need DDD.  Fowler's enterprise patterns are a big deal for the same reasons.  If you know things are complicated, likely to change, and not possible for everyone on the team to grok completely, then you need to build abstraction into your code.  And you need to try to be as explicit as you possibly can about what the code does and how it works.  And you need to look for opportunities to prevent error and misunderstanding before it happens.  These things will allow you to keep things clean, organized, and ultimately make your project successful when you're faced with "enterprise" challenges.

This is obvious.  I'm sure you're sitting there (or standing) thinking, "duh!" or "when is this dude going to get to the point?" or "does this moron really think this is revolutionary?!"

My point is as simple as this.  Rails is awesome.  Simplicity is awesome.  But as I sit here in my ivory tower looking out over the landscape I see lots of quiet subtle backlash from people against the "enterprise-y" patterns in favor of the simplicity of Rails.  This makes a lot of sense to me because, as we pointed out, who would WANT to deal with the complexity of enterprise problems and patterns?  But it is easy to be tempted by the appeal of simple solutions to simple projects.  And certainly we should always strive to find the simplest solution that could possibly work.  But we can not close our eyes to the complexities of the problem or the environment in which we are solving the problem.  And we cannot allow ourselves to be boiled alive either.

So by all means, choose the right tool for the right job, but make sure you understand the job as well as you understand the tool.

Friday, May 7, 2010

Vim Commenting

Recently there has been some renewed interest in my series of posts on using Vim for C# development, so I thought I should add a few more posts to the series to bring it up to date.

You can find the introduction to this series here.

Visual Studio has a feature which allows you to select a bunch of lines of code and have them all commented out, or uncommented.  In my setup it is bound to Ctrl+k+c to comment and Ctrl+k+u to uncomment.

I use this particular feature pretty regularly, so I definitely wanted it in Vim, and wouldn't you know it, there's a plugin for that!  NERD Commenter.

With NERD Commenter installed you can select a bunch of lines of code (I typically do something like Vjjjj) and then type either
,cc
or
,c<space>
The first is the comment command, the second is the toggle command.  If you use the comment command you'll need to use
,cu
to uncomment.  If you use the toggle command you don't need to remember two commands!

That's all there is to it!

Now, the first step is to visually select the lines you want to comment or uncomment.  You can just hit j,k a bunch of times but there are usually better ways.  For example, if you want to comment out an entire method:
public void IAmAMethod()
{
  ... lots of lines here...
}
If there are lots of lines in the method, you don't want to be hitting j all day.  Instead, with your cursor on the method declaration line, do:
Vj%
The % is the "match" motion.  So when your cursor is on the { it will find the matching }.

Alternatively, if you just want to comment out a group of text that is arranged in a paragraph:
public void IAmAMethod()
{
  ...paragraph of code...

  // I want to comment out this block of code
  ...more code...
  ...goes here...
}
For this you can use the "paragraph" motion.  With your cursor on the first line of the block, do:
V}
That will select the whole block.

Once again, we have approximated a feature found in Visual Studio, but made it even better with the power of Vim!

Wednesday, May 5, 2010

Vim File Navigation

Recently there has been some renewed interest in my series of posts on using Vim for C# development, so I thought I should add a few more posts to the series to bring it up to date.

You can find the introduction to this series here.

In a previous post I talked about how to open and edit files in Vim.  That post discusses just the basics of opening files.  Since then I've started using the wonderful NERDTree plugin.  This plugin opens a small buffer on the left of your Vim window which contains the file system tree.  You can then navigate through directories and open files.  The nifty part is the NERDTree is just a Vim buffer, so you can navigate with h,j,k,l and you can search with / etc.  To see what I'm talking about, you can watch a demo of the NERDTree in action here.

Before I go on I should mention that Vim actually has a built in file system navigation plugin called Netrw.  NERDTree adds a few features and is in some ways a bit easier to use, but Netrw is capable of doing all of this stuff and it's built right into Vim.

Using NERDTree has completely changed the way I work in Vim.  When I'm ready to start working, I navigate to my .sln directory in the terminal and I type
gvim .
This opens Vim with NERDTree showing the contents of the current directory.  From here you can navigate around and find the file you want to start editing.

When you open the first file NERDTree will go away.  If you want to pull up NERDTree so it's always visible docked to the left of your Vim window you can type
:NERDTree
To toggle it open and closed you use
:NERDTreeToggle
That's kind of a lot to type. So to shorten it up I've mapped it to F2 by adding this to my vimrc:
" toggles NERDTree on and off
map <f2> :NERDTreeToggle<cr>
imap <f2> <esc>:NERDTreeToggle<cr>i
Now hitting F2 will open and close the NERDTree. Fast and easy.

So NERDTree is great for finding and opening files (you can even open files in splits with i for horizontal and s for vertical), but NERDTree can also manipulate the file system.

For example, to add a new file, put your cursor over (or in) the directory you want to add the file to and hit the "m" key.  This will open up a menu with some options.  Type a to "add a childnode" and then just type in the name of the file. This works for creating directories too.  You can also move (and rename) files as well as delete.

Sunday, May 2, 2010

Coding Style Preferences

I recently did a little poll on twitter and in my office to get a feel for what people's coding style preferences were.  The fun thing about coding style preferences is that they are completely irrelevant, and yet a topic that people can easily get pretty passionate about.

It was only a little poll, with 35 people responding to these few questions:

Curly braces?
  • On new line
  • On same line
Spaces in control statements (if, foreach, etc)?
  • Spaces outside and in ex: if ( this.HadSomeCandy )
  • Spaces inside only ex: if( this.HadSomeCandy )
  • Spaces outside only ex: if (this.HadSomeCandy)
  • No spaces ex: if(this.HadSomeCandy)
Spaces in method calls?
  • Spaces outside and in ex: someone.ShouldJustDecide ( "what", "is", "right" );
  • Spaces inside only ex: someone.ShouldJustDecide( "what", "is", "right" );
  • Spaces outside only ex: someone.ShouldJustDecide ("what", "is", "right");
  • No spaces ex: someone.ShouldJustDecide("what", "is", "right");
Spaces in method declaration?
  • Spaces outside and in
  • Spaces inside only
  • Spaces outside only
  • No spaces
Spaces in method calls with no args?
  • No space ex: someone.ShouldJustDecide();
  • Space ex: someone.ShouldJustDecide( );
How many spaces in indentation?
  • 2
  • 4
  • 8
And here are the results:

[results charts]

So, clearly, the winner is spaces outside/no spaces as in:
if (who.Cares("about coding style?!"))

An interesting observation here is that the people who "don't like spaces" are very consistent in their preferences whereas the people who "do like spaces" are much more varied.  This is evident in that of the 22 people who voted for spaces outside only in control statements, 19 also voted for no spaces in method calls.

Note that only 25 people answered the question about spaces in indentation because I added it to the poll later.  I expect the results would have been much different because it was the people from my office who didn't get to answer and our internal standard is 2 spaces.

For curly braces it was 22 to 13 in favor of braces on a new line.

There were answers for just about every combination, no matter how weird.  For example, some people put spaces outside and in for control statements but no spaces in method calls.

The sample size of this poll is too small to actually mean anything, but it is still interesting that the preferences line up pretty closely with Microsoft's coding style standards.  I didn't verify this, but I wonder if this could be influenced by Visual Studio's default code style settings.

Personally I was very much in the minority here.  For the last five years I've been a spaces outside and in/spaces inside guy as in:
if ( who.Cares( "about coding styles?!" ) )

I've also been a 2 spaces guy and if you go back to college I was a curly braces on the same line proponent.  I started doing the curly braces on a new line when I started full time at my job.  I recently tried curly braces on the same line when I started learning jQuery and I have to admit, I didn't like it anymore.  Could be just because my javascript is still pretty ugly though.

I'm also starting to second-guess the whole 2 spaces thing.  I always preferred it because it made it so you could see more code.  But now that I've embraced the SOLID principles, if the lines of code in my methods were so indented as to cause a problem reading them, I'd suspect a "design" problem with that method.  And I'm starting to think that 4 spaces would make a pretty big readability difference, since it would be much easier to spot where indentations start and end.  I think it's especially important if you put curly braces at the end of the line, or if you're writing Python or Haml.

Finally, I always liked the spaces in control flow because I believed it made it easier to read.  But when I was preparing this poll I wrote the different styles out side by side and I started to wonder if the spaces actually bring out the "noise" of the different characters...  I'm still not sure about this one.

This whole exercise also made me question WHY there is so much possible variation in the languages.  Wouldn't it be nice if the details of the language were defined in such a way that there was one right way to do it and we didn't have to concern ourselves with silly details like where to put spaces?

Monday, April 26, 2010

View-Model design question

Here's a design question for you!

Let's say you are working in ASP.NET MVC 2 (or your favorite MVC web framework).  Let's also say you have a nice rich model.  Your controllers have to fetch the model objects and get that data to your view.  How do you do that?

There are a bunch of ways:
  1. Pass the model directly to the view
  2. Create a "View-Model" class and put a property on it that exposes the model
  3. Create a "View-Model" that completely hides the model behind properties of its own
And there are a bunch of variations on those too.  But those are the main options.

The best thing about #1 is it's as simple as can be.

#2 is almost as simple as #1 but adds the ability for you to create other custom properties on the "View-Model" object that can perform various operations for the view.  For example, you might format values, or retrieve the latest object from a list, etc.

The downside to these two options is your View is directly coupled to your Model.  This might become a problem if you end up with lots of Views that depend on the same Model, or if the Model keeps evolving and being refactored over time.

That's where #3 comes in.  By creating all new properties on the View-Model, you're basically applying the Dependency Inversion Principle and saying, "This view needs this data, I don't care where it comes from as long as someone provides it."  You will now need some form of mapping layer to get the data from the Model to the properties of the View-Model.  This is more work, but it's also nice.  When the Model changes, you only need to update the mapping, which is dramatically easier than digging into lots of HTML and finding what could be many references to your properties.

Now, that said, there are still lots of changes that will require you to make changes to the Model, View-Model, and View.  Any change that alters the *meaning* of the Model will cascade this way.  But there is a whole set of changes that won't cause this update cascade, like any refactoring of the Model.

The obvious downside with #3 is more code and more work (though tools like AutoMapper certainly help).
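
For illustration, here's a minimal sketch of #3 with a hand-rolled mapping (the Book model and view-model are hypothetical):
// The View-Model declares exactly what the view needs, already formatted.
public class BookSummaryViewModel
{
  public string Title { get; set; }
  public string DisplayPrice { get; set; }
}

public static class BookMappings
{
  // The only code that knows about both sides; when Book gets
  // refactored, this mapping changes instead of the views.
  public static BookSummaryViewModel ToSummaryViewModel( this Book book )
  {
    return new BookSummaryViewModel
    {
      Title = book.Title,
      DisplayPrice = book.Price.ToString( "C" )
    };
  }
}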

So, how do you know when to apply which pattern?  Is one pattern always better than the others, or does it depend?  And if it depends, on what?  And how do you know when it's time to switch from one to the other?

Thoughts?  Experiences?

Monday, April 19, 2010

SRP and complexity

The Single Responsibility Principle (SRP) is probably the most important concept of good design.  But even once you know about it, and have read up on it, and seen countless blog articles describe and reference it, you may find yourself hesitant to actually follow it in real life.

The usual argument against it is that it seems like it might increase the complexity of your code.  Let's look at an example of applying SRP to a method.

public void UpdatePrimaryThingStatus( string status )
{
  Thing primaryThing = null;
  foreach( Thing t in something.AllThings )
  {
    if ( t.IsPrimary )
    {
      primaryThing = t;
      break;
    }
  }

  if ( primaryThing != null )
    primaryThing.Status = status;
}
There's nothing _wrong_ with this code, but it doesn't really follow SRP: the method is updating the primary thing's status, as advertised, but it's also finding the primary thing.  Let's factor out the finding of the primary thing into its own method:

public void UpdatePrimaryThingStatus( string status )
{
  var t = GetPrimaryThing();
  if ( t != null )
    t.Status = status;
}

public Thing GetPrimaryThing()
{
  foreach( Thing t in something.AllThings )
  {
    if ( t.IsPrimary )
      return t;
  }
  return null;
}
Notice how much code actually disappeared here.  And notice how simple each method is.  But, we did add a new method to the class.  Do we intend to reuse this method?  That depends.  It IS a useful method that could easily be reused, but since we didn't have it already, let's assume we don't need to reuse it right now.  So yes, we simplified the code in the individual methods, but by adding a new method we've increased the complexity of the class.
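As an aside, and purely a sketch: if you have LINQ available (add a using System.Linq; directive, and assume AllThings is an IEnumerable<Thing>), GetPrimaryThing collapses to a single expression:

public Thing GetPrimaryThing()
{
  // FirstOrDefault returns null when no item matches, just like the loop version
  return something.AllThings.FirstOrDefault( t => t.IsPrimary );
}

The SRP win is the same either way; the method just gets even smaller.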

We should probably ask why adding a new method is a problem.  It's only one new method!  It's well defined with a single responsibility, with an intention revealing interface, and simple code to boot.  Why would we think this is going to increase the complexity of the class?  Probably because we're used to working with classes that are thousands of lines long with lots and lots and lots of methods!  So yes, if you're applying SRP to your methods, but not to your classes, things might get a little complex.  But make sure your classes have a single responsibility, and you'll find that this won't be the case anymore.

OK, so if our classes are following SRP, then we'll be breaking large classes into more, smaller classes.  But now we have lots of classes!  Doesn't that make our code more complex?

This same pattern will follow right up the chain through namespaces and assemblies...  is this getting out of control?  What's the solution?

The solution is cohesion!  You can add lots of small classes as long as they are all part of a cohesive whole.  This is actually a really beautiful thing.  If your classes are well organized, and obviously form a cohesive unit, you get an amazing benefit.  Let's say you need to go into the code and find a bug.  You know what area the bug is in, even if you don't know exactly what it is.  There may be 20 files that make up your code, but you'll probably only need to crack open 3 or 4 of them to find and fix the bug.  And each file you do open will be understandable, dare I say, easily understandable.

You may think there is still a problem in understanding the WHOLE.  To understand how it all works, don't you now need to open all these little classes and figure out how they all work together?  Yes and No.  Again, I think the fact that each class has a clear single responsibility (and therefore an intention revealing interface) means you can actually understand the WHOLE and read less code than if you had it all squished into a single class.

So, whatever you do, don't let fear of complexity drive you away from SRP.

Monday, April 12, 2010

Framework Disease

A lot of software engineers have a Computer Science background.  My college education, for example, included the standard things like Data Structures and "Operating Systems."  It also included some cool things like Artificial Intelligence, Peer to Peer, Automata Theory, and Evolutionary Computation.  I was also fortunate enough to participate in some research projects, which included Mobile Agents, Evolutionary Computation, and Swarm computing.

These things are cool.

But now I spend my time figuring out how to get data from a UI into a database and back again.  That's basically the bottom line of "Enterprise Application Development".  Surprisingly, there are enough challenges in this space to keep you busy for a very long time.  And while the topic itself doesn't have a lot of sex appeal, the work is actually amazingly broad, even just from a technical standpoint.  And once you add the "business" concerns in, it has the potential to become very interesting indeed.

But still I'm a computer scientist, and I'm inexorably drawn by computer science-y problems.  In the Enterprise space, the most computer science-y problems tend to be those of building "frameworks."  And what I mean by framework, is re-usable code bases that developers use to avoid having to write the same (or similar) code over and over again.

Frameworks are to some people as catnip is to cats, or street lights are to moths, or drugs to drug addicts, or cigarettes to smokers, or...  Some people loooooooove building frameworks.  They are always on the lookout for an opportunity to build a framework.  At the first sign of duplication, or a recurring pattern, you can see the light in their eye... framework!

I call this Framework Disease.  Frameworks are tricky.  They can be huge time savers.  And they certainly are fun to work on, since they're so computer science-y.  But at the same time, they can be real time wasters.

Sometimes the problem you are trying to solve with a framework simply isn't worth the time it takes to build the framework.  This could be because all the framework does is replace some standard boilerplate code that could easily be copied and pasted or generated.  In these cases, centralizing the boilerplate can actually be a bad thing because you're forcing every use to be identical forever.  Just because they are the same now doesn't mean they always will be, or always should be.

Other times the framework ends up being written in such a way that it actually becomes a problem.  This can happen when the framework starts limiting what you can do, or when it continues to grow and grow and grow, or when the complexity of the framework obscures the simplicity of the problem being solved.  When this happens you're spending more time working on and fighting with your framework than you are on actually getting things done.

Another problem with frameworks is the tendency to build them too soon.  If you set out to write a framework without having seen plenty of examples of what your framework will be replacing, your framework is probably doomed.  To be really successful, you have to have written the code your framework will replace in a number of different places.  If you don't, you're just guessing about what should be abstracted into the framework.  This means you really don't know what should go in the framework, nor where the framework should be flexible or where it should be rigid.

I have personally fallen into all these traps many times, and just about everyone I know has suffered from a bit of Framework Disease at one time or another.  It is very contagious.

I think Framework Disease is a symptom of not being connected enough with the goals of the development effort.  At Codemash, Mary Poppendieck told a little parable that went something like this:
A philosopher walked into a quarry and saw three people working with pickaxes.  He walked up to the first man and asked him, "What are you doing?"  The man irritably looked up and said, "I'm cutting stone, what the hell does it look like?!"  The philosopher moved on to the second man, asking the same question.  "I'm making a living for my family."  Finally the philosopher asked the third man, who responded, "I'm building a Cathedral!"
The third guy clearly understood the context of his work.  I think a lack of understanding of the context of work is frequently what leads to Framework Disease.  Passionate people in particular are susceptible to this.  Without a broad understanding of why you are doing what you are doing every day, how can you possibly stay focused on the important things?  How can you possibly stay energized?

So if you find yourself exhibiting the symptoms of Framework Disease, step back and ask yourself, "If I'm not building a Cathedral, what am I building?  And does this framework really further that goal?"

Monday, April 5, 2010

Passion

I think there are two qualities that set really great developers apart:
  1. Technical competence
  2. Passion
Pretty much in that order.  If Bob has strong tech skills, it means he can solve complicated problems independently and to a reasonable level of quality.  But if he is lacking passion, it means he won't be looking for opportunities to improve, or to push the envelope on issues like code quality, cleanliness, productivity, etc.

If Bill has a lot of passion, it means he'll be on the lookout for ways to improve.  Both in his own work, and his team's work.  But if Bill doesn't have the technical competence to back it up it means he's simply unreliable.  The impression will be that he pays lots of lip service to quality and improvement but never manages to actually deliver any.

Bennie, on the other hand, might have both of these qualities.  This makes him reliable and constantly improving. And not just improving himself, but improving those around him.  Bennie is the guy who's likely to not only complain about some "policy" his office has that he feels is hurting more than helping, but to actually work on getting that policy changed.  Effectively, Bennie is a leader.

I think people with passion naturally end up leading, regardless of whether they are in "leadership" roles.  You don't have to be the boss to influence how your company works.  And you don't have to be the team lead to influence what technologies get used and how.

However it is this fact that makes passion a double edged sword.

Passionate people are more likely to challenge the status quo.  Which is good.  Unless they are challenging it in ways that actually hurt their team or their company.  This runs along the same lines as some issues I discussed in an earlier post called Engaged Employees.  In a nutshell, if the priorities of the passionate people don't line up with the priorities of the business, you've got trouble ("With a capital T.  And that rhymes with P and that stands for..." Passion?).

When the passionate people become dis-engaged, there are two likely outcomes.  They might start "farting around" with "improvements" that don't actually help the team or the business accomplish any of its goals.  It is probably still true that these "improvements" are "good" in their own context.  But in the context of the business, they might actually be "bad," or possibly just unimportant (and therefore a waste of time).  On the other hand, the passionate people might decide to simply check out.  They might decide to put in the minimal amount of effort possible, with an attitude of "screw those guys."  If just one person was acting in either of these ways, it wouldn't be that big of a deal.  But when it's your passionate people, it can lead to much more trouble.  These are your leaders after all, and their attitude and behavior rub off on everyone else.

A company plagued with dis-engaged passionate individuals (like Bennie) would probably like to trade them all for merely technically competent people (like Bob).  The Bobs wouldn't cause so many problems, wouldn't challenge everything, wouldn't be in such a bad mood all the time; they'd just get their work done.

But we have to ask: what has led the passionate people to be so removed from the goals of the business?  I think there is always a simple one word answer to that question: Management.  The book First Break All The Rules makes the case that a manager's job is simply to bring out the best in his people.  And not the best out of context.  If you're an amazing yo-yo player but you program for a living, it's not your manager's job to bring out your best yo-yoing.  Obviously.  It's his job to bring out your best in the context of the goals of the business (What a Programmer Wants in a Manager).  If your passionate people don't know what the priorities of the business are, or if they can't figure out what they could do today to have the biggest impact on the company, then management has messed up somehow.

But just because management has messed up doesn't mean we get to blame them and call it a day.  It doesn't mean we can just give up, stop trying, decide not to care, or adopt a bad attitude.  That is the path to the dark side.  As Bennies, we have to do what we've always done: Fix it!  This is going to be harder for us than, say, designing a better data layer.  This is now a people problem.  And as technical nerds, we are probably not the best suited individuals to address people problems.  But sometimes we have to embrace the circumstances life throws at us, however uncomfortable they may be, and we have to grow up, step up, and fix it!

Thursday, February 25, 2010

Quality Code

A month ago or so Jeremy D. Miller wrote a blog post where he briefly, but effectively, tackles the issue of why writing good code is important.  I've written about this in the past as well.  Now, I think this is one of those issues that no one would REALLY argue against, but that we all know lots of people don't FULLY agree with or understand.

I don't think anyone would argue that bad code is better than good code. But I think people (developers and business people) misunderstand how important good code actually is. This issue is very near and dear for me because I have quite a history with bad code. Really bad code... So figuring out what good code looks like is more than just an academic exercise for me.

What Jeremy says is,
Let’s be realistic here, you never have the perfect requirements. Your business partners with the vision will need to iterate and refine their vision, and the entire product team is better off if the development team is technically able to efficiently deliver features that weren’t even imagined at project inception. You can succeed with bad code, but all things being equal, I think you maximize your business’s chances of succeeding by taking software quality very seriously.
This is one of those things that is easy to overlook: change.  Especially unexpected change.  The "business people" don't know the difference between expected change and unexpected change.  Even "business people" who used to be really really smart technical people.  Once you're outside the code, you have no idea what magic is needed to make changes.  That's why these business people always come to you with this odd look on their face and ask, "Can we do this??"  Sometimes you look at them like they're crazy and say, "Duh."  But other times you blow up at them, "What?!  That wasn't in the original spec!"  That's why those business people have that funny look on their face, they never know what to expect.  Of course, that's also why some of these business people adopt the attitude of, "I don't want to hear about it, just get it done by tomorrow."
Let me be very clear here, I’m defining software quality as the structural qualities of code structure that enable a team to be productive within that codebase for an extended amount of time.
So that's why code quality is important.  The business people should want you to be writing quality code because it means you can respond to their changes with a good attitude and quick turn around time.  And you should want to be writing quality code because it means you can deal with those changes without going out of your mind, and without adding more and more hacks into your code.

Jeremy's blog post goes on to list some "qualities" of good code.  One of these I think is very important:
Feedback. I think the best way to be successful building software is to assume that everything you do is wrong. We need rapid feedback cycles to find and correct our inevitable mistakes.
He also has a list of links to his MSDN articles on various valuable patterns and principles which you can apply to help keep your code maintainable.

Monday, January 25, 2010

My TDD Struggle

I'm a huge fan of the concept of TDD (Test Driven Development).  I've done it a few times with varying success but I intend to make it a constant practice on anything new I write.  If you want to see people doing it right, go watch some videos at http://katacasts.com/.  Now on to the words!

TDD is the red/green/refactor process.  Write the test, watch it fail.  Go write the bare minimum of code possible to make it pass.  Refactor the code, and refactor the tests.  Repeat.

This process lends itself to what people call "emergent design."  This is the concept that you don't stress out trying to devise some all encompassing design before you begin coding.  You sit down, you write tests, and you let the design emerge from the process.  The reasoning here is that you'll end up with the simplest possible design that does exactly what you need and nothing more.

That point hits home very strongly for me because of my experience with both applications and code that have been over designed and end up causing all sorts of long term problems.  So the call for simplicity is one I am very eager to answer.

BUT.  Clearly you can't just close your eyes and code away and assume it will all work out.  There is an interesting tightrope walk happening here.  As you are coding you have to be constantly evaluating the design and refactoring to represent the solution in the simplest possible way.  But what TDD is really trying to get you to do is not think too much about what is coming next, and instead pass the current test as though there weren't going to be a next test.

It's that "ignoring the future" part that I really struggle with.  The knee-jerk negative reaction is that this will cost you time because you're constantly re-doing work.  There are times when this is probably true, but in general the tests lead you through the solution incrementally, affirming you're on the right track each step of the way.  And when you suddenly discover something that causes you to backtrack, you've got all the tests ready to back you up.

But there are a few things that I don't think this technique is good for.  One is "compact" algorithms; the other is large systems.  We'll take them one at a time.  I was recently practicing the Karate Chop Kata, which is a binary search.  My testing process went like this:

  1. When array is null or empty it should return negative one
  2. When array has one item 
    1. it should return zero if item matches
    2. it should return negative one if item does not match
  3. When array has two items
    1. it should return zero if first item matches
    2. it should return one if second item matches
    3. it should return negative one if nothing matches
  4. When array has three items
    1. ...
Numbers 1-3 were all implemented in the straightforward way you would expect.  But when I got to #4, I had to actually write the binary search algorithm.  I had to decide whether to write it with a loop, with recursion, with some form of "slices", etc.  I also had to figure out what the terminating conditions were and verify that my indexes and increments were all correct.  In other words, I had to do all the work after writing that one test.
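To make that concrete, here's roughly the shape of what that one test forces out of you all at once.  This is a sketch of a loop-based version, not the actual code from my kata:

public static int Chop( int target, int[] array )
{
  if ( array == null )
    return -1;

  // The bounds, the midpoint math, and the termination condition
  // all have to show up in this one step.
  int low = 0;
  int high = array.Length - 1;
  while ( low <= high )
  {
    int mid = low + ( high - low ) / 2;  // written this way to avoid int overflow
    if ( array[mid] == target )
      return mid;
    if ( array[mid] < target )
      low = mid + 1;
    else
      high = mid - 1;
  }
  return -1;  // not found (also covers the empty array)
}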

And worse, these tests are stupid.  What am I going to do, write a test for every array length and every matching index?  I refactored the tests later to be a bit more generic and more specific to the edge cases of the algorithm in question.  If you'd like to see what I ended up with you can check out the code on bitbucket.

In general, writing your tests with knowledge of the implementation you're writing is bad, bad, bad.  As @mletterle reminded me on Twitter, tests should test the behavior of the code, not the implementation of the code.  Bob Martin just recently wrote a post making the same kind of argument with regard to Mocks.

Now don't get me wrong.  The tests are still valuable in this example, they're just not as useful in an "emergent design" kind of way.

Moving on, the second thing that the emergent design mindset isn't very good for is complex system design.  Systems that are complicated enough to warrant DDD (Domain Driven Design).  In this case you really want to step back and take a big picture view and do a real Domain Model.  The emergent design approach may lead to a design with fewer objects or something, but this may not be a good thing if you're interested in a design that excels at communication.

With these systems you'd do your Domain Driven Design first, then drop into TDD and allow it to "emergently" design the actual implementation code of your model.  You're kind of getting the best of both worlds this way.  But it's important to recognize that TDD is still an important part of this, even though you didn't let it guide the ENTIRE design.

So TDD and emergent design might not be the answer in all circumstances.  But I still think that you'll find a strong place for it in even these circumstances.

Anybody in the blogosphere strongly disagree with this?  Or perhaps, dare I ask, agree?

Monday, January 11, 2010

SCM Trade-Offs

Source Control Management (SCM) is, on the surface, a very simple topic.  You put all your source code in one place, then you check in and check out from there.  Advanced systems support merging changes if two people edit the same file.  And there you go, SCM in a nutshell.

And if you are just one person, or a small team, that's probably where the story ends for you.  But with larger teams, or more complicated application environments SCM inevitably morphs into Application Lifecycle Management (ALM).  ALM covers topics from requirements, to architecture, to testing, and release management.

I have found that typically it is ALM requirements that end up introducing branching to your SCM.  Branching is another simple concept.  You make a copy of the source code which can later be "easily" merged back with the original.  Unfortunately branching is another area that quickly gets much more complicated than all that (which I've written about before, in a surprisingly humorous post if I may say so myself...).  As the number of branches increases, the amount of work required to track them and merge between them increases too.

So you will always find that there is a delicate trade-off that must be made when you introduce a branch.  Branches give you isolation, letting people work independently from each other.  There are tons of reasons why this can be helpful, some of which include:
  1. You can work without fear of breaking other people's work
  2. You can try "experimental" stuff which may never actually be finished
  3. You can work on parallel features and release one without releasing the others
  4. You can do new work while still enhancing an old release (ex: service pack releases)
But branches also introduce the need to integrate.  If everyone works on the same branch, every check-in is an integration.  And these happen so frequently it's hard to even think of them as "integration."  But if you introduce long-lived branches, integration can now be delayed indefinitely.  There are plenty of horror stories about companies which tried to delay integration to the very end of application development and then spent almost as long "integrating" as they spent "developing."

This is where Continuous Integration comes from.  The idea is that you want to integrate very often (at least once a day according to Fowler), and that you want integration to not just mean merging code, but also running tests to verify that the integration is successful.  The point is that you want to catch integration issues early.  The reasoning is the earlier you catch them the easier they will be to fix.  The longer you wait to fix them, the more things will have diverged, or be built on code that needs to change.  So integrate often.

And this is where we arrive at a problem I've been struggling with for quite some time.  On the one hand, I want isolation for all the reasons I've already mentioned.  And on the other hand I want to integrate continuously to avoid all the pitfalls mentioned.  But you can't have both.

Naturally I've tried to look at how other people tend to deal with this issue.  There seem to be two big picture approaches:
  1. "Centralized" in which everything starts integrated, and isolation must be purposefully added
  2. "Distributed" in which everything starts isolated, and integration is tightly controlled
All open source projects I've looked at are run very "distributed."  Anyone can download and update the code in their own sandbox.  But to actually get that code into the project you must submit patches, which the project leaders review and apply if they pass their standards.  This works well for Open Source projects because you can't let anybody waltz in and start changing your code.  You need control over what code is being added and modified, and you want it to all be reviewed and tested.

In contrast, most company projects seem to be run "centralized."  Mostly I've found accounts of a main branch with release branches for past releases and maybe a few feature branches.  You can work code reviews into a process like this.  Some systems support preventing check-ins without code reviews.  Other people just make it part of the "process."  But in general, people are trusted to be making the right changes in the right way, so the control required in OSS projects isn't as strictly needed here.  You trust your own employees.

I suppose I should mention quickly that my words "centralized" and "distributed" DO line up with the differences between Distributed and Centralized Source Control Systems (like Mercurial vs. Subversion), but you can still use a Centralized Source Control System and work in a "distributed" fashion by using patches.

There are lots of reasons why you might work one way or the other.  But I still have a hard time figuring out what's right for me.  This could be because my requirements are unusual.  I'm not sure why they would be, but I haven't found other people worrying about them.  This makes me think either my requirements are very special, or they're crazy and I'm missing something.

Briefly, here's what I'm dealing with.  I have a number of different people working on the same application, but they are all working on unrelated or only loosely related features.  We have frequent and regular releases of the application.  We try to target features for the appropriate release, but it's incredibly hard to know if a feature will really be "done" on time.  Features get pushed back when:
  1. They undergo dramatic changes when they reach completion and users try to put them through their paces
  2. The development takes longer than expected
  3. The developer gets pulled off and placed on another feature
  4. The feature gets put on hold for "business reasons"
If everyone is working on all this stuff in the "main" branch and integrating continuously, what do we do when one feature is not ready to release but the rest are?  The changes are all tangled up in the same branch now and there is no easy way to un-tangle them.

The only answer seems to be doing any "non-trivial" work in Feature Branches.

But feature branches pose whole new challenges of their own!  If you're writing a self-contained tool like Mercurial, then feature branches are no problem.  If you're developing a website with a database then feature branches are a bit harder, because you have to create a new database and fill it with test data.  But if you're developing an enterprise application with a complicated database (which can't just be loaded with generated test data) that talks to other databases (with linked servers and service broker) and has a large number of supporting services, many of which depend on OTHER 3rd party licensed services; then what?  In that case, creating the full application environment for each feature branch is actually impossible.  And creating a partial environment is certainly not easy.

But it is precisely because of all of this complexity that we want the isolation!  But the complexity makes the isolation difficult (if it's even possible, which I'm not sure of yet) and time consuming.  I love paradox!

Perhaps you're saying to yourself, "Maybe you wont be able to actually run the app and do everything it can do in your feature branch, but surely the code is written so all these things that depend on other things are nicely abstracted so that you WILL be able to run the unit tests!"  Um, no.  Sorry.  It pains me to admit it, but I'm afraid that isn't the case just yet.

So where does that leave me?  I want to implement continuous integration to reduce the integration issues and make releasing easier.  But I also want the flexibility to develop features independently so they don't prevent other features from being released.  And my environment is so ridiculous that I can't figure out how to set it up so feature branches are fast and easy to create (and, more importantly, useful).

So, can you help me?  Do you have any of these problems?  Have you managed to mitigate any of these problems?  Am I missing something?