Thursday, May 29, 2008

Unit Testing Pitfalls

Unit Testing, and more importantly TDD, is all the rage these days. If you go simply by the noise in the blogosphere, it looks like everyone is doing it. Of course, the reality is probably that almost no one is doing it, but many people think it's a good idea, and some of them wish they could do it.

For example, I have written quite a few posts about Unit Testing, TDD, and related topics like Dependency Injection. I have even done some real TDD on the software I write at work. However, most of what I write, I don't TDD and I don't Unit Test.

Yes, I'm big enough to admit it. I think it's a good idea, and I wish I could do it. But the truth is I don't.

Fortunately, I have reasons. And those reasons are that for all the promise of TDD and Unit Tests, there are a number of Pitfalls.

Almost everything has dependencies, and those need to be mocked/stubbed, and mocking sucks.
Martin Fowler has a great article about Mocks and Stubs if you want to read up. I think that mocking sucks because:
  1. It is a lot of work
  2. It requires intricate knowledge of the internal coding of the thing you're testing
Stubs are slightly better in some ways, in that they don't require quite as much intricate knowledge. But they're more work to write than mocks, they can get super complicated, and they still require a good deal of knowledge of the internal coding.

If mocking sucks, and almost everything needs to be mocked, then almost everything sucks to unit test.
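To make the complaint concrete, here's a minimal hand-rolled mock, sketched in Java rather than .NET for illustration (all the names here, `Database`, `ReportService`, `MockDatabase`, are hypothetical). Notice that the mock has to know the exact query string the service will send, which is exactly the kind of intricate internal knowledge described above:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical dependency interface.
interface Database {
    List<String> query(String sql);
}

// The class under test: joins rows returned by the database.
class ReportService {
    private final Database db;
    ReportService(Database db) { this.db = db; }

    String buildReport() {
        return String.join(", ", db.query("SELECT name FROM users"));
    }
}

// A hand-rolled mock: records the calls it receives so the test can
// verify them. It has to hard-code canned rows AND know which query
// to expect -- internal knowledge of how ReportService works.
class MockDatabase implements Database {
    final List<String> queriesReceived = new ArrayList<>();

    public List<String> query(String sql) {
        queriesReceived.add(sql);
        return List.of("alice", "bob");
    }
}

public class Main {
    public static void main(String[] args) {
        MockDatabase mock = new MockDatabase();
        String report = new ReportService(mock).buildReport();
        System.out.println(report);               // alice, bob
        System.out.println(mock.queriesReceived); // [SELECT name FROM users]
    }
}
```

If `ReportService` ever changes its SQL, this mock breaks, even though the behavior under test hasn't really changed.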

If an interface doesn't exist for your dependency, you have to wrap it.
Say you're testing something that writes to the windows event log. The .NET framework doesn't have an IEventLog interface defined, it just has an EventLog class. So if you want to mock out that dependency with dependency injection, you have to create your own IEventLog. Then you have to create a concrete class that implements IEventLog. Finally, you have to forward every method and property call in the concrete class to the EventLog framework class.

This is no fun to write, and it adds complexity and overhead to your code. Just because you want to test.

Note: Using a dynamic language would remove the need for the interface and therefore make this problem go away.
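Here's a sketch of that wrapping busywork, in Java for illustration (the `EventLog` here is a stand-in for a framework class you can't modify, not the real .NET type, and `IEventLog`, `EventLogWrapper`, and `OrderProcessor` are invented names):

```java
// Stand-in for a concrete framework class we cannot modify
// (playing the role of .NET's EventLog).
class EventLog {
    void writeEntry(String message) {
        System.out.println("EVENT LOG: " + message);
    }
}

// The interface we're forced to invent just to enable testing.
interface IEventLog {
    void writeEntry(String message);
}

// A wrapper whose only job is to forward every call to the real class.
class EventLogWrapper implements IEventLog {
    private final EventLog inner = new EventLog();

    public void writeEntry(String message) {
        inner.writeEntry(message);
    }
}

// The class under test now depends on the interface, so a test can
// swap in a fake instead of touching the real event log.
class OrderProcessor {
    private final IEventLog log;
    OrderProcessor(IEventLog log) { this.log = log; }

    void process(String orderId) {
        log.writeEntry("Processed order " + orderId);
    }
}

public class Main {
    public static void main(String[] args) {
        new OrderProcessor(new EventLogWrapper()).process("42");
    }
}
```

Every method and property the real class exposes needs another forwarding line in the wrapper, and none of it has anything to do with your actual business logic.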

You can't use constructors on your dependencies
Suppose your dependency needs some required information to function and the class you're testing has that information and wants to provide it. Typically you would simply create a constructor on your dependency that took in the required info. Then the class you're testing would new it up and pass in the info. Simple.

You can't do this if you're using Dependency Injection, because the dependency must be an interface, and interfaces can't have constructors; plus, an instance of the concrete class must be passed in to your class's constructor.

To get around this you have to pass in a concrete class that implements the interface. Then you have to send the required info in through properties. Now you have to write the dependency class so that it checks that the required info has been provided before it does anything that requires it. This check will have to go in every public method and possibly some of the properties. Thus the class is more complicated and has more overhead. Just because you want to test.
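A sketch of what that ends up looking like, again in Java with hypothetical names (`DataSource`, `SqlDataSource`, `CustomerRepository`). Note the guard check that now has to run before any real work:

```java
// The dependency's interface: no constructor allowed, so the required
// info has to arrive through a setter after construction.
interface DataSource {
    void setConnectionString(String cs);
    String fetch(String key);
}

class SqlDataSource implements DataSource {
    private String connectionString; // would have been a constructor argument

    public void setConnectionString(String cs) { this.connectionString = cs; }

    public String fetch(String key) {
        requireConfigured();         // guard repeated in every public method
        return key + "@" + connectionString;
    }

    private void requireConfigured() {
        if (connectionString == null)
            throw new IllegalStateException("setConnectionString was never called");
    }
}

// The class under test receives the dependency fully formed, then has
// to push the required info into it.
class CustomerRepository {
    private final DataSource source;

    CustomerRepository(DataSource source, String connectionString) {
        this.source = source;
        source.setConnectionString(connectionString);
    }

    String load(String id) { return source.fetch(id); }
}

public class Main {
    public static void main(String[] args) {
        CustomerRepository repo = new CustomerRepository(new SqlDataSource(), "server1");
        System.out.println(repo.load("42")); // 42@server1
    }
}
```

The dependency now has an invalid "constructed but not configured" state that a plain constructor would have made impossible.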

You can't new-up your dependencies.
Sometimes you may need to new-up a dependency. Then when something changes you may want to new-up a new object to replace the old one. You can't do this if you're using Dependency Injection since you have to pass the dependency in as a fully formed concrete class.

To get around this you're going to have to write the dependency so that it is reusable. Unless you need two instances at the same time. In that case, you'd have to make your dependency into a factory that provided you with instances of your actual dependency. Just because you want to test.
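The factory workaround might look something like this (Java, with invented names). The class under test can still get fresh instances whenever it needs them, but only through yet another injected interface:

```java
interface Parser {
    String parse(String input);
}

// The extra interface: instead of injecting a Parser, we inject
// something that can hand out Parsers on demand.
interface ParserFactory {
    Parser create();
}

class UpperCaseParser implements Parser {
    public String parse(String input) { return input.toUpperCase(); }
}

class Importer {
    private final ParserFactory factory;
    Importer(ParserFactory factory) { this.factory = factory; }

    String importBoth(String a, String b) {
        // A fresh instance per input, with no direct `new` on the
        // concrete class, so a test can inject a factory of fakes.
        Parser first = factory.create();
        Parser second = factory.create();
        return first.parse(a) + "|" + second.parse(b);
    }
}

public class Main {
    public static void main(String[] args) {
        Importer importer = new Importer(UpperCaseParser::new);
        System.out.println(importer.importBoth("ab", "cd")); // AB|CD
    }
}
```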

So far, these pitfalls have all been due to Dependency Injection, which, as I talked about in an earlier post, is powerful but also kind of scary. We might be able to avoid all this injection of dependencies by using a framework like Typemock, but that's not free, and if I recall right, it's not cheap either.

GUIs can't be tested.
It depends on what kind of applications you're writing. For the kinds of apps we work on where I work, the GUIs can be pretty complicated. In fact, usually the GUI is just about all there is to it (aside from retrieving and storing data). We're still writing lots of complicated code which it would be awesome to test, but it's all operating on GUI state.

When people ask me what they can/should Unit Test, I always say, "Find the algorithm." But when the algorithm is "set this value on this field when the user clicks on this, but only when this condition is met; otherwise change which controls are displayed and disable that one," you're pretty much out of luck.

Some things aren't worth testing
If all your class does is order calls to other classes and react to errors, your tests are going to be of limited value. Mainly because you're not testing much. It may be 100 lines of code, but it's really not doing much of anything. No algorithms. And any regressions are likely to be because of changes to the dependencies, not because of changes to that class. So is it worth testing this?

I would love to see a book or article on unit testing address these issues. I mean, who is writing code to transfer funds from one account to another and is dealing with simple objects? Who is writing a Queue? Who is writing a web service to serve a music catalog? These examples may teach the concepts and principles of TDD and Unit Testing, but they don't help me to actually practice it. Am I the only one?

Wednesday, May 21, 2008

How Should I's

There are ten types of programmers: Those that understand binary, and those that don't.

Oh, no, that's not what I meant.

There are two types of programmers: Those that ask, "How do I make this go?" and those that ask, "How should this go?"

I like to call the first group High school programmers. These are the kind of people who when presented with a task take their first idea and start trying to make it work. When it doesn't, they just keep tweaking it until it does work. Then it's "done."

I call them High school programmers because this is what High school students do when presented with an error message. "Oh, that's weird. Well, I'll just change this over here and try it again. That didn't do it? Ok, what about if I do this? Still no? Well what about..." They just care that it works in the end, they don't really care how it works or why it works or why it didn't work in the first place.

The second group are the ones that not only want their code to work, but want it to work the best it can. These people are probably going to consider alternative implementation approaches, designs, and architectures. They're probably going to refactor their code to make sure it's as clean and efficient as possible. They may even go so far as trying different things before making up their mind (prototyping, if you will).

This distinction actually is important. You obviously want the How Should I's working with you and not the How Do I's, simply because their work will be better: cleaner, more maintainable, more bug free, more extensible, and, believe it or not, finished faster.

It's important because this is the quality you're actually looking for. I've read many blogs where people claim that you can't be a good programmer and leave work at 5pm. This is simply ridiculous. Certainly your How Should I's are likely to be more obsessive, and therefore more likely to get caught up and work longer. But there is absolutely no reason why a How Should I can't have a life outside of work, leave at 5pm, and still be a great developer.

I also think that being a How Should I is a necessary condition to qualifying as Smart and Gets Things Done.

And on top of that, being a How Should I is very likely to also make you a top 20%-er.

You can also see this quality in The Pragmatic Programmer's definition of a Pragmatic Programmer,
Tip 1: Care About Your Craft
Tip 2: Think! About Your Work

So if you're interviewing, or reviewing other people's work, or simply working with other developers, this is a quality you should look for and appreciate.

Friday, May 9, 2008

The Agile Customer

One of the main concepts of "Agile Development Methodologies" is that the customer be involved in the entire process. And not only that they are involved, but that they are actually there, alongside the developers, while the software is written.

This is, of course, ridiculous. Most customers don’t actually have that kind of time to spare. So in reality the word "customer" is used to mean "customer representative." That is, someone who should be sufficiently familiar with the real customer to be able to understand their needs and argue for them with the development team.

This concept is a primary ingredient in fulfilling the purpose of an Agile Methodology. That purpose, in my view, can be stated as follows:

Requirements will change and be wrong. We must be able to create excellent software despite those changes.

This is accomplished by designing and developing things in small pieces (in iterations) and by frequently getting those pieces to the "Agile Customer" where we can verify that they work out, and when they don’t work out, we make the required changes immediately.

That, at least, is the idea. But I don’t think it works, and I think the reason it doesn’t work is because this concept of an Agile Customer is a bit unreal. Here’s why:
  1. Your client doesn't have time to be that involved in the development process, and even if they did, they wouldn’t care
  2. The business process is more complicated than a single person can fully understand, even though they think they understand it
  3. A "customer representative" will know even less than the customer
  4. No one is going to get it all right the first time, and they won't notice it's wrong until they try to really use it
  5. In a software demo the client will find about 80% of the trivial problems with the software, but only 2% of the big stuff (missing pieces, incorrect process, bad assumptions, etc)
  6. In a testing environment, the client will find about 80% of the bugs in your software, but only 10% of the big stuff.
Basically, we expect an Agile Customer (who is not just a single person of course...) to know and remember every little detail and to recognize when details are wrong and also when they are missing. Unfortunately, real live people are really bad when it comes to these kinds of details.

The result is your customer representative won’t know all the things they need to know (because it’s impossible for them to). Your client will love your demos but won’t notice all the things about their business that aren’t accounted for (because they didn’t tell you about them because they probably don’t even remember themselves). Your pre-release testing won’t uncover the problems because it only begins to scrape the surface (because it would take too much time and effort to really use the system and not just test the system).

What’s a poor agile methodology to do?

I don’t think you should stop trying to use the “Agile Customer” approach. But you should realize from the beginning that it isn’t going to get you to 100% wonderful working software.

To do that, you’re going to actually have to release it. Often. Even when it doesn’t quite do everything it needs to do yet. Then you’ll see if it’s 100% wonderful and working.

This means you are going to have to transition people to using it, and you’re going to have to train people on it. It’s going to take time and effort. But if you do it correctly, you can save yourself time overall by finding problems earlier and fixing them before it’s too late.

The obvious problem with this is that your users will hate you. You are giving them unfinished software and telling them it’s done. You are making them deal with frequent releases, which means frequent changes in the software, which means they have to keep relearning things you changed, as well as learning new things you’ve added.

However, we can mitigate this by setting the tone.

First off, DON’T tell them it’s done! Even if you think it’s done, it’s not done. So tell them it’s solid, tell them it’s tested, but tell them that you and they need to work together to ferret out anything that may have been missed, or misunderstood, or changed.

Then as problems are found, don’t get annoyed or frustrated. Instead, remember that you were expecting it, and frame it as such. Every problem you find makes the software that much better. So tell the client that, and then keep them up to date on how much better the software is becoming. Show them lists of what was found, when, and when it got fixed and released. If you’re proud of how quickly you’re iterating and improving you can trick them into being psyched about it too, I mean, make them feel psyched about it too.

Then as critical parts of the software “settle down” in release, you can start adding other parts to them. You’re going to have to plan these “parts” intelligently of course. You can’t release the order intake portion of your software without the portion that allows them to view, manage, and/or print the orders… So your parts need to make sense. But you can release the majority of the order stuff and leave out some of the next steps, or additional features. You may find that you don’t need that stuff at all. Or that you had it all wrong. Or, if you’re lucky, that you had it perfect, and then you can go ahead and develop it and release it. Then you’ll discover you had it wrong. But, hey, at least you tried. And at least you got the really important main stuff out the door earlier.

Basically, you’re not doing anything differently than you would have done it before. You’re just doing it with a different expectation and a different attitude. The result will hopefully be rock solid software, happy clients, and best of all, happy developers.

Now we come to the part of the article where I fess up to the fact that while I wrote all this as if I’d been doing it for years and it works perfectly, I actually haven’t had an opportunity to actually try to follow this advice. Instead, this is more the result of the problems I’ve observed and the best ideas for how to solve them I’ve been able to come up with. Thus, if you’ve had different observations, or different ideas for solutions, or if you just completely disagree with me and think I’m a stupid moron who speaks out of turn about things he doesn’t know anything about, I want to hear about it. And that’s what the comments are for.

Monday, May 5, 2008

Dependency Inversion

Dependency Inversion is a design principle which states:
A. High level modules should not depend upon low level modules. Both should depend upon abstractions.
B. Abstractions should not depend upon details. Details should depend upon abstractions.

What in the world is that supposed to mean? Basically, the problem Dependency Inversion is trying to solve is the problem of tightly coupled layers. For example, if you're working with a database you might have tables, and stored procedures, and classes to execute the stored procedures, and classes to use those classes, and finally a GUI. If you make a change to a table, you frequently have to change every single layer above it, all the way up to the GUI... There are many changes for which this is inevitable, but there are some for which it could be avoided. Dependency Inversion is a principle for helping to avoid these kinds of cascading changes by isolating the dependencies between your layers.

Here's a picture:
It's "Dependency Inversion" because we've flipped the direction of the arrow from "high to low " to "low to high". The high layer still uses the low layer, but now the high layer is defining what it expects from the low layer, and the low layer is simply fulfilling those expectations. You can completely rewrite the low layer and as long as it still meets the high level's needs the high level need not change.
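A minimal sketch of the inverted arrangement, in Java with invented names: the high layer declares the `OrderStore` interface describing what it needs, and the low layer merely implements it:

```java
import java.util.ArrayList;
import java.util.List;

// --- high layer: defines the abstraction it needs ---
interface OrderStore {
    void save(String order);
}

class OrderService {
    private final OrderStore store;
    OrderService(OrderStore store) { this.store = store; }

    void placeOrder(String order) { store.save(order); }
}

// --- low layer: a detail that depends on the abstraction ---
// It can be rewritten freely as long as it still satisfies OrderStore.
class InMemoryOrderStore implements OrderStore {
    final List<String> saved = new ArrayList<>();
    public void save(String order) { saved.add(order); }
}

public class Main {
    public static void main(String[] args) {
        InMemoryOrderStore store = new InMemoryOrderStore();
        new OrderService(store).placeOrder("order-1");
        System.out.println(store.saved); // [order-1]
    }
}
```

Swapping `InMemoryOrderStore` for a database-backed implementation wouldn't touch `OrderService` at all.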

There's always a downside, so where's the downside here? The downside is that since the interface is defined at the high level, the low level depends on the high level! What if we want to reuse the low level somewhere else? In a statically typed language like C#, we'd have to reference the high level's dll, even though we weren't going to use it, just because it contains the interface definition.

I don't write much in dynamic languages, so I might be mistaken, but in a dynamic language you could accomplish this same pattern without the interface declaration and without the low level depending on the high level. Less boilerplate + fewer dependency definitions = a more flexible code base. As I understand it, that's the main appeal of dynamic languages.

But back in the statically typed world, how do we use dependency inversion AND manage to reuse the lower layer?

No really, I'm actually asking. It's not a rhetorical question. Hopefully you'll respond in the comments.

One possible way would be to make the low layer define its own interface, then use the Adapter/Proxy design pattern to create a class that implements the high layer's interface but uses the low layer's interface... This is clearly less than ideal, though.
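Here's roughly what that adapter arrangement could look like (Java, hypothetical names): the low layer keeps its own `Logger` interface and stays reusable on its own, while the adapter satisfies the high layer's `AuditTrail` interface in terms of it:

```java
// Low layer: defines and implements its own interface, so it can be
// reused anywhere without referencing the high layer.
interface Logger {
    void log(String line);
}

class ConsoleLogger implements Logger {
    public void log(String line) { System.out.println(line); }
}

// High layer: still defines the interface expressing what it needs.
interface AuditTrail {
    void record(String event);
}

// The adapter bridges the two: it implements the high layer's
// interface by delegating to the low layer's.
class LoggerAuditAdapter implements AuditTrail {
    private final Logger logger;
    LoggerAuditAdapter(Logger logger) { this.logger = logger; }

    public void record(String event) { logger.log("AUDIT: " + event); }
}

public class Main {
    public static void main(String[] args) {
        AuditTrail audit = new LoggerAuditAdapter(new ConsoleLogger());
        audit.record("user logged in"); // prints "AUDIT: user logged in"
    }
}
```

The cost, as noted, is an extra class per dependency whose sole purpose is translation between two nearly identical interfaces.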