kwblog: design


Thursday, April 10, 2014

OOP: You're Doing It Completely Wrong

OOP: You're Doing It Completely Wrong (Stir Trek Edition)

This talk, "OOP: You're Doing It Completely Wrong", was first presented at CodeMash 2.0.1.4 in January to a standing-room-only crowd. It was the second time I was fortunate enough to present at CodeMash, and it was an absolute blast! The feedback I got was really encouraging, and someone suggested I submit it to other conferences and recommended Stir Trek. The video above is the recording of the talk as it was given at Stir Trek 2014 in April (to another standing-room-only crowd!).

Here's the abstract:

Chances are, most of us are primarily writing in Object Oriented Languages. But how many of us are truly doing Object Oriented Programming (OOP)? Objects are a powerful abstraction, but when all we do is write procedural code wrapped in classes we’re not realizing their benefits. That’s the tricky thing about OO, it’s easy to have Objects but still not be doing good OOP. This has led to a plethora of principles and patterns and laws, which are very valuable, but also easy to misunderstand and misapply. In this talk we’ll go back to the foundations of Objects, and take a careful look at what OO is really about and how our principles and patterns fit into the big picture. We’ll see why good OOP is important, and look at the mindset needed to design successful Objects. When we’re done, we’ll have a more nuanced understanding of what good OO is, what it can do for us, and when we should use it.

Between CodeMash and Stir Trek I had the time to really work through and reorganize the details of the talk, so I actually COMPLETELY rewrote it from the ground up for Stir Trek. And I will admit that I'm really proud of the result.

This talk truly represents my (current) understanding of what makes OO powerful and how we should really think about it. It's VERY heavy on research and full of quotes and references. The CodeMash version was even more so. And that reflects my belief that we, as an industry, need to work on being a bit more scientific, especially when it comes to citing our references.

I hope that you enjoy it, and I'd love to hear your thoughts!

Monday, February 11, 2013

POODR's Duck Types

I recently read Practical Object-Oriented Design in Ruby by Sandi Metz. It's a really wonderful book. I can say without any hesitation it has made me a much better Object-Oriented programmer. I honestly wish I could have read this 12 years ago when I was first learning an Object-Oriented language.

Although the book is totally focused on Ruby, the OO practices it presents are easily applicable to other OO languages, including static languages like C#. This makes it one of those timeless books you can be happy to have on your shelf knowing it's not going to be outdated in a year. I highly recommend it!

I intend to write a few posts highlighting some of the good ideas that struck me the most from this book, but in this post I just can't help but take a few shots at its treatment of static vs. dynamic languages.

My programming language lineage started by dabbling in C, then taking classes in C++, followed by Java, and finally C#. Most of the real world code I've written has been in static languages, and I've been programming professionally in C# for the last 8 years. This makes me a static language guy.

When I learned Ruby, about 5 years ago, I fell in love with its clean syntax and amazing flexibility. I wrote a few simple tools in it for work, and I've written a lot of RSpec/Capybara tests, plus I dabbled a bit with Rails. I feel I have a decent understanding of the language, but I'm by no means an expert and I definitely still think in classes and types.

I tell you this to explain where I'm coming from. Static languages are what I know best and are what I'm used to. I'm not a dynamic language hater, I'm just comfortable with static langs. Which brings us back to POODR.

POODR talks a lot about "Duck Types", which are defined in the book as:

Duck types are public interfaces that are not tied to any specific class. These across-class interfaces add enormous flexibility to your application by replacing costly dependencies on class with more forgiving dependencies on messages.

I was surprised at this definition because it describes the "Duck type" as being a thing, but in Ruby there is no thing that can represent this across-class-interface. Most treatments of duck typing from Rubyists I've seen usually just talk about how it's a feature of the dynamic nature of the language. They talk about "duck typing" but not "duck types."

In C# we have interfaces, which can be used as explicitly defined duck types. The Dependency Inversion Principle and the Interface Segregation Principle are both trying to get you to use interfaces in this way, instead of just as Header Interfaces. It's good OO because it focuses on messages instead of types. As POODR says, "It's not what an object is that matters, it's what it does."
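To make that concrete, here's a minimal sketch of an interface used as an explicit duck type. Every name in it (Trip, IPreparer, Mechanic, Driver, TripPlanner) is made up for illustration; none of it is code from POODR.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical data passed to each preparer.
public class Trip
{
    public List<string> Log { get; } = new List<string>();
}

// The "duck type" made explicit: anything that can prepare a trip.
public interface IPreparer
{
    void PrepareTrip(Trip trip);
}

// Two unrelated classes that happen to respond to the same message.
public class Mechanic : IPreparer
{
    public void PrepareTrip(Trip trip)
    {
        trip.Log.Add("bicycles checked");
    }
}

public class Driver : IPreparer
{
    public void PrepareTrip(Trip trip)
    {
        trip.Log.Add("vehicle fueled");
    }
}

public static class TripPlanner
{
    // Depends only on the message (PrepareTrip), not on any concrete class.
    public static void Prepare(Trip trip, IEnumerable<IPreparer> preparers)
    {
        foreach (var preparer in preparers)
            preparer.PrepareTrip(trip);
    }
}
```

TripPlanner never asks what an object is, only whether it can PrepareTrip. And because the interface is written down, renaming PrepareTrip or changing its parameters breaks the build for every implementer until they are all updated.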

I think there is a lot of power in Ruby's implicit "duck types," but I also think the lack of explicit interfaces is a serious liability, and I was very entertained by how many hoops POODR jumps through to try to work around this problem, all while trying to claim that it isn't a problem at all, and in fact, it's great!

At the end of Chapter 5, there's a section that tries to convince you that Dynamic typing is better than Static typing. Unfortunately, it just builds up a straw man version of static typing to make it easier to tear down. What it leaves out is interfaces:

Duck typing provides a way out of this trap. It removes the dependencies on class and thus avoids the subsequent type failures. It reveals stable abstractions on which your code can safely depend.

If static langs didn't have interfaces, this might be true. But they do have interfaces! And worse, interfaces represent a significantly more stable abstraction that is dramatically safer to depend on than these invisible "duck types." POODR demonstrates this itself with examples where the "duck type" interface changes, but not all "implementers" of the interface are updated. There's no compiler to catch this. And standard TDD practices won't catch it either. Your tests will be green even though the system doesn't work. So you have to write manual tests that you can share across all the implementers to make sure the message names and parameters stay in sync. Nearly all of Chapter 9 is devoted to testing practices that simply wouldn't be needed if there were even a rudimentary compiler that could verify just inheritance and interface implementations.

The lack of explicit "duck types" just seems so problematic to me... Keeping them in sync is a chore, and a potential source of error. The worst kind of error too, because the same code may work in one context but break in another based on which "duck type" is used.

Another problem I've run into is when trying to understand some code that takes in a "duck type", how do you figure out the full story of what will happen? How do you find all the implementers of that "duck type"? Just search your code base for one of the method names? Try to find every line of code that injects in a different duck type?

Not being able to surface an explicit interface leaves you stuck in a situation where you have to infer the relationship between your objects by finding every usage of them. Seems like a lot more work, as well as being a recipe for tangled and confusing code.

So what do you think, dynamic language people? Am I making a bigger deal out of the problems of dynamic typing, just as Sandi made a bigger deal out of the problems of static typing? Is this just a lack of experience problem? Do you just not run into these issues that often in real world usage?

UPDATE 2/20/2013:
Here's an interesting presentation by Michael Feathers about the power of thinking about types during design. I felt like it had some relevance to the conversation here.

Friday, December 21, 2012

Slicing Concerns: Implementations

In Slicing Concerns And Naming Them I posed a question about how to go about separating different concerns while still maintaining a clean and relatable code base. Some interesting conversation resulted, and I wanted to follow up by investigating some of the different approaches to this problem that I'm aware of.

Inheritance

public class Task : ActiveRecord
{
  public string Name { get; set; }
  public int AssignedTo_UserId { get; set; }
  public DateTime DueOn { get; set; }
}

public class NotificationTask : Task
{
  public override void Save()
  {
    bool isNew = IsNewRecord;
    base.Save();
    if (isNew)
      Email.Send(...);
  }
}

public class TasksController : Controller
{
  public ActionResult Create(...)
  {
    ...
    new NotificationTask {...}.Save();
    ...
  }

  public ActionResult CreateWithNoEmail(...)
  {
    ...
    new Task {...}.Save();
    ...
  }
}

This works, and the names are reasonable. But of course, inheritance can cause problems... I won't go into the composition over inheritance arguments, as I assume this isn't the first time you've heard them!

Decorator

public class Task : ActiveRecord
{
  public string Name { get; set; }
  public int AssignedTo_UserId { get; set; }
  public DateTime DueOn { get; set; }
}

public class NotificationTask
{
  Task task;

  public NotificationTask(Task t)
  {
    this.task = t;
  }

  public void Save()
  {
    // Check before saving: once the wrapped task is saved it is no longer new.
    bool isNew = task.IsNewRecord;
    task.Save();
    if (isNew)
      Email.Send(...);
  }
}

public class TasksController : Controller
{
  public ActionResult CreateTask()
  {
    ...
    new NotificationTask(new Task {...}).Save();
    ...
  }
}

This is not really the decorator pattern... At least not as defined by the GoF, but I have seen it used this way often enough that I don't feel too terrible calling it that. Really this is just a wrapper class. It's similar to the inheritance approach, except that because it doesn't use inheritance, it leaves us free to use inheritance on the Task for other reasons, and to apply the email behavior to any kind of task.

The naming is a bit suspect, because NotificationTask is not really a task, it just has a task. It implements only one of the task's methods. If we extracted an ITask interface we could make NotificationTask implement it and just forward all the calls. This would make it a task (and a decorator), but would also be crazy tedious.
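To show what that forwarding would look like, here's a minimal sketch of the "real" decorator. It rests on several assumptions: ITask is the hypothetical extracted interface, Task is an in-memory stand-in instead of an ActiveRecord class, and the notify callback stands in for Email.Send(...), whose arguments the example above leaves out.

```csharp
using System;

// Hypothetical interface extracted from Task; not part of the original example.
public interface ITask
{
    string Name { get; set; }
    int AssignedTo_UserId { get; set; }
    DateTime DueOn { get; set; }
    bool IsNewRecord { get; }
    void Save();
}

// In-memory stand-in for the ActiveRecord-backed Task.
public class Task : ITask
{
    public string Name { get; set; } = "";
    public int AssignedTo_UserId { get; set; }
    public DateTime DueOn { get; set; }
    public bool IsNewRecord { get; private set; } = true;

    // A real implementation would write to the database here.
    public void Save() => IsNewRecord = false;
}

// A decorator in the GoF sense: it is an ITask and it wraps an ITask.
public class NotificationTask : ITask
{
    readonly ITask task;
    readonly Action<int> notify; // stands in for Email.Send(...)

    public NotificationTask(ITask task, Action<int> notify)
    {
        this.task = task;
        this.notify = notify;
    }

    // The tedious part: every member is forwarded to the wrapped task.
    public string Name { get => task.Name; set => task.Name = value; }
    public int AssignedTo_UserId { get => task.AssignedTo_UserId; set => task.AssignedTo_UserId = value; }
    public DateTime DueOn { get => task.DueOn; set => task.DueOn = value; }
    public bool IsNewRecord => task.IsNewRecord;

    // The only member that adds behavior.
    public void Save()
    {
        bool isNew = task.IsNewRecord;
        task.Save();
        if (isNew)
            notify(task.AssignedTo_UserId);
    }
}
```

Even with only four properties, the forwarding code outweighs the one method that does something new, which is the tedium in question.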

Service

public class Task : ActiveRecord
{
  public string Name { get; set; }
  public int AssignedTo_UserId { get; set; }
  public DateTime DueOn { get; set; }
}

public class CreatesTask
{
  Task task;

  public CreatesTask(Task t)
  {
    this.task = t;
  }

  public void Create()
  {
    // Creating a task always means saving it and sending the email.
    task.Save();
    Email.Send(...);
  }
}

This service represents the standard domain behavior for creating a task. In an edge case where you needed a task but didn't want the email, you would just not use the service.

The naming is pretty nice here, hard to be confused about what CreatesTask does... However, this path leads to a proliferation of <verb><noun> classes. In the small it's manageable, but as they accumulate, or as they start to call each other things get confusing. For example, if you know nothing about Task and you have to start working on it, would you know you should call the CreatesTask service? Would you know it exists? And would you be sure it was the correct service for you to be calling?

Dependency Injection

public class Task : ActiveRecord
{
  public string Name { get; set; }
  public int AssignedTo_UserId { get; set; }
  public DateTime DueOn { get; set; }

  INotifier notifier;

  public Task(INotifier notifier)
  {
    this.notifier = notifier;
  }

  public override void Save()
  {
    // Check before saving: once the record is saved it is no longer new.
    bool isNew = IsNewRecord;
    base.Save();
    if (isNew)
      notifier.Send(...);
  }
}

public class TasksController : Controller
{
  public ActionResult Create(...)
  {
    ...
    new Task(new EmailNotifier()) { ... }.Save();
    ...
  }

  public ActionResult CreateWithNoEmail(...)
  {
    ...
    new Task(new NullNotifier()) { ... }.Save();
    ...
  }
}

I'm going to ignore all the complexity around the fact that this is an ActiveRecord object which the ActiveRecord framework will usually be responsible for new-ing up, which makes providing DI dependencies difficult if not impossible...

The idea here is to pass in an INotifier, and then when you find yourself dealing with a task you'll build it with the notifier you want it to use. If you want no notification, you use the Null Object pattern and pass in an INotifier that doesn't do anything (called NullNotifier in the code example).
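The example above uses INotifier, EmailNotifier and NullNotifier without defining them. Here's a minimal sketch of what they might look like. The three-argument Send signature is an assumption (the example elides the arguments), and this EmailNotifier only records messages in a list instead of sending real email, so the sketch stays self-contained.

```csharp
using System;
using System.Collections.Generic;

// Assumed shape of the notifier dependency; the example only shows notifier.Send(...).
public interface INotifier
{
    void Send(int userId, string subject, string body);
}

// Records messages instead of sending real email (an assumption for this sketch).
public class EmailNotifier : INotifier
{
    public List<string> Sent { get; } = new List<string>();

    public void Send(int userId, string subject, string body)
    {
        Sent.Add($"{userId}: {subject}");
    }
}

// Null Object: satisfies the interface and deliberately does nothing.
public class NullNotifier : INotifier
{
    public void Send(int userId, string subject, string body)
    {
    }
}
```

Because NullNotifier is a real INotifier, the task never needs an if statement or a null check to decide whether to notify. It just calls Send and nothing happens.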

But this has the ORM-framework drawback I mentioned above. Plus it requires the code that is constructing the task to know what behavior the code that is going to save the task will require. Most of the time that's probably the same code, but if it isn't, you're out of luck.

Operational vs Data Classes

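// Data class: still an ActiveRecord object, which is why TaskInfo.Save() exists.
public class TaskInfo : ActiveRecord
{
  public string Name { get; set; }
  public int AssignedTo_UserId { get; set; }
  public DateTime DueOn { get; set; }
}

// Operational class: owns the "create a task" behavior.
public class TaskList
{
  INotifier notifier;

  public TaskList(INotifier notifier)
  {
    this.notifier = notifier;
  }

  public TaskInfo Create(TaskInfo t)
  {
    t.Save();
    notifier.Send(...);
    return t;
  }
}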

Here I've separated the data class from the operational class. I talked about this in the Stratified Design series of posts. This separation hides ActiveRecord, giving us the control to define all of our operations independently of the database operations they may require. If we needed to save a task without sending an email we could just call TaskInfo.Save() directly from whatever mythical operation had that requirement. Or we could do some extract method refactorings on the TaskList.Create method to expose methods with just the behavior we need. Or we might extract another class. Naming is going to be hard for these refactorings, but at least we have options.

If I missed anything, or if you see an important variation I didn't think of, please tell me about it! As always you can talk to me on twitter, and you can still fork the original gist.

Monday, December 17, 2012

Slicing Concerns, And Naming Them

Naming is hard. Especially in OO. To name something, you have to understand it at its deepest level. You must capture its true essence. This is hard when you're giving a name to a thing that already exists, but it's orders of magnitude harder when you're simultaneously creating the thing out of thin air, and trying to decide what to call it. Which is after all what we do when we're designing code.

The "essence of things" correlates closely with concepts like Separation of Concerns and the Single Responsibility Principle. You can slice any object into ever smaller concerns or responsibilities. You can slice it right down to its constituent atoms! Many design problems, like tight coupling and loss of flexibility, are in large part due to having concerns and responsibilities defined at too high a level. Could this be so common simply because it's so hard to find names for the smaller concepts? It's frequently easy to see what those separate concepts may be, but terribly hard to think what to name them!

Let's have an example:
This is entirely fictional code, but it's not so different from a lot of real code I've seen in the wild. And it illustrates this problem of slicing concerns very well.

At first glance, it seems very simple. The domain has a Task concept which has a default due date (set in the constructor), and which sends a notification email after it's inserted (using an ActiveRecord hook). This very nicely and completely describes what a task is and how it behaves in our system. And the names make it very intuitive.
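The embedded example doesn't appear in this text, so here's a rough reconstruction based only on the description above. Treat all of it as an assumption: the ActiveRecord base class and its AfterInsert hook are a made-up stand-in for whatever framework the original used, and Email here records messages in a list instead of sending anything.

```csharp
using System;
using System.Collections.Generic;

// Minimal stand-in for an ActiveRecord base class (hypothetical API).
public abstract class ActiveRecord
{
    public bool IsNewRecord { get; private set; } = true;

    public void Save()
    {
        bool isNew = IsNewRecord;
        IsNewRecord = false; // a real implementation would write to the database here
        if (isNew)
            AfterInsert();
    }

    // Hook that subclasses can override to run code after the first save.
    protected virtual void AfterInsert()
    {
    }
}

// Stand-in for the Email helper; it records messages instead of sending them.
public static class Email
{
    public static List<string> Outbox { get; } = new List<string>();

    public static void Send(int userId, string subject, string body)
    {
        Outbox.Add($"{userId}|{subject}|{body}");
    }
}

public class Task : ActiveRecord
{
    public string Name { get; set; } = "";
    public int AssignedTo_UserId { get; set; }
    public DateTime DueOn { get; set; }

    public Task()
    {
        // Default due date, set in the constructor: every task is due tomorrow.
        DueOn = DateTime.Today.AddDays(1);
    }

    // Notification email, sent from the hook every time a task is inserted.
    protected override void AfterInsert()
    {
        Email.Send(AssignedTo_UserId, "New Task", "You have been assigned a new task");
    }
}
```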

Or do they? Is it really the case that every single time we insert a task in the database it should send an email? Unlikely. We should slice that behavior out and put it somewhere else:

public class _WhatShouldThisBeCalled_
{
  public void _WhatShouldThisBeCalled_(Task t)
  {
    t.Save();
    Email.Send(t.AssignedTo_UserId, "New Task", "You have been assigned a new task");
  }
}

This is an incredibly simple refactoring, but I have no idea what this class should be called. The method is a bit easier, it could be InsertAndNotify(Task t) or something similar. But what is this class? What concern does it represent?

No, really, I'm actually asking you. What would you call it?

Or how else would you write it? Maybe you'd do something like fire an event and have someone hook it? How would they hook it? Maybe we need an EventAggregator? This is getting awfully complex for such a simple requirement!
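For a sense of how much machinery the event route drags in, here's a minimal sketch of a publish/subscribe aggregator. Everything in it is hypothetical (the TaskInserted event, the EventAggregator API), and it isn't modeled on any particular library.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical event raised after a task is inserted.
public class TaskInserted
{
    public int AssignedTo_UserId { get; set; }
}

// A tiny in-process event aggregator: handlers are registered per event type.
public class EventAggregator
{
    readonly Dictionary<Type, List<Action<object>>> handlers =
        new Dictionary<Type, List<Action<object>>>();

    public void Subscribe<TEvent>(Action<TEvent> handler) where TEvent : class
    {
        if (!handlers.TryGetValue(typeof(TEvent), out var list))
        {
            list = new List<Action<object>>();
            handlers[typeof(TEvent)] = list;
        }
        // Wrap the typed handler so all handlers share one storage type.
        list.Add(e => handler((TEvent)e));
    }

    public void Publish<TEvent>(TEvent evt) where TEvent : class
    {
        // Publishing with no subscribers is silently a no-op.
        if (handlers.TryGetValue(typeof(TEvent), out var list))
        {
            foreach (var handle in list)
                handle(evt);
        }
    }
}
```

The code that saves the task would call Publish(new TaskInserted { ... }), and some other piece of code would have to Subscribe a handler that sends the email. Nothing in the Task class tells you that email exists anymore, which is exactly the complexity being complained about.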

And it's not done, because it's not really so great that it defaults the DueOn date in the constructor. Is every single task really due tomorrow? Or is it just a certain kind of task, or tasks created in a certain way? And where will we put that code, what will it be called?

I sincerely believe this is both a significant design problem, and a significant naming problem. I want to know how you'd tackle it. Please do leave a comment or tell me on twitter or even better, fork the gist on github!

These concerns need to be separate! But what a cost we pay for it! The simple OO domain model of a Task has turned into something much less relatable. Either it's event driven spaghetti code with strange infrastructure objects like EventAggregators. Or it's a hodge-podge of service or command classes, none of which actually model a relatable thing... They only model functions, features, behaviors, use cases. Or maybe we try applying inheritance, and then we end up in a whole different world of confusing names and surprising behaviors.

Can't we do better? Is there some way we can do the slicing of concerns we need but still maintain the modeling of real relatable things? Even if that may require a different way of thinking or not using the design patterns that led us here (Active Record, in this case).

Friday, December 14, 2012

The Fundamental Software Design Problem

The most fundamental software design problem, that is, the most important problem, the one that underlies all design decisions, is:

Choosing the right amount of abstraction

Say you're starting a brand new project that you don't have any previous experience with. What sort of architecture should you apply? You have a lot of choices; some are listed here, in order of increasing complexity:

SmartUI
MVC w/ Active Record
Ports and Adapters
SOA
CQRS

For some problems just a glance is enough to know it needs a more abstract and complex solution. Equivalently, some problems quite clearly should be as simple as possible. But most problems lie somewhere in between. And generally there's really no way up front to know exactly where on the complexity scale it will lie.

Worse still, in a large enough application different portions of the application might be more or less complex. Some areas could be simple crud with no logic, while other areas involve heavy data processing and complex workflow and queries.

And even worser, this is a moving target. If I had a dollar for every time something I thought was pretty straightforward became much more complicated, either because of changing requirements, scope creep, or just misunderstanding... Well, I'd have quite a few dollars!

As I see it, there are basically two strategies for dealing with this problem:

Start as simple as you possibly can, and evolve to more complicated designs as things change
Start slightly more complex than may be strictly necessary so that it's easier to make changes later

I would expect people from the Agile and Lean communities to balk at the very mention of this question. They'd probably bring up stuff like YAGNI and evolutionary design. And I agree with this stuff, I agree with it completely!

But I also think boiling frog syndrome is a real thing. Even a great team with the best intentions can easily find themselves stuck in the middle of a big ball of mud. That's just life. Little things change, one little thing at a time, and you do "the simplest thing that could possibly work" because hey, ya ain't gonna need to do a big overhaul now, this will probably be the last tweak. And next thing you know, everything is a tangled mess and all your flexibility is gone!

To add insult to injury, when you find yourself wanting to do a significant refactoring to a more abstract design, it's frequently your unit tests that are the primary problem spot holding you back. Those same tests that were so useful when you were building the code in the first place are suddenly locking you into your ball of mud.

I can hear you now. You're looking down your nose at me. Huffing and puffing that if I'd had more experience it never would have come to this! If I'd just listened to my tests, the ball of mud wouldn't have happened. If I'd just understood the right way to build software! blah blah blah. Sorry, I don't care. I build real software for real people with a real team, I'm not interested in idealism and fairy tales. I'm interested in practical results! I'm interested in making the correct compromises to yield the best results while constantly striving to do better!

And that's ultimately my point! No matter what design I start out with, I want it to allow me to strive to do better. If the simplest thing that could possibly work is going to be hard to evolve into something more flexible, that's a problem. Accounting for change doesn't necessarily mean doing the simplest thing, in some cases it means doing something a little more complicated, a little more abstract, a little more decoupled, or a little more communicative.

If this ticks you off, please come argue with me on twitter!

Monday, December 3, 2012

Stratified Design

The last post ended by presenting a style of OO in which the objects only exposed operations which communicated via data classes. We arrived at this design by thinking more deeply about encapsulation. I asserted that there were a number of benefits to an object structured this way, but promised to also talk about the architectural benefits of applying this practice throughout your code.

A Stratified Design means writing all of our objects in this behavior-only style, and passing data classes between them. There are lots of detailed decisions to be made around what exactly those data classes should contain, but for now I'm going to stay at a higher level.

You may be thinking to yourself, "Hey! Those are just layers! I've been duped, there's nothing novel or interesting in this except a fancier name!" While what I'm advocating here is very similar to the traditional layered architecture there are some very critical differences.

Some definitions of layers include a restriction that lower layers may not reference higher layers. When applied to domain modeling this tends to lead to ridiculous restrictions where your data layer is not allowed to return domain objects because the domain is a higher level concern than the database. A stratified design agrees that lower strata should not use behaviors of higher strata, but that doesn't mean they can't share the same data classes!
Layered architectures usually prescribe an exact number of layers for specific purposes. In a stratified architecture, there aren't any consistent named layers. Instead, there's just a series of classes calling into each other as needed. Each of those classes is defined at some level of abstraction, and calls into its dependent layers as needed. And that's as prescriptive as it gets.

There are a lot of awesome things that follow from having objects calling other objects in this way:

Decoupled: clean interfaces communicating with data is about as decoupled as you can get (and therefore insanely easy to unit test!)
Simple, not complected: each object knows only about the interface of its dependencies. And it accepts small data classes, and outputs small data classes. There's no static or global knowledge, no god objects. And each object represents one concept defined at a consistent level of abstraction.
Behavior only where needed: the simplest example is you will only be passing small data classes into your view, not ActiveRecord objects (which expose query and data persistence behavior).
Somewhat side-effect free: not entirely side-effect free, but because there is little to no shared state, it's difficult to be surprised by a side effect.
Intuitive: If you do a good job separating your levels of abstraction, you will find that when you are looking for something, it's always right where you expect it. Or if it's not right there, it's one explicit function call away. Contrast this with OO designs that are riddled with inheritance and misapplied strategy and state patterns... Or compare it to "light weight" Active Record based designs where logic might be in an AR hook, or might be in a service class, or might be in a controller...

The inspiration for this Stratified Design came from a number of different sources. But the primary ones were:

Rich Hickey's Simple Made Easy and The Value of Values
Bob Martin's Architecture the Lost Years
David West's book Object Thinking (which I reviewed before)

These don't spell this out exactly, but they contributed certain concepts. And as always, remember that "architecture" is dangerous. Lots of people might be excited about CQRS, but that doesn't mean it should be applied to a mostly read only content management site. And Rails might be an efficient platform for quickly building a web app, but that doesn't mean it's right for building a space shuttle control panel. And the same goes for Stratified Design. The architecture of your application should reflect the nature of your application.

But that said, if you give Stratified Design (or something similar) a try, I'd love to hear about your experiences with it!

Wednesday, November 14, 2012

Encapsulation: You're doing it wrong

In the last post, I investigated just what the devil encapsulation actually is. I may not have answered that question, but I did decide that whatever it means, there's a subtle but important distinction to be made around encapsulating "data". The example that launched that distinction was a Queue which stores data from the caller in some encapsulated implementing data structure. Notice the distinction between the caller's data, and the Queue's implementation.

One way of approaching a new OO design in a "business" environment is to ask, what data do I have? Then create a "model", and add a property for each data element. C#'s { get; set; } properties highly encourage this, and ORM and ActiveRecord tools require it. So now we have little data classes, structures basically. But we know that we're not doing OO unless we're doing encapsulation, and that means we need some methods! So we add some methods to our little data classes that usually either modify that data in some way, or perform some calculations with it.

But what is this class encapsulating? All the data is fully exposed, and the methods are restricted to simple operations on the same data. Clearly it's trying to represent something, but we started from some data which more than likely corresponds directly to a database table. So what is it representing? At best, one data thing. And what is it encapsulating? Some logic about that data.

But looking at this again from the perspective of encapsulation as bundling implementation details instead of data, we could go a different route. When thinking about a Queue, I don't think about its internal implementing data structure. I think about the operations I want it to perform for me. So instead of asking, "what data do I have?", "what operations do I need to perform?" could be a better starting point.

What if all the properties were moved off the object onto their own little class -- or structure -- or in F#, record? The original object would then be left with operations only. And one of those operations would have to be getting the data, and that would just be a simple method that returned the little data class/structure/record. This class is encapsulating the implementation details of those operations you decided you needed to perform! And just like the Queue, there is now a clear distinction between the caller's data and the class's implementation.
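Here's a minimal sketch of that split. The names (CustomerInfo, Customers) and the operations are invented for illustration, and the dictionary stands in for whatever storage a real class would hide.

```csharp
using System;
using System.Collections.Generic;

// Data only: this is the caller's data, with no behavior attached.
public class CustomerInfo
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
    public string Email { get; set; } = "";
}

// Operations only: how customers are stored is an implementation detail.
public class Customers
{
    // Hidden implementation; a dictionary here, but it could just as well be a database.
    readonly Dictionary<int, CustomerInfo> rows = new Dictionary<int, CustomerInfo>();
    int nextId = 1;

    public CustomerInfo Register(string name, string email)
    {
        var row = new CustomerInfo { Id = nextId++, Name = name, Email = email };
        rows[row.Id] = row;
        return Copy(row);
    }

    // "Getting the data" is just another operation that returns the data class.
    public CustomerInfo Get(int id)
    {
        return Copy(rows[id]);
    }

    public void ChangeEmail(int id, string email)
    {
        rows[id].Email = email;
    }

    // Hand out copies so callers never hold a reference to internal state.
    static CustomerInfo Copy(CustomerInfo c)
    {
        return new CustomerInfo { Id = c.Id, Name = c.Name, Email = c.Email };
    }
}
```

Returning copies is one way to keep the line between the caller's data and the class's implementation sharp: changing a CustomerInfo you were handed has no effect until you go back through an operation such as ChangeEmail.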

A number of interesting benefits follow from this:

Enables a coarser grained interface, which is especially useful for data access. You gain the control to define operations to retrieve as little or as much data as you need.
Designing around encapsulating implementation details leads to objects that are well defined with intuitive behaviors and clear purpose. Ultimately that means it's easier to find the behavior you want, and extend behavior when needed.
The resulting clean behavioral interface, passing and returning data, immediately results in simple and flexible decoupling, which is great for unit testing.

And these are just the benefits realized at the level of the one class we modified. In the next post I want to look at what happens when this architecture is applied throughout, in what I call a stratified design.

Tuesday, November 13, 2012

Encapsulation: What the devil is it?

I love the word 'Encapsulation.' It's a big fancy word and I feel smart when I use it. Unfortunately, I'm not really sure what it means, and neither is Wikipedia. "Encapsulation is to hide the variables or something inside a class." I lol'd when I read "or something," what a specific definition! So, what the devil is it?

The most naive OO definition might be:

A language feature that bundles data and methods together.

You might extend that to say that it hides the data from public consumption, but that part muddies up the water, as the Wikipedia article demonstrates. My favorite example of Encapsulation is a Queue class. You get push, pop, and peek operations to call, but you don't know what data structure the class uses to implement those operations. It could be an array, it could be a linked list, whatever. In this we can easily see the beauty and the power of encapsulation: "data" and "methods" together.

But wait, what did I actually encapsulate in that queue? Was it the "data"? I pass my data into and out of the Queue, and the Queue hides its implementing data structure from me. Maybe it's a Queue<string>, and I'm all:

q.Push("encapsulation");
q.Push("is");
q.Push("about");
q.Push("data");
Assert.AreEqual("encapsulation", q.Pop());

For me the data are those strings, but those strings are clearly not what the Queue is encapsulating!
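To make the distinction concrete, here's a minimal sketch of such a queue. LinkedQueue<T> is a made-up name, and the choice of a linked list as the backing structure is arbitrary, which is the whole point.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical queue with the Push/Pop/Peek operations described above.
public class LinkedQueue<T>
{
    // The encapsulated implementation detail. Callers never see this list,
    // so it could be swapped for an array without changing any calling code.
    readonly LinkedList<T> items = new LinkedList<T>();

    public int Count
    {
        get { return items.Count; }
    }

    // Adds an item to the back of the queue.
    public void Push(T item)
    {
        items.AddLast(item);
    }

    // Returns the item at the front of the queue without removing it.
    public T Peek()
    {
        if (items.First == null)
            throw new InvalidOperationException("The queue is empty.");
        return items.First.Value;
    }

    // Removes and returns the item at the front of the queue.
    public T Pop()
    {
        T item = Peek();
        items.RemoveFirst();
        return item;
    }
}
```

The strings that go in and come out belong to the caller. What the class keeps to itself is the LinkedList<T> and the bookkeeping around it.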

That word "data" in our definition is a tricky one. It can be applied to too many things to be really useful. But does replacing "data" with "data structure" in the definition fix the problem?

A language feature that bundles data structures and methods together.

It clears it up, but it introduces another problem. For example, what if my object is a database gateway? Certainly there's a data structure somewhere in that database, but my object isn't directly encapsulating that! No, it's probably "encapsulating" ADO.NET procedural calls, or some other data access library. The procedural calls are neither data nor data structure... So could it be that thinking about data is completely misleading?

A language feature that bundles implementation details and methods together.

This is a rather large step though! Instead of just talking about data, or data structures, this now includes just about anything in the definition of things that can be bundled with methods to achieve encapsulation! Maybe there is some value in restricting what the word "encapsulation" applies to, but if there is, I doubt it's something that is going to prove useful for Software Engineers. So while I admit this definition could be a perversion of the word "encapsulation," I find it more useful.

The other definition Wikipedia gives for encapsulation, which I've neglected to tell you about until now, is "A language mechanism for restricting access to some of the object's components." This is more similar to the definition I just ended on. I take some issue with the word "restricting" and the word "components" is ambiguous enough to be a problem. But I don't think it's a stretch to think "components" could include both data and dependencies.

So, perhaps we've arrived at a better understanding of encapsulation. One that recognizes that data is not the all important concept. The next step I'd like to take is to extend this slightly deeper understanding outside the realm of data structures and into more typical "business" scenarios. That will be the next post.

Wednesday, June 20, 2012

Don't trust your instincts

Ours is a young discipline with no rules or laws except the ones we choose for ourselves. We are still living in the wild wild west of software development. There is no exam to pass to become a software developer. There are no standard evaluation procedures, or checks and balances. If you can make your software execute, you can install it, host it, and sell it. This is one of the things I find very attractive about our industry, but also occasionally frustrating.

There have been many attempts to standardize engineering in the form of processes and methodologies. But process is only a small -- and very uninteresting -- portion of building software. Code itself is far more challenging, interesting, and diverse. But it is very lacking in recognized rules or techniques or principles or even just ideas. Certainly there are some code principles and practices, but how successful or accepted are they?

It's hard to get a true sense of the opinions and practices of our industry, but there is clearly a very vocal minority that eschews "software engineering" practices in favor of a loosely defined aesthetic. I'll use "software engineering" as a label for structured principles, patterns, and practices. For example, consider the Gang of Four's design patterns, or Bob Martin's SOLID principles. But the vocal minority, which seems to me at least to be getting increasingly vocal these days, would argue these concepts (patterns and principles) are more harmful than helpful. That a better approach is to simply take the time to feel the pain in your code, and adjust, rewrite, and refactor as needed.

A really solid example of this argument being made can be heard in this Ruby Rogues podcast interview of DHH. If you stick with it, the conversation covers a lot of really interesting topics including how DHH applies this thinking to Rails and Basecamp, YAGNI, thoughts on education and the necessity of stubbing your toe to learn, and more. (Thanks to Lee Muro for referring me to that podcast.)

I agree that stubbing your toe is a good teacher, but I don't think it's the only way to learn. I agree that abstract concepts are easy to overuse and misapply, especially after first learning about them, but I don't think that's inevitable. While I find the refactoring and continuous learning part of this attitude very pragmatic, there is one element I do disagree with: the idea that we don't need abstract rules and principles and guidance and science. That all we need is our sense of aesthetic. The idea that by simply looking at some code, maybe comparing it to a different version, you can derive an intuitive understanding of which code is better.

I don't buy this, because I don't think that's how humans work, as outlined by Malcolm Gladwell's book Blink and this article by Jonah Lehrer. I recommend them both, but if you're short on time, just read the Jonah Lehrer article as it's short and the most directly relevant.

Blink is all about the influence our subconscious mind has on us. We like to think that we are rational and in full conscious control of what we do and what we think. But Blink has plenty of research to prove that this simply is not the case. We depend on our subconscious to make snap decisions and influence our general mood and thoughts much more than we realize. And Blink goes to great lengths to present the fact that this can be both very powerful and harmful. Your mind is capable of "thin slicing" a situation, pulling out many relevant factors from all the thousands of details, and coming to a conclusion based on those details. But, not surprisingly, you need both extensive practice AND exposure to all the needed factors for this to work. And it's worth mentioning that even when it does work, your conscious mind may never understand what it was your unconscious did to come to its conclusion!

You might read that and think, "Experts can use their unconscious to recognize good and bad code, the vocal minority is right!" I believe that is true, but only on a local level. When you look at code, you are always drilled in to the lowest level. I think you could intuit a fair amount at this level, but it's the higher concepts that have the larger influence, and I'm not sure you can effectively thin slice that stuff. Many of the concepts of good architecture are about higher level structure: low coupling, high cohesion, SRP, ISP, DRY. But if I showed you one code file and asked you to tell me if the application suffered from coupling issues, you wouldn't be able to say. And that's because you haven't been provided with enough information. And without that information, how can you possibly thin slice your way to an intuitive understanding of good code? I worry that a focus on "aesthetic" and "elegance" leans too heavily on this intuitive feel for code, and carries a serious risk of leading you down a path that feels smooth and easy, but ultimately leads straight into the woods.

But I would take this argument even further. Jonah Lehrer's article tells a story of a psychology experiment that went something like this. Study participants were shown two videos, both showed two different sized balls, one larger than the other, falling toward the ground. In one video the balls hit the ground at the same time, and in the other the larger ball hit the ground first. The participants were asked which video was a more accurate representation of gravity.

And the answer is: the video where they hit the ground at the same time is the correct one. This is not intuitive; most of us would expect the larger ball to hit first. So the way the world actually works comes as quite a surprise. But where this gets interesting is in the second part of the study. This time, the participants were all physics majors, who had studied this and learned the correct answer. The participants' brains were being monitored with an fMRI machine, and what the researchers discovered is that in the non-physics majors a certain part of the brain associated with detecting errors was lighting up: the "Oh-shit! circuit," as Jonah calls it. When they saw the video of the balls hitting the ground at the same time, their brains raised the bullshit flag. So what was different with the physics majors that allowed them to get the right answer?

But it turned out that something interesting was happening inside their brains that allowed them to hold this belief. When they saw the scientifically correct video, blood flow increased to a part of the brain called the dorsolateral prefrontal cortex, or D.L.P.F.C. The D.L.P.F.C. is located just behind the forehead and is one of the last brain areas to develop in young adults. It plays a crucial role in suppressing so-called unwanted representations, getting rid of those thoughts that aren’t helpful or useful.

This other section of the brain allows us to override our intuitive primal expectations, the Oh-shit! circuit, and replace them with learned ones. But in order for this circuit to work, you must have studied and learned the material! Which requires that there be something to learn!

The connection to the aesthetic instinctive approach to software should be pretty clear. If you shun what "science" our industry has to offer, however admittedly weak and young it may be, you're not training your brain to suppress the intuitive but worse-for-you-in-the-end code!

So I think it's important to be cautious when relying on your intuition and sense of aesthetic, especially in an industry as young as ours with so little widely accepted guidance. We need to follow that pragmatic approach of continuing to learn, but at the same time we have to continue to question our intuition. And just as important, we should take the science/engineering of our industry seriously, even while recognizing its limitations.

Software is hard, be careful how much you trust your instincts!

Monday, March 12, 2012

Simple Made Easy

Simple Made Easy
"Rich Hickey emphasizes simplicity’s virtues over easiness’, showing that while many choose easiness they may end up with complexity, and the better way is to choose easiness along the simplicity path."

I absolutely recommend you take the hour to watch this presentation. It's pretty easy viewing, he's funny, and I found it very influential.

Highlights
"Your ability to reason about your program is critical to changing it without fear." This has been something I've firmly believed for a very long time, but I love how succinctly Hickey puts it here. He even has the courage to challenge the two most popular practices of Software Engineering today: Agile, and TDD. For Agile, he's got this line: "Agile and XP have shown that refactoring and tests allow us to make change with zero impact. I never knew that, I still do not know that." Agile is supposed to make the fact of change one of the primary motivators behind how the project is run, but it doesn't really make applying that change any easier in the code... For TDD he has this wonderful quip:

"I can make changes 'cause I have tests! Who does that?! Who drives their car around banging against the guard rails saying, 'Whoa! I'm glad I've got these guard rails!'"

He calls it guard rail programming. It's a useful reminder that while tests are definitely valuable, they can't replace design and thoughtful coding.

Another very enlightening comment he made had to do with the difference between enjoyable-to-write code and a good program. This rang very true with me, probably because of all the Ruby bigots these days who are obsessed with succinct or "beautiful" code, but are still writing big balls of mud. Hickey basically said he doesn't care about how good of a time you had writing the program. He cares about whether its complexity yields the right solution, and whether it can be reasoned about and maintained.

Which leads to another concept he brings up: Incidental Complexity vs. Problem Complexity. The argument is that the tools you choose to use in your software can bring along extra complexity that has nothing whatsoever to do with the actual problem your program is supposed to solve.

Hickey Says I'm Wrong
I just wrote a series of posts where I was attempting to question some of the assumptions behind many of what are commonly considered good design practices in static object-oriented languages today:

I covered a lot of stuff in that series. One of the things I was really challenging is the practice of hiding every object behind an interface. I argued this indirection just made things more complicated. At about 50 minutes in, Rich Hickey says every object should only depend on abstractions (interfaces) and values. To depend on a concrete instance is to intertwine the "What" with the "How," he says. So, he's saying I'm wrong.

I also talked about how Dependency Injection is leaky and annoying. But Rich Hickey says you want to "build up components from subcomponents in a direct-injection style, you want to, as much as possible, take them as arguments", and you should have more subcomponents than you probably have right now. So, yeah, I'm wrong.

I didn't actually blog about this one, but I've certainly talked about it with a lot of people. I've been a proponent of "service layers" because I want my code to be as direct as possible. I want to be able to go one place, and read one code file, and understand what my system does. For example if I send an email when you create a task, I want to see that right there in the code. But Hickey says it's bad to have object A call to object B when it finishes something and wants object B to start. He says you should put a queue between them. So, wrong again!

I'm also a proponent of Acceptance Test Driven Development (ATDD) and writing English specs that actually test the system. Hickey says that's just silly, and recommends using a rules engine outside your system. :(
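Here's roughly what I understand "put a queue between them" to mean, as a sketch. All the names (TaskCreated, TaskCreator, EmailNotifier) are made up for this example, and an in-memory Queue<T> is standing in for whatever real queue you would use.

```csharp
using System.Collections.Generic;

// Hypothetical message placed on the queue when a task is created.
public class TaskCreated
{
    public TaskCreated(string title) { Title = title; }
    public string Title { get; }
}

// Object A: creates tasks and announces it on the queue.
// It holds no reference to whatever sends the email.
public class TaskCreator
{
    private readonly Queue<TaskCreated> outbox;

    public TaskCreator(Queue<TaskCreated> outbox) { this.outbox = outbox; }

    public void Create(string title)
    {
        // ... saving the task would happen here ...
        outbox.Enqueue(new TaskCreated(title));
    }
}

// Object B: drains the queue and records an "email" for each message.
public class EmailNotifier
{
    public List<string> Sent { get; } = new List<string>();

    public void Drain(Queue<TaskCreated> inbox)
    {
        while (inbox.Count > 0)
        {
            TaskCreated message = inbox.Dequeue();
            Sent.Add("New task: " + message.Title);
        }
    }
}
```

Notice what got lost: reading TaskCreator.Create no longer tells you an email goes out. That's exactly the directness I was arguing for, traded away for decoupling.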

And finally, and this is the biggest one, he says:

"Information IS simple. The only thing you can possibly do with information is RUIN it! Don't do it! We got objects, made to encapsulate IO devices. All they're good for is encapsulating objects: screens and mice. They were never supposed to be applied to information! And when you apply them to information, it's just wrong. And it's wrong because it's complex. If you leave data alone, you can build things once that manipulate data, and you can reuse them all over the place and you know they are right. Please start using maps and sets directly."

Um, yeah, ouch. I'm an object oriented developer. I read DDD and POEAA three years ago and got really excited about representing all my information as objects! We extensively prototyped data access layers, Entity Framework and NH chief among them. We settled on NH. Worked with it for a while but found it too heavy handed. It hid too much of SQL and clung too much to persistence ignorance. But I couldn't really understand how to use a Micro-ORM like Massive (or Dapper or PetaPoco) because I was too hung up on the idea of Domain Objects. So we spiked an ORMish thing that used Massive under the covers. It supported inheritance and components and relationships via an ActiveRecord API. It gave us the flexibility to build the unit testing I always wanted (which I recently blogged about). It is still working quite well. But it's information represented as objects. So it's wrong...

In case you didn't pick up on it, Rich Hickey wrote Clojure, a functional language. I don't know anything about functional programming. I've been meaning to learn some F#, but haven't gotten that into it yet. So it doesn't really surprise me that Hickey would think everything I think is wrong. Functional vs. OOP is one of the biggest (and longest running) debates in our industry. I think it is telling that I've felt enough pain to blog about lots of the things that Hickey is talking about. But I don't find it disheartening that his conclusions are different than mine. It is possible that he is right and I am wrong. It is also possible that we are solving different problems with different tools with different risks and vectors of change and different complexities. Or, maybe I really should get rid of all my active record objects and just pass dictionaries around!

In any case, this certainly was a very eye opening presentation.

Monday, February 20, 2012

Meaning Over Implementation

Conveying the meaning of code is more important than conveying the implementation of code.

This is a lesson I've learned that I thought was worth sharing. It may sound pretty straightforward, but I've noticed it can be a hurdle when you're coming from a spaghetti code background. Spaghetti code requires you to hunt down code to understand how the system works because the code is not 'meaningful.'

There are really two major components to this. The first is, if you're used to spaghetti code, you've been trained to mistrust the code you're reading. That makes it hard to come to terms with hiding implementation details behind well named methods, because you don't trust those methods to do what you need unless you know every detail of HOW they do it.

The second is in naming. Generally I try to name methods so that they describe both what the method does as well as how it does it. But this isn't always possible. And when it's not possible, I've learned to always favor naming what the method is for as opposed to how the method is implemented.

There are a number of reasons why this works out. The first is that issue of trust. If your code base sucks, you're forced to approach every class and every method with a certain amount of mistrust. You expect it to do one thing, but you've been bitten enough times to know it's likely it does some other thing. If this is the kind of code base you find yourself in, you're screwed. Every programming technique we have at our disposal relies on abstraction. And abstraction is all about hiding complexity. But if you don't trust the abstractions, you can't build on them.

The other reason this works is the same reason SRP (the Single Responsibility Principle) works. You can selectively pull back the curtain on each abstraction as needed. This allows you to understand things from a high level, but then drill down one layer at a time into the details. I like to think of this as breadth-first programming, as opposed to depth-first programming.

So when you're struggling with naming things, remember your first priority should be to name things so that others will understand WHAT it does. They can read the implementation if they need to understand HOW it does it.
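A small sketch of the difference, using a made-up Account class (the class, the 90 day rule, and both method names are hypothetical and exist only to show the contrast):

```csharp
using System;

public class Account
{
    public Account(DateTime lastLogin) { LastLogin = lastLogin; }

    public DateTime LastLogin { get; }

    // Named for HOW: the reader still has to work out why 90 days matters.
    public bool LastLoginMoreThan90DaysBefore(DateTime now)
    {
        return (now - LastLogin).TotalDays > 90;
    }

    // Named for WHAT: callers read the meaning, and the rule behind it
    // can change without renaming anything they depend on.
    public bool IsDormant(DateTime now)
    {
        return LastLoginMoreThan90DaysBefore(now);
    }
}
```

Code that says `if (account.IsDormant(today))` reads at the level of the business rule. Anyone who needs the HOW can open the method and find the 90 days.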

Friday, February 17, 2012

Abstraction: blessing or curse

Well this has been a fun week! It's the first time I've ever done a series of posts like this, and all in one week. But now it's time to wrap up and then rest my fingers for a while.

All of this stuff has been about dealing with abstraction. Seriously, think about what you spend most of your time doing when you're programming. For me, I think it breaks down something like this:

50%: Figuring out why a framework/library/control/object isn't doing what I wanted it to do
30%: Deciding where to put things and what to name them
20%: Actually writing business logic/algorithms

So, yeah, those numbers are completely made up, but they feel right... And look at how much time is spent just deciding where things should go! Seriously, that's what "software engineering" really is. What should I name this? Where should I put this logic? What concepts should I express? How should things communicate with each other? Is it OK for this thing to know about some other thing?

This is abstraction. You could just write everything in one long file, top to bottom, probably in C. We've all been there, to some degree, and it's a nightmare. So we slice things up and try to hide details. We try to massage the code into some form that makes it easier to understand, more intuitive. Abstraction is our only way to deal with complexity.

But abstraction itself is a form of complexity! We are literally fighting complexity with complexity.

I think this is where some of the art of programming comes into play. One of the elements of beautiful code is how it's able to leverage complexity but appear simple.

All of our design patterns, principles, and heuristics (from Clean Code) can largely be viewed as suggestions for "where to put things and what to name them" that have worked for people in the past. It's more like an oral tradition passed from generation to generation than it is like the strict rules of Civil Engineering. This doesn't make any of these patterns less important, but it does help explain why good design is such a difficult job, even with all these patterns.

In this series of posts I've blabbed on about a number of different design topics, each time almost coming to some kind of conclusion, but never very definitively. That's because this has been sort of a public thought experiment on my part. I'm just broadcasting some of the topics that I have been curious about recently. And I think where I would like to leave this is with one final look at the complexity of abstraction.

Abstraction, by definition, is about hiding details. And anyone who has ever done any nontrivial work with SQL Server (or any database) will understand how this hiding of details can be a double edged sword. If you had to understand everything in its full detail, you could never make it work. But sometimes those hidden details stick out in weird ways, causing things to not work as you were expecting. The abstraction has failed you. But this is inevitable when hiding details: at some point, one of those details will surprise you.

Indirection and abstraction aren't exactly the same thing, but they feel very related somehow, and I think Zed Shaw would forgive me for thinking that. Indirection, just like abstraction, is a form of complexity. And just like abstraction, it can be incredibly useful complexity. I assert that every time you wrap an implementation with an interface, whether it's a Header Interface or an Inverted Interface, you have introduced indirection.

In Bob Martin's Dependency Inversion Principle article, he lays out 3 criteria of bad design:

Rigidity: Code is hard to change because every change affects too many parts of the system
Fragility: When you make a change, unexpected parts of the system break
Immobility: Code is difficult to reuse because it cannot be disentangled from the current application

I think there is clearly a 4th criterion, which is Complexity: code is overly complicated, making it difficult to understand and work with. Every move in the direction of the first three criteria is a move away from the fourth. Our challenge then, indeed our art, is to find the sweet spot between these competing extremes.

And so I ask you, is it a beautiful code type of thing:

To wrap every concept behind abstract objects (objects or data structures)?
To represent every behavior as a method of some stateful thing (service layers or oop)?
To hide every class behind an indirection in the form of an interface (header interfaces or inverted interfaces)?
To make every class reusable and pluggable (DIP: loose or leaky)?

Thursday, February 16, 2012

DIP: Loose or Leaky?

In the last post I looked a bit at the Dependency Inversion Principle which says to define an interface representing the contract of a dependency that you need someone to fulfill for you. It's a really great technique for encouraging SRP and loose coupling.

The whole idea behind the DIP is that your class depends only on the interface, and doesn't know anything about the concrete class implementing that interface. It doesn't know what class it's using, nor does it know where that class came from. And thus, we gain what we were after: the high level class can be reused in nearly any context by providing it with different dependencies. And on top of that, we gain excellent separation of concerns, making our code base more flexible, more maintainable, and I'd argue more understandable. Clearly, the DIP is awesome!

But! I'm sure you were waiting for the But... Since we now have to provide our object with its dependencies, we have to know:

Every interface it depends on
What instances we should provide
How to create those instances

This feels so leaky to me! I used to have this beautiful, self contained, well abstracted object. I refactor it to be all DIP-y, and now it's leaking all these details of its internal implementation in the form of its collaborators. If I had to actually construct this object myself in order to use it, this would absolutely be a deal breaker!

Fortunately, someone invented Inversion of Control Containers, so we don't have to create these objects ourselves. Or, maybe unfortunately, 'cause now we don't have to create these objects ourselves, which sweeps any unsavory design issues away where you won't see them...
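Here's a sketch of the leak I mean. The names (TaskService, ITaskStore, INotifier, and the two simple implementations) are all hypothetical, invented just to show the shape of the problem.

```csharp
using System.Collections.Generic;

// Hypothetical collaborators, named only for this sketch.
public interface ITaskStore { void Save(string title); }
public interface INotifier { void Notify(string message); }

public class TaskService
{
    private readonly ITaskStore store;
    private readonly INotifier notifier;

    // The constructor is where the leak shows: every collaborator is
    // now part of the public surface of the class.
    public TaskService(ITaskStore store, INotifier notifier)
    {
        this.store = store;
        this.notifier = notifier;
    }

    public void Create(string title)
    {
        store.Save(title);
        notifier.Notify("Created: " + title);
    }
}

// Simple implementations the caller has to know about and construct.
public class InMemoryTaskStore : ITaskStore
{
    public List<string> Saved { get; } = new List<string>();
    public void Save(string title) { Saved.Add(title); }
}

public class RecordingNotifier : INotifier
{
    public List<string> Messages { get; } = new List<string>();
    public void Notify(string message) { Messages.Add(message); }
}
```

To use it by hand you have to write `new TaskService(new InMemoryTaskStore(), new RecordingNotifier())`, which answers all three questions from the list above: which interfaces, which instances, and how to build them.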

What design issues? Well, the leaking of implementation details obviously! Are there others? Probably. Maybe class design issues with having too many dependencies? Or having dependencies that are too small and should be rolled into more abstract concepts? But neither of these is a result of injecting Inverted Interfaces, only the implementation leaking.

I do believe this is leaky, but I'm not really sure if it's a problem exactly. At the end of the day, we're really just designing a plugin system. We want code to be reusable, so we want to be able to dynamically specify what dependencies it uses. We're not forced to pass the dependencies in; we could use a service locator type approach. But this has downsides of its own.

In the next post, I'll wrap this up by zooming back out and trying to tie this all together.

Wednesday, February 15, 2012

Header Interfaces or Inverted Interfaces?

Thanks to Derick Bailey for introducing me to this concept: Header Interfaces. A Header Interface is an interface of the same name as the class which exposes all of the class's methods, generally not intended to be implemented by any other classes, and usually introduced just for the purposes of testing. Derick compared these to header files in C++. In my earlier Interfaces, DI, and IoC are the Devil post, these are really the kinds of interfaces I was railing against. (I'll use the ATM example from Bob Martin's ISP article throughout this post.)

public class AtmUI : IAtmUI {...}

One of the things I really found interesting about this was the idea that there are different kinds of interfaces. So if Header Interfaces are one kind, what other kinds are there?

The first things that came to my mind were the Interface Segregation Principle and the Dependency Inversion Principle. ISP obviously deals directly with interfaces. It basically says that interfaces should remain small. "Small" is always a tricky word, in this case it's about making sure that the clients consuming the interfaces actually use all the methods of the interface. The idea being that if the client does not use all the methods, then you're forcing the client to depend on methods it actually has no use for. Apparently this is supposed to make things more brittle.

I said "apparently" because I've never directly felt this pain. I guess it really only comes into play if you have separate compilable components and you are trying to reduce how many things have to be recompiled when you make a change. I use .NET, so Visual Studio, and it's not a picnic for controlling compilable components... I think I have this requirement though, and just haven't figured out how to deal with it in .NET. But for the moment, let's assume we agree that ISP is a good thing.

public class AtmUI : IDepositUI, IWithdrawalUI, ITransferUI {...}

The DIP leads to a similar place as ISP. It tells us that higher level components should not depend on lower level components. There is some room for uncertainty here around which components are higher than others. For example, is the AtmUI a higher level than the Transaction? I'll go with no, because the Transaction is the actual driver of the application; the UI is just one of its collaborators. Because of this, the DIP leads us to create separate interfaces to be consumed by each Transaction:

public class AtmUI : IDepositUI, IWithdrawalUI, ITransferUI {...}

So, maybe there are at least two types of interfaces: Header Interfaces, and what I'll coin Inverted Interfaces. In the last post I talked about the "Service Layer" pattern. It generally leads to the creation of what feel more like Header Interfaces. But this is tricky, because the only difference I can really find here is based on who owns the interface. An Inverted Interface is owned by the class that consumes the interface, and a Header Interface is owned by the class that implements the interface.
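A sketch of the two kinds side by side, borrowing the ATM names. The method names and the canned return values are my own assumptions for illustration, not code from Bob Martin's article.

```csharp
// Header Interface: mirrors every public method of AtmUI and lives with it.
public interface IAtmUI
{
    decimal RequestDepositAmount();
    decimal RequestWithdrawalAmount();
}

// Inverted Interface: declared by, and for, its consumer. It contains
// only what DepositTransaction needs.
public interface IDepositUI
{
    decimal RequestDepositAmount();
}

public class DepositTransaction
{
    private readonly IDepositUI ui;

    public DepositTransaction(IDepositUI ui) { this.ui = ui; }

    // Returns the new balance after asking the UI for the deposit amount.
    public decimal Execute(decimal balance)
    {
        return balance + ui.RequestDepositAmount();
    }
}

// The concrete UI can satisfy both kinds of interface at once.
public class AtmUI : IAtmUI, IDepositUI
{
    public decimal RequestDepositAmount() { return 20m; }    // canned value for the sketch
    public decimal RequestWithdrawalAmount() { return 10m; } // canned value for the sketch
}
```

The code for the two interfaces looks nearly identical. The difference is ownership: IDepositUI would change when DepositTransaction's needs change, while IAtmUI changes whenever AtmUI does.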

But sometimes the difference isn't really that clear cut. If you're TDDing your way through an application top-down in the GOOS style, the Service Layers are designed and created based on the needs of the "higher" level components. So the component, and its interface, both spring into existence at the same time. So if the service only has one consumer right now, the interface feels very Header-y. On the other hand, it was created to fulfill the need of a higher level component; very Inverted-y.

But if someone else comes around and consumes the same service later: well now we have some thinking to do. If we reuse the interface, then I guess we've made it a Header Interface. Would Uncle Bob have said to create a new interface but make the existing service implement it? The lines are blurred because the components we're dealing with all reside within the same "package" and at least right now don't have any clear call to be reused outside this package.

Sadly, the introduction of these interfaces brings us back to Dependency Injection. So in the next post, I'll look at the Dependency Inversion Principle, and the consequences of these Inverted Interfaces.

Tuesday, February 14, 2012

Service Layers or OOP?

In the last post, I mentioned that my co-workers and I had settled on a design that was working very well for us, but that wasn't very "object-oriented," at least not in the way Bob Martin described it.

We evolved to our approach through a lot of TDD and refactoring and plain old trial and error, but the final touches came when we watched Gary Bernhardt's Destroy All Software screencasts and saw that he was using pretty much the same techniques, but with some nice naming patterns.

I don't know if there is a widely accepted name for this pattern, so I'm just going to call it the Service Layer Pattern. Its biggest strengths are its simplicity and clarity. In a nutshell I'd describe it by saying that for every operation of your application, you create a "service" class. You provide the necessary context to these services, in our case as Active Record objects (NOTE: service does NOT mean remote service (i.e., REST), it just means a class that performs some function for us).

So far so basic; the real goodness comes when you add the layering. I find there are a couple ways to look at this. The more prescriptive is similar to DDD's (Domain Driven Design) "Layered Architecture" which recommends 4 layers: User Interface, Application, Domain, and Infrastructure. From DDD:

The value of layers is that each specializes in a particular aspect of a computer program. This specialization allows more cohesive designs of each aspect, and it makes these designs much easier to interpret. Of course, it is vital to choose layers that isolate the most important cohesive design aspects.

In my Demonstrating the Costs of DI code examples the classes and layers looked like this:

SpeakerController (app)
 > PresentationApi (app)
   > OrdersSpeakers (domain)
   > PresentationSpeakers (domain)
     > Active Record (infrastructure)
   > Speaker (domain)
     > Active Record (infrastructure)

This concept of layering is very useful, but it's important not to think that a given operation will only have one service in each layer. Another perspective on this that is less prescriptive but also more vague is the Single Responsibility Principle. The layers emerge because you repeatedly refactor similar concepts into separate objects for each operation your code performs. It's still useful to label these layers, because it adds some consistency to the code.

Each of these services is an object, but that doesn't make this an object-oriented design. Quite the opposite, this is just well organized SRP procedural code. Is this Service Layer approach inferior to the OOP design hinted at by Uncle Bob? Or are these actually compatible approaches?

The OOP approach wants to leverage polymorphism to act on different types in the same way. Does that mean that if I have a service, like OrdersParties, that I should move it onto the Party object? What about the PartyApi class, should I find some way of replacing that with an object on which I could introduce new types?

There is a subtle but important distinction here. Some algorithms are specific to a given type: User.Inactivate(). What it means to inactivate a user is specific to User. Contrast that with User.HashPassword(). Hashing a password really has nothing to do with a user, except that a user needs a hashed password. That is, the algorithm for hashing a password is not specific to the User type. It could apply to any type, indeed to any string! Defining it on User couples it to User, preventing it from being used on any string in some other context.
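In code, the split might look like the sketch below. PasswordHasher is a hypothetical name, the iteration count and output size are placeholder values rather than recommendations, and Rfc2898DeriveBytes.Pbkdf2 assumes .NET 6 or later.

```csharp
using System;
using System.Security.Cryptography;

public class User
{
    public bool IsActive { get; private set; } = true;
    public string PasswordHash { get; set; } = "";

    // Specific to User: only User knows what inactivating itself means.
    public void Inactivate() { IsActive = false; }
}

// Not specific to User: it works on any string, so it stands on its own.
public static class PasswordHasher
{
    // Iteration count and output length are placeholder values for the sketch.
    public static string Hash(string password, byte[] salt)
    {
        byte[] hash = Rfc2898DeriveBytes.Pbkdf2(
            password, salt, 100_000, HashAlgorithmName.SHA256, 32);
        return Convert.ToBase64String(hash);
    }
}
```

A User still ends up with a hashed password (`user.PasswordHash = PasswordHasher.Hash(plainText, salt)`), but the hashing algorithm itself never has to know that users exist.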

Further, some algorithms are bigger than a single type. Ordering the speakers on a presentation doesn't just affect one speaker, it affects them all. Think how awkward it would be for this algorithm to reside on the Speaker object. Arguably, these methods could be placed on Presentation, but then presentation would have a lot of code that's not directly related to a presentation, but instead to how speakers are listed. So it doesn't make sense on Speaker, or on Presentation.

Some algorithms are best represented as services, standing on their own, distinctly representing their concepts. But these services could easily operate on Objects, as opposed to Data Structures, which would allow them to apply to multiple types without needing to know anything about those specific types. So I think the Service Layers approach is compatible with the OOP approach.
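Here's a small sketch of what a service operating on Objects might look like. The names (Listable, Speaker, Sponsor, ListingOrderService) are hypothetical and not taken from our actual code. The service only talks to the behavior the objects expose, so a new type can be added without touching it.

```python
from typing import List, Protocol, Sequence


class Listable(Protocol):
    """The only behavior the service depends on (hypothetical interface)."""

    def sort_key(self) -> str: ...

    def display_name(self) -> str: ...


class Speaker:
    def __init__(self, first: str, last: str) -> None:
        self._first = first
        self._last = last

    def sort_key(self) -> str:
        return self._last.lower()

    def display_name(self) -> str:
        return f"{self._first} {self._last}"


class Sponsor:
    def __init__(self, company: str) -> None:
        self._company = company

    def sort_key(self) -> str:
        return self._company.lower()

    def display_name(self) -> str:
        return self._company


class ListingOrderService:
    """An algorithm bigger than one type, so it stands on its own."""

    def ordered_names(self, items: Sequence[Listable]) -> List[str]:
        ordered = sorted(items, key=lambda item: item.sort_key())
        return [item.display_name() for item in ordered]


service = ListingOrderService()
names = service.ordered_names(
    [Speaker("Grace", "Hopper"), Sponsor("Acme"), Speaker("Alan", "Kay")]
)
```

The ordering logic never asks what type it was handed, which is the sense in which a service can still be "OOP".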

In the next post I'll take a look at how interfaces fit into this picture.

Monday, February 13, 2012

Objects or Data Structures?

Here's a great (and old) article from Bob Martin called Active Record vs Objects. You should read it. I think it might be one of the best treatments of the theoretical underpinnings of Object Oriented design I've read, especially because it pays a lot of heed to what OOP is good at, and what it's not good at.

Here are some of my highlights:

Objects hide data and export behavior (very tell-don't-ask)
Data structures expose data and have no behavior
Algorithms that use objects are immune to the addition of new types
Algorithms that use data structures are immune to the addition of new functions
Apps should be structured around objects that expose behaviors and hide the database
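A quick sketch of the trade-off in those highlights, using shapes. This is my own illustration in Python with made-up class names, not code from the article. In the data structure style, adding a new function is free but adding a new type means editing every function; in the object style it's the other way around.

```python
import math
from dataclasses import dataclass


# Data structure style: data is exposed and there is no behavior.
@dataclass
class SquareData:
    side: float


@dataclass
class CircleData:
    radius: float


def area(shape) -> float:
    # A new function like this one needs no change to the data structures,
    # but a new shape type means editing every function written this way.
    if isinstance(shape, SquareData):
        return shape.side ** 2
    if isinstance(shape, CircleData):
        return math.pi * shape.radius ** 2
    raise TypeError(f"unknown shape: {shape!r}")


# Object style: data is hidden and behavior is exported.
class Square:
    def __init__(self, side: float) -> None:
        self._side = side

    def area(self) -> float:
        return self._side ** 2


class Triangle:
    # A new type added later; total_area below did not have to change.
    def __init__(self, base: float, height: float) -> None:
        self._base = base
        self._height = height

    def area(self) -> float:
        return 0.5 * self._base * self._height


def total_area(shapes) -> float:
    # Immune to new types, but adding perimeter() would touch every class.
    return sum(shape.area() for shape in shapes)
```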

This all feels right and stuff, but it's all pretty theoretical and doesn't help me decide if my code is designed as well as it could be. And that's what I'm going to be writing about. In one post a day for the rest of the week I'll look at various elements of "good design," and try to fit the pieces together in some way I can apply to my code.

Good designers uses this opposition to construct systems that are appropriately immune to the various forces that impinge upon them.

He's talking about the opposition between objects and data structures in terms of what they're good for. So apparently a good designer is a psychic who can see the future.

But that is the hard part: how do you know where you'll need to add "types" vs. where you'll need to add "functions"? Sometimes it's really obvious. But what I'm starting to think about is, maybe I need to get more clever about the way I think about types. Because if Uncle Bob thinks apps should be structured around objects, that means he thinks there are lots of examples where you're going to need to add a new type. Whereas, when I think about my code, I'm not really finding very many examples where I could leverage polymorphism to any useful effect.

This could simply be because the problems I'm solving for my application are better suited for data structures and functions. Or it could be because I'm just not approaching it from a clever enough OO angle.

Recently, my co-workers and I had pretty well settled on a design approach for our application, and it has been working extremely well for us. However, this article and its clear preference for objects and polymorphism has me wondering if there may be another perspective that could be useful. I'll talk more about this in the next post.

Monday, July 12, 2010

The Analysts Dilemma

What's the hardest part of software development?

Too vague? Let's make it multiple choice:
A. Architecture
B. Code design
C. Algorithms
D. Business Analysis
E. Data Structures

If you answered anything other than D then you're an idiot. Seriously, look at the title of the post! How could you NOT know that D was the right answer? This isn't some open debate, this is more like high school, and I'm the teacher on this blog, and whatever I say is the right answer is the right answer. It doesn't matter what you think! Much like the relationship an analyst has with the customer.

This is what makes analysis the hardest part of software development. You really really want everything to be lined up in nice neat logical rows so that you can build the software in nice neat modules. But those damn users just refuse to do things logically and neatly! And despite how much you try to have it your way, you just keep getting Cs and Ds. Ultimately you have to give in and just give the users what they want. Embrace the wrinkles, the complexity, and the real world.

This is the picture of the world usually painted by Agile and DDD, and it's almost correct. Because it is true:

The real world is complicated
You can't dramatically simplify how your users work
You have to make your users happy

Don Norman, author of the great book The Design of Everyday Things, talks about this in his Business of Software talk. As he says there, the real world is NOT logical. But then he goes on to talk about what makes Analysis really hard: you can't trust what your users tell you.

That may sound harsh, but no matter how you cut it, it's true. Don Norman doesn't come out and say that, but he tells a story which I've seen happen first hand many times. If you ask people how they do their office work, and you write down everything they say, and then you read it back to them, they will completely agree with its accuracy. But when you go and watch them actually doing the work, you'll see that what they told you isn't what they're doing. If you ask why, the usual answer is because they are dealing with a special case. "We usually do it that way, but in this case I have to..."

So not only are we stuck analyzing something seemingly illogical that we can't force into a logical mold, we also can't rely on being given fully accurate information from the only people we can get information from! We. Are. Screwed.

And believe it or not, I can make it even more difficult for us! Because frequently the introduction of software doesn't just automate the manual process people have always performed, it actually changes the process. Meaning that as you go, you're making things that were once true, false. It's got some quantum mechanics flavor there.

So what are we supposed to do? The first thought is to try to get more accurate information up front, but this will never succeed. There will always be an edge case someone didn't think of. And trying to drill into nitty gritty details without anything solid to build on leads you to focus on things that don't matter, and to over-design. Ultimately you waste time and make it harder to respond to change when things inevitably do change.

Instead you have to do one, or both, of the following:

Teach your [users, customers, product owners, domain experts, etc] about the software side of things and get them intimately involved in the design and development of every aspect of the software. From architecture to process to UIs.
Aggressively shorten the feedback loop in any way possible. Get your designs, prototypes, early implementations, betas, and releases in the hands of users and make them work with them as quickly as you possibly can.

This is why it is so so so important to write agile code! "Agile" code is code which is easy to change, to some degree of easy. Once we embrace the fact that the hardest part of software development is analysis, and that truth be told analysis is basically impossible, we realize the most important thing for our code is to be able to respond to change. This has some dramatic implications on how we approach code: understanding becomes more important than execution or writing speed. This is why DDD, BDD, and SOLID are so important!

In the end, we have to stop thinking of analysis as something that happens once at the beginning of a project. Instead we have to minimize how much time we spend up front, and actually use our code as a tool to help figure out what the customer actually needs the software to do. We have to get to the point where learning something new from a [customer, user, product owner, domain expert, etc] doesn't cause us to grumble and complain about how no one ever tells us the right stuff. It's time we owned up to the fact that this is how the real world works, and stop moaning about it, and start expecting it, and finding ways to turn it to our advantage.

Monday, June 28, 2010

Simplicity vs. Adaptability

When dealing with code, and code architecture and design, there are lots of factors which have to be weighed to determine what the best way to go is. The popularity of Ruby on Rails in the blogosphere and the conference circuit has begun to shift the conversation about these factors a bit. When we were focused on Java and .NET we spent lots of time talking about The Gang of Four, Fowler, SOLID, and SOA. These days we seem to be talking more about BDD, "simplicity", terseness, and productivity.

I think this is a good thing, but I also think it's because we have finally realized that we are writing a whole new class of applications now. Back in the day, people were focused on BIG and COMPLICATED applications for banking and shipping and other complicated industries. We are still doing that kind of work today of course, but we've added a whole new class of application that didn't exist before: small and simple web applications. These websites are actual applications, not just brochure sites, so they have logic and models and all the rest. But their domain tends to be small, and the rules tend to be much simpler.

It seems like what we've learned as an industry is that all the patterns and practices that have been developed for dealing with large and complicated systems aren't necessarily needed for smaller web applications. But many of these things have become so ingrained in the way that we think and the way we approach problems that it can be a rather jarring shift to throw them out.

Ultimately this comes down to a question of Simplicity vs. Adaptability.
Simplicity: straight forward, few layers
Adaptability: Guards against change, includes more abstraction

Are These Really Opposing Characteristics?
The best possible design would be one that is both simple and adaptable, but usually simplicity and adaptability are opposing characteristics. This is because to make something adaptable, you tend to have to build in more layers and more abstraction, and that necessarily makes it more complicated.

For example, the Active Record pattern is simpler than the Data Mapper pattern. But the Data Mapper pattern isolates your models from changes in the database and vice versa, as well as removing all persistence knowledge from the models themselves.
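To show the difference, here's a stripped-down sketch of both patterns in Python against an in-memory SQLite table. The table, classes, and method names are all invented for this example, and real implementations of either pattern do a lot more. Notice that the Active Record class mirrors the table and saves itself, while the Data Mapper version keeps the model ignorant of persistence and puts the column-to-field translation in the mapper.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, full_name TEXT)")
    return conn


# Active Record: the object mirrors a table row and knows how to persist itself.
class PersonRecord:
    def __init__(self, conn, full_name: str, id: Optional[int] = None) -> None:
        self.conn = conn
        self.full_name = full_name  # same name as the column
        self.id = id

    def save(self) -> None:
        cur = self.conn.execute(
            "INSERT INTO people (full_name) VALUES (?)", (self.full_name,)
        )
        self.id = cur.lastrowid

    @classmethod
    def find(cls, conn, id: int) -> Optional["PersonRecord"]:
        row = conn.execute(
            "SELECT id, full_name FROM people WHERE id = ?", (id,)
        ).fetchone()
        return cls(conn, row[1], row[0]) if row else None


# Data Mapper: the model has no persistence knowledge at all.
@dataclass
class Person:
    name: str  # free to differ from the column name


class PersonMapper:
    """The only place that knows how Person maps to the people table."""

    def __init__(self, conn) -> None:
        self._conn = conn

    def insert(self, person: Person) -> int:
        cur = self._conn.execute(
            "INSERT INTO people (full_name) VALUES (?)", (person.name,)
        )
        return cur.lastrowid

    def find(self, id: int) -> Optional[Person]:
        row = self._conn.execute(
            "SELECT full_name FROM people WHERE id = ?", (id,)
        ).fetchone()
        return Person(name=row[0]) if row else None
```

The Active Record version is clearly less code. The Data Mapper version costs an extra class, but renaming the full_name column would only touch PersonMapper.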

There is a fun catch-22 here though. If your abstractions can serve as effective metaphors, you can begin to ignore the complexity the abstraction hides from you. This allows you to think about the system in a much simpler way, even though the details of it are very complex. But it's debatable whether we would call a system like this "simple." For example, websites written with Ruby on Rails tend to be simple, but I would not describe Rails itself as simple.

Simplicity is exemplified by DHH, creator of Ruby on Rails
Adaptability is exemplified by Jeremy D Miller, creator of FubuMVC

Just look at the difference in their tag lines:
"Ruby on Rails is an open-source web framework that's optimized for programmer happiness and sustainable productivity."
FubuMVC: "Compositional, compile safe, convention-based configuration for complex web applications."

The focus of these two projects is clearly different, and it's hard to argue with either one. Both can be used to create "simple" web applications. But Rails is very focused on a certain subset of simple web apps, where Fubu is more interested in being adaptable in order to allow you to tailor it to your needs.

There is a trade-off here. And which trade is right for your project is one of the most important decisions you have to make. As I wrote recently, the factors you have to consider in this trade-off frequently aren't even technical! So we really need to understand how diverse our industry has become, and we need to understand the context in which these things are being discussed.

Monday, May 24, 2010

Rails has no place at the office

This is a milestone post for me! My first ever purposefully incendiary title!

I should probably run with it and try to get everybody super offended, to the point where you have no idea what my point is because all you can see is red. I guess I'll have to leave that for a future milestone...

Because, yeah, I'm not really serious. Rails has a place at the office. And no, this isn't going to be one of those "Is Rails ready for the Enterprise?" posts. Rails is perfectly ready for use in the Enterprise, but that's the wrong question. As usual, the right question is much more complicated.

To start with, let me point out that this conversation has nothing to do with Ruby vs. C#. It doesn't really have anything to do with Rails vs ASP.NET MVC either. Instead I'm going to be talking about Active Record vs. Data Mapper, and View-Models vs. no View-Models, and this general concept of "the straight and narrow" vs. explicit abstraction and control. These are design patterns which apply to any language and appear in many different frameworks.

Rob Conery recently wrote a blog post in which he said,

For a lot of .NET/Java devs this will look "messy" - you shouldn't elevate "data concerns" into your model. This argument makes good sense for a large, complex site - that you're building in C# or Java. Typically Ruby focuses on the straight, narrow path and with that comes a dramatic turn towards "doing what you need to do... and no more". This resonates with me...

The part about Ruby/Rails focusing on "the straight and narrow path" really struck a chord for me. Ruby, being a dynamic language, is very much on the "straight and narrow." It dispenses with, or is much looser about, all kinds of things found in statically typed languages like private, internal, protected, interfaces, etc. These are things that are usually considered very important in statically typed languages, and in practices like DDD, but Ruby doesn't really bother enforcing them. Ruby favors documentation and convention over strict control.

Rails has a similar story. It uses the Active Record pattern for its data access, which requires a 1-1 correspondence between your models and your database tables. Further, the models don't even really exist! They're built dynamically from the schema of the database tables.

If you compare ASP.NET MVC to Rails one of the differences you'll quickly discover is this concept of a "View-Model". ASP.NET peeps seem to like these, whereas I haven't found a Rails sample anywhere that uses these. Both Active Record and this lack of a View-Model are accomplishing the same thing: removing abstraction in favor of directness and simplicity.
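For anyone who hasn't run into a View-Model before, here's a minimal sketch of the idea in Python. The Order model, its fields, and the render function are all hypothetical. The view only ever sees the View-Model, so the model can change shape (or carry things the view should never show) without the view caring.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Order:
    """The model: shaped for the domain, not for display."""

    id: int
    customer_name: str
    total_cents: int
    placed_on: date
    internal_notes: str  # something the view should never see


@dataclass(frozen=True)
class OrderViewModel:
    """The View-Model: exactly what the view needs, already formatted."""

    title: str
    total: str
    placed_on: str


def to_view_model(order: Order) -> OrderViewModel:
    # This mapping is the "extra layer" that the straight and narrow path skips.
    return OrderViewModel(
        title=f"Order #{order.id} for {order.customer_name}",
        total=f"${order.total_cents / 100:.2f}",
        placed_on=order.placed_on.isoformat(),
    )


def render(vm: OrderViewModel) -> str:
    # Stand-in for a real template; it depends only on the View-Model.
    return f"{vm.title}: {vm.total} ({vm.placed_on})"
```

Passing Order straight to render would be less code, which is exactly the appeal of skipping the View-Model on a small project.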

Now let's step back from this for a second and ask a question. Who in their right mind would want to have to deal with things like class and method visibility and extra layers of abstraction, which more often than not appear to be just duplication? No one! No one would want to deal with these things! It's extra work! I _hate_ extra work!

So why do we do it? Why does DDD make a big deal out of private constructors and Factories? Why does Fowler recommend the Data Mapper pattern over Active Record? Why do we create View-Models to separate our Views from our Models? Why do we do all these things that seem to just make life more complicated? Why don't we all take the straight and narrow path on all of our projects all of the time?

Certainly it's not as simple as the language we're using. Just because you're writing in C# or Java doesn't mean you can't use Active Record. And it doesn't mean you can't pass your Model straight to your View. There is also nothing about C# or Java that forces you to use interfaces, or follow the Dependency Inversion Principle. That said, there's also no reason why you couldn't use the Data Mapper pattern in Ruby, or create View-Models. The language certainly HELPS with some of these issues, but it's not the real difference. These are just patterns, and they apply equally well to any language.

The reason why we introduce this complexity and divert from the straight and narrow path in our technical approach is actually due primarily to non-technical reasons. Here are some of the reasons I think lead us to adopt these "enterprise" patterns:

There are more than two or three developers on the project
You have more than 6 entities in the domain
The project has a timeline longer than 3 months
The developers aren't intimately familiar with the domain
The project is likely to grow in fits and starts
The team members are more likely to come and go

These are not technical issues but they have technical IMPLICATIONS!

The practices prescribed by DDD are a big deal if you're working with a large complicated domain with lots of potential for change. If you're not, then you don't need DDD. Fowler's enterprise patterns are a big deal for the same reasons. If you know things are complicated, likely to change, and not possible for everyone on the team to grok completely, then you need to build abstraction into your code. And you need to try to be as explicit as you possibly can about what the code does and how it works. And you need to look for opportunities to prevent error and misunderstanding before it happens. These things will allow you to keep things clean, organized, and ultimately make your project successful when you're faced with "enterprise" challenges.

This is obvious. I'm sure you're sitting there (or standing) thinking, "duh!" or "when is this dude going to get to the point?" or "does this moron really think this is revolutionary?!"

My point is as simple as this. Rails is awesome. Simplicity is awesome. But as I sit here in my ivory tower looking out over the landscape I see lots of quiet subtle backlash from people against the "enterprise-y" patterns in favor of the simplicity of Rails. This makes a lot of sense to me because, as we pointed out, who would WANT to deal with the complexity of enterprise problems and patterns? But it is easy to be tempted by the appeal of simple solutions to simple projects. And certainly we should always strive to find the simplest solution that could possibly work. But we cannot close our eyes to the complexities of the problem or the environment in which we are solving the problem. And we cannot allow ourselves to be boiled alive either.

So by all means, choose the right tool for the right job, but make sure you understand the job as well as you understand the tool.