kwblog: September 2009

Tuesday, September 29, 2009

DDD: Performance Trade Offs

Domain Driven Design is a powerful technique composed of a lot of powerful patterns. It's focus is on designing rich object models that help control Domain complexity, like I talked about previously.

Before I stumbled on DDD I was trying to switch from stored procedure/record set based development to NHibernate/object model based development. One of the first stumbling blocks I encountered was what I refer to as "loading the object web." In a typical domain model your objects will have a lot of associations with each other. If you represent each of these with a property traversal (ex: Employee.Company.Branches.Sales...) things start to get messy.

The first problem you run into is in loading your Model from the database. To load any one Entity you have to load all of it's associations, and all those associations' associations, etc. This is a no go. The first way to handle this is through lazy loading, so the associations aren't loaded until you try to use them. That works, but it's a bit sloppy and it is not expressed in the Domain Model.

Another problem with this object web is determining how to persist changes and what records to lock. I have found database locking issues to be ridiculously complicated and amazingly ignored in the blog-o-sphere. For more on locking and its various solutions you should read Martin Fowler's Patterns of Enterprise Application Architecture. Fortunately DDD has a pattern we can apply to help address the web of objects problem: Aggregates.

The idea is simple, an Aggregate is a group of tightly related Entities. Evans says that all objects within the Aggregate should be loaded together and persisted together. He also says that objects outside the Aggregate can only have a reference to the Aggregate Root.

Roots are now explicitly indicated in our Model and they indicate where the boundaries between the objects in our object web are. This simplifies life considerably and it will also help with our database locking issues.

That's a lot of talk and I haven't even started talking about performance trade offs yet. Well, fortunately that really wont take long. If you are coming from the ad-hoc, off the cuff, procedural style, "Transaction Script" (as Martin Fowler would say) programming world like I am you may be alarmed by the Aggregate concept. You are used to executing a diverse array of procedures (filled with business logic) that return a specific subset of data: only what you need right now. But with DDD and the Aggregate, you're now going to return all the data for the Aggregate. This may be more data than you think you NEED at this point in your code.

There's your performance trade off. Retrieving more data when you don't necessarily need it. Why is this worth it? One simple reason is it allows you to enforce all your invariants (read: consistency rules) in the model all the time. You load all that data so that you can make sure all your data is in a consistent state at all times.

This also allows you to avoid triggers. You don't need a trigger in the database because your Model will do whatever the trigger would have done. Avoiding triggers is a performance benefit of DDD. Ironically, I've even seen that loading the Aggregate at once would have REDUCED the number of trips to the database in my code. This is because usually you don't just load something and be done with it. You tend to have to work with it. And there was a lot of duplication and repeated calls in the way I used to write my code that goes away once I have a Model in memory.

So sometimes the DDD approach can actually be more performant. But to be sure, other times it WILL do more work and WILL retrieve more data than the "Smart UI" (as Evans would say). But its worth it for the increased consistency and peace of mind that you gain, which is what allows you to deal with greater complexity.

Monday, September 21, 2009

DDD: Always Valid Model Objects

One of the rules of DDD (Domain Driven Design) is that your model objects should never be in an inconsistent state. For example, if you have a Contact object and Last Name and Social Security Number are required attributes, DDD says you can not create a Contact without specifying Last Name and SSN.

When I first heard this rule it caused me to dismiss DDD right out of hand. My thinking was, “How dumb! How am I going to bind the Contact to the UI if the Contact can’t be inconsistent? After all, the UI will be inconsistent all the time.”

It turns out the answer to my question is pretty simple, you don’t bind the Contact to the UI. If you’re following the Model View View-Model pattern, you bind a View-Model to the UI, then when the user clicks the save button you use the properties on the View-Model to build a Contact. If there is anything wrong an exception is thrown and you display a validation error through the UI.

In a way this kind of sucks. You have more hoops to jump through now.

You have to “duplicate” the properties of your Model in your View-Model
You have to “map” between the View-Model and Model properties
Constructing a Model object can become somewhat difficult

Eric Evans addresses #3 with the Factory Pattern. Unfortunately he doesn’t say anything about #1 and #2. Interestingly regarding #1 and #2 though, if you were developing for the web you would be forced into this situation regardless because your Model is on the server while your UI is on the client. There would be no way to “reuse” your Model in the UI.

This lack of “reuse” certainly will cause more code to have to be written, but I would like to take a paragraph to say I think this is really a good thing. For one thing, to data bind to your UI you may have to make certain accommodations like changing data types or the structure of your properties. You don’t want concerns like this affecting the design of your Model. Another downside to this “reuse” is you are tightly coupling your UI to the Model. If refactorings happen in the Model they will affect all your views. If you add the View-Model layer only the View-Models will have to be updated (depending on what the refactoring was of course…).

The real question is are these hoops worth it? What’s so good about making sure your Model is always consistent? Simply put, this is an effective way to manage complexity. If your rules for “consistency” are simple, this can seem overkill. But as those rules get more and more complicated, especially as they begin to apply to more than one object in your model, making sure you’re consistent becomes drastically more difficult. By requiring all objects to always be valid, you’re taking a huge load of uncertainty off the shoulders of the Application Developer and you’re making a clear and consistent rule that all Model Developers must follow.

Thus once again bringing that DOMAIN complexity into check, even if it does require more code in the Application layer.

Monday, September 14, 2009

DDD: How to tackle complexity

In DDD (Domain Driven Design) you create an object model that represents your application's domain. This model contains all the relationships and logic of that domain. The purpose of this is to make the complexity of the domain manageable. DDD involves lots of patterns and concepts, but when distilled there are a couple big picture things that really serve to tackle the complexity:

Explicit representation of domain concepts
Continuous Refactoring to "Deeper Insight"

The thing about complexity is its complex. Complex things are hard to understand. If they are hard to understand, they are difficult to get right the first time. And as hard as they are the first time, they're exponentially harder the second, and the third, and you get the idea.

This is the real issue: complex software is hard to understand. That's what leads to a system which everyone is so afraid to update, they would rather start from scratch. Maybe at first you're willing to just go in and add a hack or two. But each hack raises the complexity and increases the ugliness of the code until finally its just not worth trying anymore. That's another way of saying FAIL.

So we must come up with a way to deal with this complexity. The first way DDD does this is by taking advantage of the power of Object Orientation, models, and abstraction. But that's a bit too broad. We need to figure out how to structure those objects and models. That's where DDD applies the idea of explicitly representing domain concepts.

The idea is simple. If there is an import concept in your domain, you should be able to see it in the Model. You shouldn't have to dive into the code to extract important concepts, they should be represented by an object in the Model. Say there is some action in your domain that can only be taken when certain conditions are met. If these conditions are minor, you can just add if statement guards to the method that performs the action. But if these conditions are an important part of the domain, hiding them away in the code isn't good enough. Instead of an introduce a Policy Object that represents the conditions that must be met for that action to be performed. Now the conditions are explicitly represented in your domain.

You'll see this same idea expressed in Factories, Repositories, Services, Knowledge Levels, etc. Its a huge part of what it takes to make your system understandable.

The second idea that makes DDD work is Continuously Refactoring to "Deeper Insight." The deeper insight bit means if you discover something new about your domain after you already have a model, you don't just bolt it on. You figure out if that new thing indicates something fundamentally important about your domain. If it does, you refactor the model to explicitly represent that new understanding. Sometimes these refactorings will be small. Other times they will be big and cost a lot. It doesn't matter, you do them anyway.

If you don't your model will lose its expressive character and become more and more brittle. More and more complex. Harder and harder to understand. So you have to fight to always keep your model as simple and expressive and accurate as you possibly can. If you're lucky, these refactorings can lead to what Eric Evans calls a Breakthrough, where new possibilities or insights suddenly appear that would have been impossible before. That may take some real luck. If you're not that lucky, these refactorings will at least lead to a model that is flexible in the places where the domain requires flexibility. That means it will be easier to handle future insights and refactorings.

The really really cool thing about this is these two concepts form a cycle and feed each other.

The more you Refactor the more Explicit your model becomes. The more Explicit your model, the easier it is to Refactor!

Tuesday, September 8, 2009

Models

Dictionary.com defines Epiphany as "a sudden, intuitive perception of or insight into the reality or essential meaning of something, usually initiated by some simple, homely, or commonplace occurrence or experience." This definition does not include a feeling of "DUH!" or "DOH!" or "Damn, I'm an idiot, how did I not see that sooner?!" though all of those certainly apply to the Epiphany I had a few weeks ago.

In this case, that moment of insight revolves around the concept of a Model. MVC, MVP, MVVM: each pattern starts with Model but what is a Model? For me, it had been nothing more than a data house, a class with get/set properties and MAYBE a couple of recurring little utility methods (format date or whatever). In other words, it was a class representation of the database. It was my LINQ-to-sql or Entity Framework data classes. Or my model classes in Ruby on Rails, which represent and map to the database.

That is a model, it's a model of the database, but is that really a useful thing? Is that what your Model in MVC, MVP, MVVM is supposed to be? The Epiphany came in the form a question from one of my co-workers. He asked, "Where's the OO?"

Indeed, where is the object orientation in my code? And that's when it hit me, there is none.

I mean, I write in C# so technically everything is an object. But I don't have any classes that exist to model a real world thing or concept. They just aren't there. I have utility/framework classes that exist to support the application infrastructure itself. Classes like StoredProcedure and Parameter, etc. These weren't a big leap, I'm basically just copying the .NET Framework. But I'm not writing a Framework, that's not my job. My job is to write a business application for my client, and I don't have a single object dedicated to modeling the client's business.

So effectively, when it comes to the domain, I'm writing procedural code. I might as well be writing in C.

How could this have happened? When I look back at the projects I did in college I was modeling all over the place. I had graphs and nodes and simulations and swarm agents and they were all objects, all models. But when I entered the business programming world and was confronted with Guis and Databases, I didn't transfer any of that. The main challenges then appeared to be 1. dealing with the UI and 2. dealing with the database. So all my efforts were set about those areas, the actual domain logic that the application was really there for was just scattered around.

Enter Domain Driven Design: The key point that I simply didn't understand is that Object Oriented Programming is about modeling the world. Yes, it's about abstraction and encapsulation, but those are really tools used to create a Model.

This is obvious, and if you had said it to me I would have thought I'd always known it and wouldn't have grasped the significance. The significance is that it's a drastic change in perception. If you're writing software, your software is doing something for someone. I would have asked the question, "what is it doing?" But a more appropriate question is "what does it model?" The "doing" is then included in the answer, along with the relationships and the data elements and the rules etc...

And now that you're modeling, you're taking advantage of all the benefits of object oriented programming. There is a very good reason why every modern language is object oriented.

In Domain-Driven Design: Tackling Complexity in the Heart of Software Eric Evans talks about all the reasons why modeling the domain is so important, and also introduces all kinds of software patterns that help make modeling the domain possible. There are two quotes that I think really explain the importance of this concept of modeling and indicate why it is such a shift of perception.

Model-Driven Design has limited applicability using languages such as C, because there is no modeling paradigm that corresponds to a purely procedural language. Those languages are procedural in the sense that the programmer tells the computer a series of steps to follow. Although the programmer may be thinking about the concepts of the domain, the program itself is a series of technical manipulations of data. The result may be useful, but the program doesn't capture much of the meaning. Procedural languages often support complex data types that begin to correspond to more natural conceptions of the domain, but these complex types are only organized data, and they don't capture the active aspects of the domain. The result is that software written in procedural languages has complicated functions linked together based on anticipated paths of execution, rather than by conceptual connections in the domain model.

The last sentence is the biggie for me. I've never heard a better description than "complicated functions linked together based on anticipated paths of execution." That perfectly describes what my code has looked like.

Evans also has this to say:

In an object-oriented program, UI, database, and other support code often gets written directly into the business objects. Additional business logic is embedded in the behavior of UI widgets and database scripts. This happens because it is the easiest way to make things work, in the short run.

When the domain-related code is diffused through such a large amount of other code, it becomes extremely difficult to see and to reason about. Superficial changes to the UI can actually change business logic. To change a business rule may require meticulous tracing of UI code, database code, or other program elements. Implementing coherent, model-driven objects becomes impractical. Automated testing is awkward. With all the technologies and logic involved in each activity, a program must be kept very simple or it becomes impossible to understand.

Not only is it the easiest way, it's the way the Microsoft tools encourage you to work. I now understand why the ALT.NET community was so pissed about Entity Framework. But that's the standard Microsoft approach: catering to the least common denominator. And as long as what you're doing is simple, you'll be fine. In fact, that's probably the right way to go because you'll have less code, less overhead, and still be able to understand and change it.

But I haven't worked on an app that was that simple, in, well ever. And I've been focusing so heavily on the technology, database access and the UI framework, that I forgot the real complexity was in the domain.

Of course now that I understand this I have to figure out how to write applications with a rich Domain Model layer. How do you persist changes from the domain? How do you do change tracking in the domain? How do different elements of the domain communicate with each other? How does a rich domain oriented application work with a service layer? And most important of all, how do you start adding a rich domain to an application that was written without one? Hopefully the DDD book will have the answer to most of these questions.