Here's another post about some of the more advanced features of Mercurial. The last two posts were about the Mq extension, which allows you to maintain a queue of patches with distinct changes in them. This allows you to keep separate changes separate, and to test those changes separately.
The trick with Mq though is that you have to know ahead of time that you are going to make separate changes and you have to go create a new patch for those changes. Before I go any further, I want to stress that this is actually a good thing! You should proactively keep your changes separate. This helps ensure you don't bite off too much at once, or fail to test your changes.
But sometimes you forget to be proactive about separating your changes and then you want to untangle them after the fact. You can do this using the Record extension. This is another extension that ships with Mercurial, but you have to enable it to use it.
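Enabling a bundled extension is a one-line change in your Mercurial configuration file. Here is a minimal sketch, assuming a per-user config file (~/.hgrc on Unix-like systems, Mercurial.ini on Windows). The mq line is only there because qrecord, discussed further down, needs both extensions:

```ini
[extensions]
# both extensions ship with Mercurial, so nothing is needed after the "="
record =
mq =
```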
The Record extension is invoked with "hg record" and will go through the files you've changed (or just the ones you tell it to look at) and will iterate over each change "hunk" in the file. A change hunk is a set of sequentially changed lines. For each hunk it asks you if you want to record the change or not. You simply say "y" if you want it, and "n" if you don't.
The result is a new commit that contains the changes you recorded. It's a surprisingly easy process.
However, there's a flaw with this approach. The resulting commit might be broken: it might not compile, or the tests might not pass. And since it turns directly into a commit, you don't really get the chance to test it... Now you could qnew the other changes in your working directory, then pop that patch to get back to the last commit, and test. And if you notice any issues you can qimport the change that record created in order to fix the problems. Then you can qfinish it, and qpush to get your pending changes back. Follow that?
Or instead of using hg record, which results in a new commit, you can use hg qrecord, which results in a new mq patch. You'll still need to qnew the working directory changes. But now you won't have to qimport if you need to make any changes to the original patch. I prefer this method because it seems more sane that you end up with mutable patches after prying changes apart instead of finished commits.
Monday, April 4, 2011
Monday, March 28, 2011
Mercurial Mq: Modify a changeset
Mercurial is an awesome distributed version control system. If you work on a project that cares about clean changesets, you may run into the need to modify a changeset after you have committed it. This generally isn't allowed, and it's a potentially dangerous thing to do.
It's dangerous because changing history modifies the identity of the repository. So if you pushed the changeset, then modified it, your repository will no longer be compatible with the remote one. And that would be bad.
But if you haven't shared the changeset yet, you CAN modify it. All you need is to enable the mq extension. You can use this extension to turn an existing changeset into a patch, and then you can modify that patch. When you're done, you can finish the patch, turning it back into a finalized changeset. You can also edit more than one changeset this way by importing them, then popping them off the queue one at a time.
Here's an example:
hg qimport -r 123 -r 124
hg qpop
# make changes to files
hg qrefresh
hg qpush
# make changes to files
hg qrefresh
hg qfinish -a
If you want to update the changeset comment, you can do that by editing the respective .diff file in the .hg\patches directory. For example, .hg\patches\123.diff. Just open that sucker up, modify the comments on top and save it when you're done. Couldn't be simpler!
Monday, March 21, 2011
Mercurial Mq
Mercurial is an awesome distributed version control system, abbreviated "hg." Hg's command approach emphasizes small well named commands to make it easier to learn and understand. It also comes "safe" and "easy" out of the box, but allows you to simply turn on built-in extensions to gain all kinds of shoot-yourself-in-the-foot power.
One of those extensions is called Mq. This allows you to manage a set of patches in a queue. You may be asking yourself, "What the hell does that mean?!" Instead of describing how it works, I'll describe when and how you'd use it.
Suppose you are developing a new feature. You're changing code here, changing code there, adding tests here and there, and so on. But suddenly you notice some code that could use improving. But that code doesn't have anything to do w/ the feature you are working on. What do you do?!
You could just change it. But if you do, you will have two unrelated changes in the same changeset. This isn't the end of the world... But it's not good either. For one thing, you might introduce a bug. And now when someone is trying to track down that bug, it will be really non-intuitive that your changeset may have introduced that bug. Basically, tangling changes is just bad. It means you're moving too fast, doing too many things at once. So let's not just change it.
You could hope you'll remember about it for later. Or you could write it down and hope you'll see your note and come back to it. But, come on, that ain't gonna happen.
OR you could use the mq extension. mq will allow you to take all the changes you've made on your feature and put them in a patch in a queue. Then create a new patch in the queue, and improve the unrelated code. Now go back to the first patch and keep working. When you're all done, "finish" all the patches in the queue and they turn into permanent changesets! ta da!
Here's what that little story would look like, assuming it's the first time you've ever used mq in this repo:
# work on your feature, notice code that needs to be improved...
hg init --mq #only needed first time you use mq
hg qnew -m "my new feature" new-feature #automatically includes all working dir changes in the patch
hg qnew -m "improved some code" improve-code
# improve the code...
hg qrefresh #adds working dir changes to current patch
hg qpop #takes improve-code changes out of working dir, drops you back to just the new-feature changes
# finish your feature...
hg qrefresh #adds working dir changes to new-feature patch
hg qpush #re-applies improve-code, since qfinish -a only converts applied patches
hg qfinish -a #converts all applied patches into permanent changesets
This might look like a lot to you at first glance, but it's not. And in practice it's surprisingly simple.
You move between patches with hg qpop and qpush. qpop takes a patch out of your working directory and drops to the previous patch in the queue. qpush does the opposite, adding the changes of the next patch in the queue back into your working directory. You can have any number of patches, one after the other, in your queue. And, if you need to for some reason, you can reorder patches. Check out the MqExtension page for more. And after you enable it, "hg help mq" makes it very easy to learn all the various commands.
Very useful feature, and addictive once you get used to it!
Monday, March 14, 2011
Mercurial Branches
One of the best things about a distributed version control system, like Mercurial, is how easy it is for many people to collaborate and share code and changes with each other.
In hg there are many ways of sharing changes. You can export changesets w/ the export command and send them to people. You can bundle up a bunch of changesets together with the bundle command. Or you can share your repository and other developers can pull directly from you.
That last method where people can pull from you is the easiest, fastest, and generally the best. But it can be a real pain to make sure everyone you want can see your repo. This is where named branches can be useful.
In Mercurial you can "branch" by creating clones in "path space." This is the generally recommended way to go. But you can also create branches within the same repository without cloning a new repository. The benefit of this is you can push and pull those branches through a central repository or any other shared repository. This makes it much easier to share changes with people! It's also more efficient, both from a network traffic and disk space perspective.
For example, you could start work on a new feature in a named branch and make a fair amount of progress and decide you want some feedback from the team. You share your changes, and the team can then commit new changes and share them back to you.
If we were using clones instead of named branches, your feature changes would be in a cloned repository. If your team has a central repo, you wouldn't be able to push your clone there because you're not ready to "finalize" your work yet. So instead, you would have to host your clone so your team could pull from it. Then each team member would have to host THEIR cloned version of your clone so you could pull back their feedback.
With named branches you can leverage the shared repos you already have set up, in this case a central repo. You just create a branch, work on it, and push it to the central repo. Your team pulls from the central repo, makes updates to the branch, and pushes them back.
Here's what this would look like:
hg branch new-feature
# do some work and commit some changes...
hg push -b new-feature --new-branch #push to central repo
Note the --new-branch switch. If you don't include this switch hg will abort with a warning that the push will add new branches to the remote repo. This warning prevents you from accidentally sharing branches that haven't been shared yet. Also note that you don't have to include "-b new-feature". Hg will push all changesets in the repository by default.
Your teammates would then do this:
hg pull
hg up new-feature
# review and add feedback commits...
hg push
If the default branch has had commits added since new-feature was branched, the pull command here will print a message telling you that 1 head has been added to the repository. This head is on the new branch. If you run hg merge at this point it will abort telling you that branch 'default' only has one head. If you want to actually merge branches, you have to explicitly give it the branch name, as we'll see next.
When you're all done with the feature you can merge it back to the default branch, and close your feature branch. That looks like this:
hg up default
hg merge new-feature
hg ci -m "merge"
hg up new-feature
hg ci --close-branch -m "close" #closes the branch
hg up default
If you don't close the branch it will remain listed in the hg branches command.
And that's all there is to named branches! Just hg branch to create them, hg update to move between them, hg merge to merge them back together again, and hg commit to work on them. Easy, fast, and efficient!
Monday, March 7, 2011
Craft over Art
I'm slowly working through Apprenticeship Patterns. Kind of a dry book, but it does have a few interesting concepts. One I particularly enjoyed was the "Craft over Art" pattern. The chapter opens with a quote from Richard Stallman: "I would describe programming as a craft, which is a kind of art, but not a fine art. Craft means making useful objects with perhaps decorative touches. Fine art means making things purely for their beauty."
It goes on with "As a craftsman you are primarily building something that serves the needs of others, not indulging in artistic expression. After all, there's no such thing as a starving craftsman. As our friend Laurent Bossavit put it: 'For a craftsman to starve is a failure; he's supposed to earn a living at his craft.'... If your desire to do beautiful work forces you out of professional software development and away from building useful things for real people, then you have left the craft."
"Part of the process of maturation encompassed by this pattern is developing the ability to sacrifice beauty in favor of utility if and when it becomes necessary."
I found this to be an incredibly accurate and valuable discussion. One of the most difficult balancing acts in programming is that between building what your customer needs, and building what you wish they needed, or even just what you want to build. Sometimes it's hard to tell the difference between the two. Other times, you know the difference, but it pisses you off!
There is another balancing act that comes up a lot: that between the quality you WANT to build and the quality your user wants. This goes both ways, but generally we tend to want to build at a higher quality than our users think they want. "When using this pattern you will have to balance your customer's desire for the immediate resolution of their problem with the internal standards that make you a craftsman." This is especially important for people influenced by the craftsmanship movement. Sometimes craftsmanship comes across as a pursuit of perfection over a pursuit of utility.
For me, this Craft over Art pattern was a good reminder to stay focused on delivering high quality utility for my users.
Monday, February 28, 2011
A Bit About Pointe Blank
I'm excited to announce I recently got a promotion at work! I'm now Pointe Blank Solution's Software Engineering Manager, which basically means I'm in charge of our development team. The line is: responsible for setting technical objectives, development team processes and practices, and fostering a productive and engaging environment. What makes this extremely exciting for me is both the products we are building, and what we are trying to grow into.
We are a fairly small company with 11 developers and 20 people all told, and we're growing. Our focus is on building software that changes the way industries get their work done. We typically partner with a client and build a product much like a consulting firm would. But then we go on to productize that software and sell it to a broader market. We don't do short term contracts and we're not a body shop. If the software isn't going to make a big impact on our clients and our community, we won't build it. Our two main projects at the moment are in justice and health care. Just about the most complicated industries you can imagine, so we have no shortage of interesting challenges to work through, both business and technical.
Our entire organization believes that good code is a competitive advantage. As a result, our development team is dedicated to Acceptance Test Driven Development in Ruby with Cucumber at the feature level, BDD with NUnit at the unit level, and Clean Code practices throughout. And we try to use the latest technologies including ASP.NET MVC 3, C# 4, CSS 3, HTML 5, jQuery, and Ruby. We use code reviews to maintain our high standard of quality and to share knowledge between team members.
We are building infrastructure around maintaining a rapid feedback loop and enabling the Boy Scout Rule. Part of this is a framework of supporting code and tools we call Nails. This includes everything from Continuous Integration, to one click automated deployment tools, to custom code generation tools (and most of that tooling is written in Ruby). For example, a single push to the central code repository automatically builds the source, runs the unit tests, runs the Cucumber tests against a built-from-scratch test database, migrates the beta database, and deploys the beta website. And that's just the build server. We're also building tooling to support the TDD feedback loop. We want developers to be able to focus on building value, not doing tedious configuration work or boilerplate coding.
We believe in passion, craftsmanship, and continuous learning. To further that, we're currently in the process of putting together a meetup. Basically, we're going to take our internal meeting, and open it up to the public. Hopefully I'll have more news about that soon.
We are about a year into most of this, so it is constantly evolving as we learn and keep improving. So, this is a very exciting time at Pointe Blank! I'm writing all this because 1) I'm excited about it and 2) I think we are a very unusual company, especially in this part of the country, and I want to get the word out. Plus, I kinda just wanted to brag...
Tuesday, January 18, 2011
Strings in Ruby vs. C#
In Ruby:
s1 = "string"
s2 = s1
s2 << "a"
s1.should == s2
In C#:
string s1 = "string";
string s2 = s1;
s2 = s2.Insert( 0, "a" ); // Insert returns a new string, so the result has to be assigned
Assert.NotEqual( s1, s2 );
Strings in C# are immutable. So trying to change a string actually creates a new string. So updating s2 does not update s1.
Strings in Ruby are mutable. So s1 and s2 act like pointers to the same string, and s1 is updated when you update s2 in place.
UPDATE 1/19/2011:
To be more clear, here are some more examples of how Ruby behaves:
s1 = "string"
s2 = s1
s2 += "a"
s2.should_not == s1

s1 = "string"
s2 = s1
s2.gsub!('s','z')
s2.should == s1
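The difference between these Ruby examples comes down to object identity, which you can check directly. Here's a rough sketch in plain Ruby (no RSpec needed). The variable names other than s1 and s2 are just made up for illustration, and equal? is Ruby's built-in identity comparison:

```ruby
# Assumption: plain Ruby script; names besides s1/s2 are illustrative only.
s1 = "string".dup                 # .dup guarantees a mutable string
s2 = s1                           # copies the reference, not the characters
shared_at_first = s2.equal?(s1)   # equal? compares object identity

s2 << "a"                         # mutates the one shared object in place
s1_after_append = s1.dup          # snapshot of what s1 sees now
still_shared = s2.equal?(s1)

s2 += "a"                         # builds a new String and points s2 at it
s1_after_plus = s1.dup
shared_after_plus = s2.equal?(s1)

puts shared_at_first, s1_after_append, still_shared
puts s1_after_plus, s2, shared_after_plus
```

After the << both names see the appended string, because there is only one object. After the += only s2 changes, because s2 now points at a brand new string and s1 still points at the old one.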
Monday, January 10, 2011
Withholding Information
I read this blog post titled Team Trap #5: Withholding Information the other day. It tells the story of a team brainstorming meeting in which the team is eliminating ideas. When "Harry"'s idea is eliminated, Harry takes it as a personal attack and detaches from the meeting. The author's take is that by withdrawing from the meeting and not saying anything about his emotional state to the team Harry is withholding information that the team needs to function well.
"This sort of thing happens all the time. One member of the team feels like he’s not being heard, or isn’t valued and withdraws. The rest of the group goes on, discusses, makes decisions, starts to act. The team is missing out on the intelligence, creativity and participation of that member. They won’t have his buy-in for decisions, and won’t have his full-hearted support for action. When situations like this aren’t handled, relationship fracture and drains away. When you’re part of team, you need to be willing to say what’s going on for you, so that the team stays healthy and connected."
Now, if everyone took every opportunity to treat things as personal attacks and started telling the team how their emotions had been hurt we'd never get any work done. But it is true that this kind of thing happens. And it happens to everyone at one time or another.
That said, I think it's especially important for programmers to keep this in mind because we have a tendency to expect people to be rational, and we don't react well when they aren't. People aren't machines, and if you're going to build a strong team it's important to remember that.
Also worth noting, "It's just business" is bullshit. Work can't be done well without emotion. But you do have to manage those emotions. Yours, and everyone else's.
Tuesday, January 4, 2011
Learning to Focus
You can't program well, efficiently, and successfully unless you can focus.
The enemy of focus is distractions: your boss walking in, your phone ringing, your co-workers talking to you, emails, IMs, tweets, text messages. These are distractions that actively steal your focus. There are also distractions that you create yourself: reading Google Reader, Facebook, Twitter, and Reddit, talking to your neighbors, working on too many things at once, and so on.
These kinds of distractions need to be managed:
- Turn off email notifications and check them less frequently.
- If you are on IM, mark yourself busy when you're working.
- If someone walks in or calls, tell them you need 20 minutes to wrap up.
- Limit the number of things you are actively working on at one time.
- Schedule breaks.
- Expect to be interrupted.
The last two are especially important. Scheduling regular breaks as in the Pomodoro Technique helps you to stay focused during your work periods. When you know you have a break coming up it's easier to put off answering messages, and reading crap on the internet.
Expecting to be interrupted by phone calls keeps you from getting frustrated when it happens. It also means you have to work in small increments and keep note of where you're at so when you do get interrupted, you can get back into it more easily.
Hard Work
This is all fine and good, but at the end of the day the hardest part of focusing is that it is hard work. When I first attempted the Pomodoro Technique I couldn't believe how hard it was to work for 20 minutes straight. I had no idea how often I was allowing myself to be interrupted, by active interruptions like phone calls and people talking to me, but mostly by interruptions I created myself like reading crap online and talking to other people.
Actionable
Another important element of staying focused is knowing what you're working on in enough detail to actually DO SOMETHING. This shows up in Getting Things Done: when it talks about managing your tasks, it recommends you write down both the task and the first actionable step toward completing it. There is a somewhat subtle but important distinction there.
Monday, November 29, 2010
Making TDD Work
The key to making TDD work is to follow three principles:
1. Write SOLID style, clean code
2. Focus on behavior
3. Always start with the tests
Learning TDD (Test Driven Development) has been a surprisingly difficult process for me over the last few years. Once I figured out how to think like a TDDer I was pretty well on my way, but it didn't really end there. The main issue is that bad tests can be worse than no tests at all. The hard part is figuring out what makes a test bad and what makes a test good. This post is my attempt to convey what I currently think are some good guidelines for writing good tests.
SOLID and Clean
All the code you write has to be good clean code following the SOLID principles. Whether the code is a test or "real" code, it has to be good. If the code isn't clean, well designed, and easy to understand, you don't stand a chance.
Some of the things SOLID and Clean Code encourage seem counterintuitive at first. For example, the concept of creating small single-purpose methods, especially when they aren't intended to be reused, seems like it might make it harder to find things, name things, and understand things. Of course, the truth is the exact opposite. Creating small, single-purpose, well-named methods dramatically simplifies your code. The same is true for creating small, well-defined classes. If you are at all like me you'll have to see it in action to really believe it. Even today there are times when I have the mental debate of whether to apply a method refactoring, and every time I do it, the code ends up better.
Following these principles is absolutely essential to making TDD work. It is impossible to test big ball of mud code, it's difficult to test code that isn't single purposed and well defined, and it's enormously difficult to understand tests written against a code base that isn't SOLID and Clean. The more understandable the code under test is, the more understandable the tests will be. In fact, I firmly believe that if you're doing it right, the code under test should end up being so simple you will find yourself questioning if the tests are worth having at all!
Behavior
The tests should focus on the behavior of the system (Behavior Driven Development or BDD). In fact, I would go so far as to say that if you're thinking about code coverage when you're writing tests, you're doing it wrong. Thinking about code coverage leads to tests with names that can't be understood without an intimate knowledge of the implementation and which are a nightmare to maintain. Code coverage can be a useful metric, but it shouldn't guide your tests.
If you instead focus on the behavior of the object you are testing, you should still discover that you approach 100% code coverage, but you will do so in a way that results in meaningful, descriptive, and understandable tests. This is critically important, because tests are not something you write once and never deal with again! Naming and organization are crucial here. Typically I follow the When, With, Should naming pattern in my unit tests. For example, I may have tests like:
[Test]
public void WhenTransferingMoneyBetweenAccounts()
{
    var srcAccount = WithAccountHavingBalance( 100 );
    var destAccount = WithAccountHavingBalance( 100 );

    accountTransferService.Transfer( 50, srcAccount, destAccount );

    Assert.AreEqual( 50, srcAccount.Balance, "Source account should have balance of 50" );
    Assert.AreEqual( 150, destAccount.Balance, "Destination account should have balance of 150" );
}

[Test]
public void WhenTransferingAmountLargerThanSourceBalance() {...}
Each test represents a certain scenario. The first block of code sets up the scenario. The second block performs the action we're testing. The third block verifies everything worked as expected. This style, inspired in part by MSpec and RSpec, makes each test's responsibility obvious and helps describe all the scenarios and their expected behavior.
You know you're done testing when you've covered all the scenarios. And if you find a bug, it should be fairly straightforward to decide what scenario or test is missing.
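The test above leans on a "With" helper and a transfer service that aren't shown in this post. Here is a minimal sketch of what they might look like. Everything in it (the Account class, the Scenario holder, and the rule that an oversized transfer throws) is my assumption for illustration, not the real code behind these tests.

```csharp
using System;

// Hypothetical account type, just enough to support the scenario.
public class Account
{
    public Account(decimal balance) { Balance = balance; }

    public decimal Balance { get; private set; }

    public void Withdraw(decimal amount) { Balance -= amount; }
    public void Deposit(decimal amount) { Balance += amount; }
}

// Hypothetical service under test.
public class AccountTransferService
{
    public void Transfer(decimal amount, Account source, Account destination)
    {
        // Assumed rule for the "amount larger than source balance" scenario.
        if (amount > source.Balance)
            throw new InvalidOperationException("Insufficient funds in source account");

        source.Withdraw(amount);
        destination.Deposit(amount);
    }
}

// The "With" helper keeps scenario setup down to one readable line per object.
public static class Scenario
{
    public static Account WithAccountHavingBalance(decimal balance)
    {
        return new Account(balance);
    }
}
```

The point of the helper is that the setup block of each test reads like the scenario it describes, rather than like constructor plumbing.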
Tests First
Finally, you should always start with the tests. When you're writing something new, start with the tests. When you're fixing a bug, start with the tests. When you're adding new behavior, start with the tests. When you're doing a code review, start with the tests. When you're trying to understand how some code works or what it's for, start with the tests.
There are times when you won't be able to fully implement the tests first. For example, if you're writing code that uploads a file in ASP.NET MVC and you don't know how MVC provides the file data, you won't know what to mock or what to expect until you dive into the code. That's totally OK. But you should still frame out the tests first, then come back and implement them (just make sure that the tests fail when expected).
Writing the tests first forces you to think through the behavior you're attempting to build and often helps you see flaws in your object design and algorithms. Writing the tests first will also encourage you to write single-purpose, clean, and well-defined code, helping you arrive at that point where the code under test is so simple the tests seem almost pointless. It's possible to end up with code that is simple without tests, but it's much harder.
When modifying existing code, starting with the tests will help you notice when the design has changed enough to warrant refactoring the code and redefining the roles of your objects.
Tuesday, October 12, 2010
Vim: Escape
UPDATE: Scroll all the way to the bottom for the best VIM ESC mapping in the world!
Vim is a modal editor, meaning it has modes. Its modes are command mode and edit mode (AKA normal mode and insert mode). Edit mode allows you to add and delete text and move your cursor around. Command mode allows you to execute various commands that can operate on the text.
Most editors only have edit mode. To execute commands you use shortcut keys that must depend on non-printable keys like CTRL, ALT, SHIFT. You quickly run out of simple combinations (ex: CTRL+S), so you have to introduce more complicated key combinations (ex: CTRL+K, CTRL+C). Visual Studio calls these "chords." I call it keyboard gymnastics.
Some people like the "chord" approach, and other people prefer the mode approach. I like the mode approach because if I'm going to execute non-trivial commands, I'd like to type a command. The main downside to the mode approach is you have to get back and forth between the modes. When you launch Vim and open a file you start out in command mode. There are a bunch of useful different ways to get into edit mode, all of which are just a single keystroke: i, a, o, O, I, A.
Once in edit mode, getting back to command mode is a bit harder because now typing a key will print the character, so Vim uses ESC to leave edit mode and go back to command mode. Sadly, on modern keyboards this goes against everything Vim stands for because the ESC key is so far away from the home row. And if you're using Vim properly, you'll be going back and forth between these modes A LOT. So we need a better way.
Vim has an alternative built right in: CTRL+[
(You've mapped your Caps Lock key to Control, right?)
This is kind of an uncomfortable keystroke, but with practice you will get used to it. If you'd rather not, you CAN map your own keyboard combination to Esc. For example, you could try CTRL+space:
inoremap <C-space> <Esc>
Whatever you use, I strongly suggest you try to stop using the Escape key.
UPDATE:
A number of people in the comments suggested using various combinations of the j and k keys. I didn't even realize you could remap printable characters, but you can! I've been using this for quite a while now and I'm completely addicted! I call it the "smash":
Add this mapping to your vimrc:
inoremap jk <esc>
inoremap kj <esc>
Now all you have to do is smash the j and k keys. It doesn't matter which you type first! Just smack them both as if they were a single key! Absolutely brilliant!
Tuesday, September 28, 2010
Questioning ORM Assumptions
I come from a stored procedure background to data access, with output parameters and datatables strewn throughout C# code. I have "recently" been learning ORMs (specifically NHibernate and Entity Framework). I've done some prototyping with both, and used NHibernate on a few projects.
In 1-1 situations I have found it is incredibly nice to not have to write SQL, or virtually any data access code at all. In situations that required some form of mapping (components, inheritance, etc) it's also very nice, though things become more brittle and error prone. In fact, even in 1-1 situations, I've been surprised by how brittle NH mappings are. Change just about anything on your entity and you're likely to break your mapping somehow. But that seems to be a price it is worth paying to avoid writing SQL and manual mapping code.
However, I've recently been questioning some of the features ORMs bring. I think most people would consider these features absolute requirements of an ORM. However, I'm beginning to doubt how valuable they really are. Perhaps some of this is in reality more trouble than it's worth?
Unit of Work
The first pattern I have some issues with is the Unit of Work pattern. This is the pattern used by ORMs to allow you to get a bunch of objects from the ORM, make any changes you want, and then just tell the ORM to save. The ORM figures out what you changed, and takes care of it. There are two major benefits to this pattern:
1. You don't have to manually keep track of all the objects you changed in order to save them. The ORM will just know what you changed, and make sure it gets persisted.
2. You don't have to concern yourself with the order things get saved in. The ORM will automatically figure it out for you.
My first issue with this pattern is that it is not very intuitive. You have to tell the ORM about new objects, and you have to tell it to delete objects, but you don't have to tell it to update objects. And, in fact, you don't have to tell it about ALL new objects, as it will automatically insert some of them depending on how your mappings and objects are set up (Parent/Child relationships, for example). It tends to be further confused by the APIs frequently used. For example, a lot of people use a Repository/Unit of Work pattern to hide NHibernate's session object.
var crypto = BookRepo.GetByTitle( "Cryptonomicon" );
crypto.Rating = 5;
var ender = new Book { Title = "Ender's Game", Author = "Orson Scott Card" };
BookRepo.Add( ender );
uow.Save();
What happens at BookRepo.Add( ender )? Does that issue an Insert to the database? Is the crypto.Rating update saved? And where the heck did this uow object come from and what relationship does it have with the BookRepo?! If you know this pattern, you're probably so used to it that it doesn't seem strange. But when you step back from it, I think you'll agree this is a pretty bizarre API.
Truth be told, some of this confusion is actually due to the Repository pattern. You are supposed to think of a Repository as an in memory collection of objects. The persistence is under the covers magic. If you're writing an application where persistence is one of the primary concerns, I always thought it was kind of stupid to adopt a pattern which tries to pretend that persistence isn't happening...
But back to Unit of Work, the second issue I have is a certain loss of control. It is very easy for you to write code using a unit of work and then have no idea what is actually being saved to the database when you issue the Save command. To me, that's a really scary thing. Now, to be fair, if you find yourself with code like that, it's probably really bad code. But that doesn't change the fact that this pattern almost encourages it. There is something nice about ActiveRecord's approach of calling Save on each entity you want to save to the database. You're certainly gaining back control.
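To make the contrast concrete, here is a minimal in-memory sketch of the explicit, save-it-yourself style. This is not the API of any real Active Record library or of NHibernate. FakeDb, Book, and Save are hypothetical names I made up for illustration, and the "database" is just a dictionary.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for a database table of ratings, keyed by book title.
public static class FakeDb
{
    public static readonly Dictionary<string, int> Ratings = new Dictionary<string, int>();
}

// Hypothetical Active Record style entity: it saves itself, and only when asked.
public class Book
{
    public string Title { get; set; }
    public int Rating { get; set; }

    public void Save()
    {
        // The write happens here, for this object only.
        FakeDb.Ratings[Title] = Rating;
    }
}
```

If I change two books and call Save on one of them, exactly one row is written. There is no question about what a later call might flush, because there is nothing tracking changes behind my back.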
My last issue, and this one isn't really that big of a deal, but is still something that bothers me a bit... The Unit of Work pattern couples the way you make changes to the transactions that are used to save them. In other words, you can't change object A and object B, then save A in one transaction and B in another. Instead, you'd have to change A, save it, change B, save it. Like I said, this is a minor sort of quibble, but it demonstrates again the assumptions made by the UoW pattern which steal some of your control.
None of these issues are all that serious. But I still believe that Unit of Work is a very awkward way of dealing with your objects and persistence.
Lazy Loading
ORMs use Lazy Loading to combat the "Object Web" problem. The object web problem arises when you have entities that reference other entities that reference other entities that reference other entities that ... How do you load a single object in that web without loading the ENTIRE web? Lazy Loading solves the problem by not loading all the references up front. It instead loads them only when you ask for them.
NHibernate and Entity Framework use some pretty advanced and somewhat scary "dynamic proxy" techniques to accomplish this. Basically they inherit from your class at run time and change the implementation of your reference properties so they can intercept when they are accessed. There are some scenarios where this dynamic inheritance can cause you problems, but by and large it works and you can pretend it's not even happening.
Lazy loading as a technique is very valuable. But I think ORMs depend on it too heavily. The problem with Lazy Loading is performance. It's easy to write code that looks like it executes a single query to the database, but in reality ends up executing 10 or more. At the extreme you have the N+1 select problem. Once again, it boils down to trying to pretend the data access isn't happening.
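Here is a small sketch of how the N+1 select problem sneaks in. It is only an illustration under my own assumptions: CountingDb, Author, and Report are hypothetical, the "database" just counts how many times it is asked for data, and I'm using Lazy<T> where a real ORM would use a dynamic proxy.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical data source that counts how many "queries" it runs.
public class CountingDb
{
    public int QueryCount { get; private set; }

    public List<Author> LoadAuthors()
    {
        QueryCount++; // one query for the list of authors
        return new[] { "Stephenson", "Card", "Gibson" }
            .Select(name => new Author(name, this))
            .ToList();
    }

    public List<string> LoadTitlesFor(string authorName)
    {
        QueryCount++; // one more query for each author whose titles are touched
        return new List<string> { authorName + "'s book" };
    }
}

public class Author
{
    private readonly Lazy<List<string>> titles;

    public Author(string name, CountingDb db)
    {
        Name = name;
        // Nothing is loaded until Titles is first read.
        titles = new Lazy<List<string>>(() => db.LoadTitlesFor(name));
    }

    public string Name { get; }
    public List<string> Titles => titles.Value;
}

public static class Report
{
    // Reads like one trip to the database, but runs 1 + N queries:
    // one for the authors, then one per author for the titles.
    public static int CountAllTitles(CountingDb db)
    {
        return db.LoadAuthors().Sum(author => author.Titles.Count);
    }
}
```

With three authors, CountAllTitles runs four queries, and nothing in the body of that method hints at it. That's the part that bothers me.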
DDD's solution to the Object Web problem is Aggregates. An Aggregate is a group of entities. The assumption is that when you load an Aggregate, all its members will be loaded. If you want to access another aggregate, then you have to query for it. This cleanly defines when you can use an object traversal, and when you need to execute a query. Basically, it forces you to remove some of the links in your object web.
By making Lazy Loading so easy, ORMs kind of encourage you to build large object webs. Entity Framework does this in particular, because its designer will automatically make your objects mimic the database if you use the db-first approach and drag and drop your tables into the designer. That means you will have every association, in every direction, included in your model.
While I don't have a problem with Lazy Loading, I do have a problem with using it too much. This is the main reason why you read so much about people "profiling" their ORM applications and discovering crazy performance problems. Personally, I'd rather put some thought into how I'm going to get my data from the persistence store up front than have to come back after the fact and waste tons of time trying to find all the areas where my app is executing a crazy number of queries needlessly.
Object Caching
NHibernate and Entity Framework keep a cache of the objects they load. So if you ask for the same object twice, they'll be sure to give you the same instance of the object both times. This prevents you from having two different versions of the same object in memory at the same time. If you think about that for a while, I'm sure you'll come up with all kinds of horror scenarios you could get into if you had two representations of the same object.
But I think this is an example of the ORM protecting me from myself too much; it's just not that important of a feature. Instead it adds more magic that makes the data access of my application even harder to understand. One time when I say GetById( 1 ), it issues a select. But the next time it doesn't. So if I actually wanted it to (to get the latest data for example), I now have to call Refresh()...
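The behavior I'm describing can be sketched in a few lines. This is a toy version under my own assumptions, not NHibernate's or Entity Framework's implementation. CachingSession and its SelectCount counter are hypothetical, and the "select" is just a delegate that gets passed in.

```csharp
using System;
using System.Collections.Generic;

public class Book
{
    public int Id { get; set; }
    public int Rating { get; set; }
}

// Hypothetical session that hands back the same instance for the same id.
public class CachingSession
{
    private readonly Func<int, Book> loadFromDb;
    private readonly Dictionary<int, Book> cache = new Dictionary<int, Book>();

    public CachingSession(Func<int, Book> loadFromDb)
    {
        this.loadFromDb = loadFromDb;
    }

    // Counts how many times we actually went to the "database".
    public int SelectCount { get; private set; }

    public Book GetById(int id)
    {
        Book book;
        if (!cache.TryGetValue(id, out book))
        {
            // First request for this id: issue the select and remember the instance.
            book = Load(id);
            cache[id] = book;
        }
        // Later requests never reach the database.
        return book;
    }

    public void Refresh(Book book)
    {
        // Force a new select and copy the fresh state onto the existing instance.
        book.Rating = Load(book.Id).Rating;
    }

    private Book Load(int id)
    {
        SelectCount++;
        return loadFromDb(id);
    }
}
```

Two identical calls to GetById do two different things, and you can't tell which one you're getting by reading the call site.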
Wrap Up
I got into all this because I didn't want to write SQL and I didn't want to write manual mappings. I certainly got that. But I also got Unit of Work, Lazy Loading, and Implicit Caching. None of which I actually NEED and certainly never wanted. And many of which actually create more problems than I had before!
Some Active Record implementations manage to fix these issues. But I have concerns with using Active Record on DDD-like code. The main concern is that I want to model my domain, not my database. The other big concern is I prefer keeping query definitions out of the entities, as it doesn't feel like their responsibility.
Now I'm not claiming any of these issues are a deal breaker to using NHibernate or Entity Framework or other ORMs. But on the other hand, it doesn't feel like these patterns are the best possible approach. I suspect there are alternative ways of thinking about Object Relational Mapping which may have some subtle effects on how we code data access and lead to better applications, developed more efficiently. For now though, I'm settling for NHibernate.
Tuesday, September 21, 2010
Decoupling tests with .NET 4
Recently, I was struggling with an annoying smell in some tests I was writing and found a way to use optional and default parameters to decouple my tests from the object under test's constructor. Not too long ago, Rob Conery wrote about using C#'s dynamic keyword to do all kinds of weird stuff. When I was running into these issues with those tests, I took a look at what he'd been playing with. Nothing there jumped out, but it led to the optional parameters.
Specifically, I was TDDing an object that had many dependencies injected through the constructor. Basically what happened was each new test introduced a new dependency, which caused the previous tests to have to be updated. For example:
[Test]
public void Test1()
{
    var testobj = new Testobj( new Mock<ISomething>().Object );
}
I'm using Moq here, and creating a default partial mock, which takes care of itself.
Then I write the next test:
[Test]
public void Test2()
{
    var setupMock = new Mock<ISomething>();
    // setup the mock
    var testobj = new Testobj( setupMock.Object, new Mock<ISomethingElse>().Object );
}
Notice this test has introduced a new dependency, ISomethingElse. Now the first test won't compile; we have to go update it and add the mock for ISomethingElse. This will continue with each test that introduces a new dependency, causing every previous test to be updated.
You could simply refactor the constructor call into a helper method so you only have to change it in one place. But this doesn't work so well when the tests are passing in their own mocks. You'd need lots of helper methods with lots of different method overloads. Enter optional and default parameters!
public Testobj BuildTestobj( Mock<ISomething> something = null, Mock<ISomethingElse> somethingElse = null )
{
    return new Testobj(
        ( something ?? new Mock<ISomething>() ).Object,
        ( somethingElse ?? new Mock<ISomethingElse>() ).Object );
}
Now we can update the tests:
[Test]
public void Test1()
{
    var testobj = BuildTestobj();
}
[Test]
public void Test2()
{
    var setupMock = new Mock<ISomething>();
    // setup the mock
    var testobj = BuildTestobj( something: setupMock );
}
Simple, clean, refactor friendly, and your tests are now nicely decoupled from the constructor's method signature!
Friday, August 6, 2010
Testing C# with RSpec and Ruby
Why would you want to do this?
Simple: More readability, less ceremony. This means you can write your tests faster, update them faster, understand them faster, and generally just be happier!
Why would you NOT want to do this?
Simple: Microsoft just dropped support for it...
I just spent the last week working on this stuff, so I'm just a little bit pissed at my timing. For example, this whole blog post was already written! Whether we should allow this to prevent us from considering using IronRuby is a whole different issue. I think I'll just have to wait and see what happens before making that decision.
That said...
How do you do it?
I have some sample code you can browse to help you get started at http://bitbucket.org/kberridge/irspec.
I'm working with IronRuby 1.1.0.0 on .NET 4.0 and rspec 1.3.0. Know that if your versions are different, stuff may work differently.
Setting up the environment:
- Install IronRuby (install it in C: instead of Program Files if you don't want to mess with your path to get gems working later on)
- igem install rspec
- optionally: igem install rake
- optionally: igem install caricature
- optionally: igem install flexmock
- Add a Specs folder (call it whatever you want) in your project's main folder (or wherever)
- Create spec_helper.rb in the Specs folder, more on this in a bit
- Create an examples folder (call it whatever you want)
- Add your test files in the examples folder
- All of your tests should require 'spec_helper'
You can execute one test at a time by executing this command in the Specs folder:
ir -S spec examples\first_test.rb
To execute all the tests you can write a rakefile like this:
require 'rake'
require 'spec/rake/spectask'
desc "Runs all examples"
Spec::Rake::SpecTask.new('examples') do |t|
  t.spec_files = FileList['examples/**/*.rb']
end
That last bit works great, unless you're executing IronRuby with the PrivateBinding flag turned on... More on that later.
More on spec_helper.rb
If you've never played with ruby before, this may be slightly weird. All of your tests will require 'spec_helper'. This effectively executes the code in spec_helper.rb (but only 1 time) allowing us to centralize configuration required by our tests, or even define helpful helper methods.
There are two things you should definitely do here: tell Ruby where to find your .NET assemblies and require common dependencies all tests will need.
Telling Ruby where to find your .NET assemblies is similar to telling cmd what directories to search by adding to your PATH variable. You do this in ruby by appending to the $: magic variable (aliased $LOAD_PATH):
$: << '../Src/Model/Model/bin/Debug'
There are other ways to find your .NET assembly without modifying the load path, but I kind of like this approach.
Common stuff you'll want to require includes rubygems, spec, and your .NET dll you're trying to test:
require 'rubygems'
require 'spec'
require 'Model.dll'
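If the "only 1 time" behavior of require and the role of $: are new to you, the following plain Ruby sketch shows both. It doesn't touch IronRuby or a real .NET assembly; the temporary directory and the file name fake_spec_helper.rb are made up purely for illustration.

```ruby
require 'tmpdir'

# $: and $LOAD_PATH are two names for the same array object.
same_array = $:.equal?($LOAD_PATH)

results = []
Dir.mktmpdir do |dir|
  # Write a throwaway helper file (hypothetical name) that counts how often it runs.
  File.write(File.join(dir, 'fake_spec_helper.rb'),
             "$helper_runs = ($helper_runs || 0) + 1\n")

  $: << dir                                # make the directory searchable
  results << require('fake_spec_helper')   # true: file was found and executed
  results << require('fake_spec_helper')   # false: already loaded, not run again
  $:.delete(dir)                           # leave the load path as we found it
end
```

The second require returns false without running the file again, which is why every test can safely require 'spec_helper'.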
More on PrivateBinding
If you're writing DDD style code in .NET, you probably have internal constructors and things which you need to be able to execute with your tests. If you were testing with NUnit, you'd set up your test assembly as a friend assembly of the assembly under test.
You can't do this when you're testing with IronRuby because there IS no assembly. So instead, you have to invoke IronRuby with the PrivateBinding flag as follows:
ir -X:PrivateBinding ...
This works great for accessing your internals or privates in .NET. But sadly, there is currently a bug somewhere that causes rake to break. I posted to StackOverflow here but haven't found a solution yet. So be aware of that.
Monday, July 12, 2010
The Analysts Dilemma
What's the hardest part of software development?
Too vague? Let's make it multiple choice:
A. Architecture
B. Code design
C. Algorithms
D. Business Analysis
E. Data Structures
If you answered anything other than D then you're an idiot. Seriously, look at the title of the post! How could you NOT know that D was the right answer? This isn't some open debate, this is more like high school, and I'm the teacher on this blog, and whatever I say is the right answer is the right answer. It doesn't matter what you think! Much like the relationship an analyst has with the customer.
This is what makes analysis the hardest part of software development. You really really want everything to be lined up in nice neat logical rows so that you can build the software in nice neat modules. But those damn users just refuse to do things logically and neatly! And despite how much you try to have it your way, you just keep getting Cs and Ds. Ultimately you have to give in and just give the users what they want. Embrace the wrinkles, the complexity, and the real world.
This is the picture of the world usually painted by Agile and DDD, and it's almost correct. Because it is true:
- The real world is complicated
- You can't dramatically simplify how your users work
- You have to make your users happy
Don Norman, author of the great book The Design of Everyday Things, talks about this in his Business of Software talk. As he says there, the real world is NOT logical. But then he goes on to talk about what makes Analysis really hard: you can't trust what your users tell you.
That may sound harsh, but no matter how you cut it, it's true. Don Norman doesn't come out and say that, but he tells a story which I've seen happen first hand many times. If you ask people how they do their office work, and you write down everything they say, and then you read it back to them, they will completely agree with its accuracy. But when you go and watch them actually doing the work, you'll see that what they told you isn't what they're doing. If you ask why, the usual answer is because they are dealing with a special case. "We usually do it that way, but in this case I have to..."
So not only are we stuck analyzing something seemingly illogical that we can't force into a logical mold, we also can't rely on being given fully accurate information from the only people we can get information from! We. Are. Screwed.
And believe it or not, I can make it even more difficult for us! Because frequently the introduction of software doesn't just automate the manual process people have always performed, it actually changes the process. Meaning that as you go, you're making things that were once true, false. It's got some quantum mechanics flavor there.
So what are we supposed to do? The first thought is to try to get more accurate information up front, but this will never succeed. There will always be an edge case someone didn't think of. And trying to drill into nitty gritty details without anything solid to build on leads you to become focused on things that don't matter, and to over-design. Ultimately you waste time and make it harder to respond to change when things inevitably do change.
Instead you have to do one, or both, of the following:
- Teach your [users, customers, product owners, domain experts, etc] about the software side of things and get them intimately involved in the design and development of every aspect of the software. From architecture to process to UIs.
- Aggressively shorten the feedback loop in any way possible. Get your designs, prototypes, early implementations, betas, and releases in the hands of users and make them work with them as quickly as you possibly can.
This is why it is so so so important to write agile code! "Agile" code is code which is easy to change, to some degree of easy. Once we embrace the fact that the hardest part of software development is analysis, and that truth be told analysis is basically impossible, we realize the most important thing for our code is to be able to respond to change. This has some dramatic implications on how we approach code: understanding becomes more important than execution or writing speed. This is why DDD, BDD, and SOLID are so important!
In the end, we have to stop thinking of analysis as something that happens once at the beginning of a project. Instead we have to minimize how much time we spend up front, and actually use our code as a tool to help figure out what the customer actually needs the software to do. We have to get to the point where learning something new from a [customer, user, product owner, domain expert, etc] doesn't cause us to grumble and complain about how no one ever tells us the right stuff. It's time we owned up to the fact that this is how the real world works, and stop moaning about it, and start expecting it, and finding ways to turn it to our advantage.
Wednesday, June 30, 2010
Powershell: Add an extension to every file in a directory
It's been awhile since I've posted up a Powershell script. Powershell really is great, and I really don't use it enough. I should start using it all the time for any ridiculous thing I can think of just so I can polish up my skills so when that once in a blue moon, "holy crap! do something complicated quick!" situation comes up I'll be ready...
This time I downloaded a bunch of mp3s that were on Google Docs. Google Docs is awesome: it lets you select a bunch of files and download them all. Then it zips them up so you can download them all together. Unfortunately, when I unzipped them, they didn't have a file extension...
dir | % { mv $_.FullName ( $_.Name + ".mp3" ) }
That command gets every file in the directory ( dir ), sends them to the next command ( | ), which loops over them one at a time ( %, is short hand for foreach-object ), then executes the move command ( mv ) with the file's full name as the first argument ( $_.FullName, $_ is magically set to the current loop item ), and the file's name with .mp3 tacked onto the end as the second argument ( $_.Name + ".mp3" ). The parentheses tell it to evaluate the expression and pass the result as the second argument.
Beautifully simple.
Need to figure out what properties are available on the objects you get from the dir command?
dir | get-member
Need to figure out what the value of one of those properties will actually be?
dir | % { $_.FullName }
Gotta love Powershell!
Monday, June 28, 2010
Simplicity vs. Adaptability
When dealing with code, and code architecture and design, there are lots of factors which have to be weighed to determine what the best way to go is. The popularity of Ruby on Rails in the blogosphere and the conference circuit has begun to shift the conversation about these factors a bit. When we were focused on Java and .NET we spent lots of time talking about The Gang of Four, Fowler, SOLID, and SOA. These days we seem to be talking more about BDD, "simplicity", terseness, and productivity.
I think this is a good thing, but I also think it's because we have finally realized that we are writing a whole new class of applications now. Back in the day, people were focused on BIG and COMPLICATED applications for banking and shipping and other complicated industries. We are still doing that kind of work today of course, but we've added a whole new class of application that didn't exist before: small and simple web applications. These websites are actual applications, not just brochure sites, so they have logic and models and all the rest. But their domain tends to be small, and the rules tend to be much simpler.
It seems like what we've learned as an industry is that all the patterns and practices that have been developed for dealing with large and complicated systems aren't necessarily needed for smaller web applications. But many of these things have become so ingrained in the way that we think and the way we approach problems that it can be a rather jarring shift to throw them out.
Ultimately this comes down to a question of Simplicity vs. Adaptability.
Simplicity: straightforward, few layers
Adaptability: Guards against change, includes more abstraction
Are These Really Opposing Characteristics?
The best possible design would be one that is both simple and adaptable, but usually simplicity and adaptability are opposing characteristics. This is because to make something adaptable, you tend to have to build in more layers and more abstraction, and that necessarily makes it more complicated.
For example, the Active Record pattern is simpler than the Data Mapper pattern. But the Data Mapper pattern isolates your models from changes in the database and vice versa, as well as removing all persistence knowledge from the models themselves.
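As a rough illustration of the difference, here is a sketch of both patterns in plain Ruby. It doesn't use the Rails ActiveRecord library or any real mapper; the in-memory STORE hash, the class names ActiveRecordUser, User, and UserMapper, and the column name full_name are all hypothetical, invented just to show the mapper absorbing a schema difference.

```ruby
# In-memory stand-in for a database table (hypothetical).
STORE = {}

# Active Record style: the model knows how to save and find itself.
class ActiveRecordUser
  attr_reader :id, :name

  def initialize(id, name)
    @id = id
    @name = name
  end

  def save
    STORE[id] = { "name" => name }   # the model is coupled to the column layout
    self
  end

  def self.find(id)
    row = STORE.fetch(id)
    new(id, row["name"])
  end
end

# Data Mapper style: the model is a plain object with no persistence knowledge.
User = Struct.new(:id, :name)

# The mapper owns the translation between the object and its stored row.
class UserMapper
  def initialize(store)
    @store = store
  end

  def insert(user)
    # The stored column name differs from the attribute name; only the mapper knows.
    @store[user.id] = { "full_name" => user.name }
  end

  def find(id)
    row = @store.fetch(id)
    User.new(id, row["full_name"])
  end
end

ActiveRecordUser.new(1, "Ada").save
ar_user = ActiveRecordUser.find(1)

mapper = UserMapper.new({})
mapper.insert(User.new(2, "Grace"))
dm_user = mapper.find(2)
```

The first version has fewer moving parts. The second needs an extra class, but the stored column name can change without touching User at all.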
There is a fun catch-22 here though. If your abstractions can serve as effective metaphors, you can begin to ignore the complexity the abstraction hides from you. This allows you to think about the system in a much simpler way, even though the details of it are very complex. But it's debatable whether we would call a system like this "simple." For example, websites written with Ruby on Rails tend to be simple, but I would not describe Rails itself as simple.
Simplicity is exemplified by DHH, creator of Ruby on Rails
Adaptability is exemplified by Jeremy D Miller, creator of FubuMVC
Just look at the difference in their tag lines:
"Ruby on Rails is an open-source web framework that's optimized for programmer happiness and sustainable productivity."
FubuMVC: "Compositional, compile safe, convention-based configuration for complex web applications."
The focus of these two projects is clearly different, and it's hard to argue with either one. Both can be used to create "simple" web applications. But Rails is very focused on a certain subset of simple web apps, where Fubu is more interested in being adaptable in order to allow you to tailor it to your needs.
There is a trade-off here. And which trade is right for your project is one of the most important decisions you have to make. As I wrote recently, the factors you have to consider in this trade-off frequently aren't even technical! So we really need to understand how diverse our industry has become, and we need to understand the context in which these things are being discussed.
Wednesday, June 23, 2010
More Engagement
Are you engaged at work? I don't mean like, engaged to be married. I mean are you ENGAGED?
If not, why not? Are you not paid well? Do you not like the work? Do you not like your coworkers? Do you not like the management? Do you not like your boss? Are you bored? Is the organization keeping you from doing your best work? Do you feel unproductive? Do you feel unable to contribute? Do you feel micromanaged? Why?
Maybe this guy can help explain it:
Maybe your organization's attempts to motivate you are actually, though accidentally, de-motivating you! Isn't it ironic? So what should they be doing? What do you need to be engaged at work? To be at your best, to be your most productive, to be happy?
According to Dan Pink, it's as simple as 3 words: Autonomy, Mastery, and Purpose. This observation lines up so nicely with the observations in First, Break All The Rules too! But the problem with these observations is that they are only observations. They provide really useful information, but don't help your boss figure out how he should act day to day.
That's actually one of the main reasons why I think First, Break All The Rules is such an excellent book, and so much better than any other book on "management" I've ever read. There simply is no cookie cutter, one size fits all, set of steps you can follow to create an environment in which everyone is engaged.
Now, if you're the boss you can start putting your understanding of Autonomy, Mastery, and Purpose to work right now. But if you are not the boss, what good is this to you? If you have some "direct reports," you can apply these ideas with them within your own projects. At least that's a start. But if you don't have "direct reports," then it looks like knowing this stuff isn't going to help you at all!
I remember the first time I read Peopleware... On the one hand I absolutely loved it. But on the other, it was just depressing. And I believe that my company would actually rank very highly compared to other companies on these factors. But it didn't matter, it was depressing! That's because this kind of stuff seems too far out of the control or influence of a lowly programmer. What impact can one person have on things like culture? Or the autonomy granted to employees? Or the purpose behind the work? Or big-M vs. little-m methodologies?
I think the answer to these questions is, quite frankly, that on your own you can have very little impact. BUT! I believe a group of like-minded people, with common goals, patience, and a dash of determination can get together within any organization and make a huge difference. Those people can become engaged in the struggle to be engaged at work! And, as cheesy as it may be, any group of people starts with just one person. To be successful at this you have to have a broad idea of where you're heading. If all you want to do is be like 37 Signals, you're out of luck. But if you can embrace the real kernel of truth in all the observations found in so many places these days (including what the guys from 37s are saying), and tailor that to your organization's unique goals, strategy, and personality... Then you will be able to make your company, or your division, or even just your team one that all your friends will be jealous of. And everyone will end up doing better work because of it.
So if you're not engaged at work, stop looking up! Stop waiting for someone else to change! Get out there and get started making a difference today. Even if just a small one. Be prepared to lose a lot of battles, but don't let one setback prevent you from continuing to work at it. Because it is worth working at. And you don't have to get 100% of the way there. You don't have to end up with a ROWE to be engaged at work. That's because lots of little improvements will add up. And could maybe even start a steam rolling effect.
The key is to recognize that it's worth striving for, that you don't have to keep looking up waiting for someone else to make the changes, and that it's ultimately a community effort.
Thursday, June 3, 2010
Software Craftsmanship
Growing and Fostering Software Craftsmanship from Cory Foy on Vimeo.
"Software Craftsmanship" is kinda sorta like the next "Agile." As a "movement" it's not really very interesting. But if you ignore the proselytism you discover that the message is both simple and appealing. Cory Foy does an excellent job of communicating that in this presentation. So much so that I wanted to share it.
Movements tend to fall short because they fail to convey the context of the problems they have been created to solve. Cory does a great job of providing the background of what the typical problems in software engineering are and follows through with clear, but broad, ideas to fix them. This isn't a talk about unit testing, TDD, pair programming, or any other specific techniques. Instead it's a talk about the nature of the problem, what the solutions should look like, and how you could begin to move in the right direction.
In short this is an inspirational talk. Watch it! I think you'll enjoy it.
Monday, May 24, 2010
Rails has no place at the office
This is a milestone post for me! My first ever purposefully incendiary title!
I should probably run with it and try to get everybody super offended, to the point where you have no idea what my point is because all you can see is red. I guess I'll have to leave that for a future milestone...
Because, yeah, I'm not really serious. Rails has a place at the office. And no, this isn't going to be one of those "Is Rails ready for the Enterprise?" posts. Rails is perfectly ready for use in the Enterprise, but that's the wrong question. As usual, the right question is much more complicated.
To start with, let me point out that this conversation has nothing to do with Ruby vs. C#. It doesn't really have anything to do with Rails vs. ASP.NET MVC either. Instead I'm going to be talking about Active Record vs. Data Mapper, and View-Models vs. no View-Models, and this general concept of "the straight and narrow" vs. explicit abstraction and control. These are design patterns which apply to any language and appear in many different frameworks.
Rob Conery recently wrote a blog post in which he said,
For a lot of .NET/Java devs this will look "messy" - you shouldn't elevate "data concerns" into your model. This argument makes good sense for a large, complex site - that you're building in C# or Java. Typically Ruby focuses on the straight, narrow path and with that comes a dramatic turn towards "doing what you need to do... and no more". This resonates with me...
The part about Ruby/Rails focusing on "the straight and narrow path" really struck a chord for me. Ruby, being a dynamic language, is very much on the "straight and narrow." It dispenses with, or only loosely enforces, all kinds of things found in statically typed languages like private, internal, protected, interfaces, etc. These are things that are usually considered very important in a statically typed language, and in practices like DDD, but Ruby doesn't really bother much with them. Ruby favors documentation and convention over strict control.
Rails has a similar story. It uses the Active Record pattern for its data access, which requires a 1-1 correspondence with your database. Further, the models don't even really exist! They're built dynamically from the schema of the database tables.
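To make the "models don't even really exist" idea concrete, here is a minimal sketch of the Active Record pattern in Python rather than Ruby. It is not how Rails is implemented; the `ActiveRecord` base class, the `Post` model, and the `posts` table are all hypothetical names made up for illustration, and SQLite stands in for a real database.

```python
import sqlite3


class ActiveRecord:
    """Toy Active Record base class: one class per table, one object per row.

    Hypothetical illustration only; this is not the Rails implementation.
    """

    table = None       # set by each subclass
    connection = None  # shared sqlite3 connection, assigned before use

    def __init__(self, **attrs):
        # Attribute names come from the database schema, not the class body.
        for column in self.columns():
            setattr(self, column, attrs.get(column))

    @classmethod
    def columns(cls):
        # PRAGMA table_info yields one row per column; index 1 is its name.
        rows = cls.connection.execute(f"PRAGMA table_info({cls.table})")
        return [row[1] for row in rows]

    def save(self):
        # The object persists itself: data access lives on the model.
        cols = [c for c in self.columns() if c != "id"]
        placeholders = ", ".join("?" for _ in cols)
        cursor = self.connection.execute(
            f"INSERT INTO {self.table} ({', '.join(cols)}) VALUES ({placeholders})",
            [getattr(self, c) for c in cols],
        )
        self.id = cursor.lastrowid
        return self

    @classmethod
    def find(cls, record_id):
        row = cls.connection.execute(
            f"SELECT * FROM {cls.table} WHERE id = ?", (record_id,)
        ).fetchone()
        return cls(**dict(zip(cls.columns(), row))) if row else None


class Post(ActiveRecord):
    table = "posts"  # note: no attributes are declared here


ActiveRecord.connection = sqlite3.connect(":memory:")
ActiveRecord.connection.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, body TEXT)"
)

post = Post(title="Hello", body="First post").save()
loaded = Post.find(post.id)
```

The `Post` class is one line long, and renaming a column in the table renames the attribute on every object. That is the directness, and also the coupling, of the 1-1 correspondence.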
If you compare ASP.NET MVC to Rails one of the differences you'll quickly discover is this concept of a "View-Model". ASP.NET peeps seem to like these, whereas I haven't found a Rails sample anywhere that uses these. Both Active Record and this lack of a View-Model are accomplishing the same thing: removing abstraction in favor of directness and simplicity.
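For anyone who hasn't run into a View-Model before, here is a rough sketch of the idea in Python. The `Customer` model, the `CustomerViewModel`, and the `render` function are all invented for this example and don't come from ASP.NET MVC or Rails.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Customer:
    """Domain model: carries everything the business logic needs."""
    first_name: str
    last_name: str
    birth_date: date
    password_hash: str


@dataclass(frozen=True)
class CustomerViewModel:
    """View-Model: only what the page shows, already shaped for display."""
    display_name: str
    birthday: str

    @classmethod
    def from_model(cls, customer):
        # The translation from model to view lives in exactly one place.
        return cls(
            display_name=f"{customer.first_name} {customer.last_name}",
            birthday=customer.birth_date.strftime("%m/%d"),
        )


def render(view_model):
    # Stand-in for a template; it never sees the Customer or its password hash.
    return f"{view_model.display_name} (birthday: {view_model.birthday})"


customer = Customer("Ada", "Lovelace", date(1990, 12, 10), "not-a-real-hash")
page = render(CustomerViewModel.from_model(customer))
```

Passing `customer` straight to the template would save a class and a mapping step, which is the simplicity Rails goes for. The extra layer only pays for itself once the view and the model start changing for different reasons.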
Now let's step back from this for a second and ask a question. Who in their right mind would want to have to deal with things like class and method visibility and extra layers of abstraction, which more often than not appear to be just duplication? No one! No one would want to deal with these things! It's extra work! I _hate_ extra work!
So why do we do it? Why does DDD make a big deal out of private constructors and Factories? Why does Fowler recommend the Data Mapper pattern over Active Record? Why do we create View-Models to separate our Views from our Models? Why do we do all these things that seem to just make life more complicated? Why don't we all take the straight and narrow path on all of our projects all of the time?
Certainly it's not as simple as the language we're using. Just because you're writing in C# or Java doesn't mean you can't use Active Record. And it doesn't mean you can't pass your Model straight to your View. There is also nothing about C# or Java that forces you to use interfaces, or follow the Dependency Inversion Principle. That said, there's also no reason why you couldn't use the Data Mapper pattern in Ruby, or create View-Models. The language certainly HELPS with some of these issues, but it's not the real difference. These are just patterns, and they apply equally well to any language.
The reason why we introduce this complexity and diverge from the straight and narrow path in our technical approach is actually due primarily to non-technical reasons. Here are some of the reasons I think lead us to adopt these "enterprise" patterns:
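As a quick illustration that Data Mapper isn't tied to C# or Java, here is a small sketch of it in Python, another dynamic language. The `Post` class, the `PostMapper`, and the `posts` table with its `body_text` column are hypothetical names chosen for the example.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class Post:
    """Plain domain object: knows nothing about tables or SQL."""
    title: str
    body: str
    id: Optional[int] = None

    def summary(self, length=20):
        return self.body[:length]


class PostMapper:
    """Data Mapper: the only place that knows how a Post maps to the table."""

    def __init__(self, connection):
        self.connection = connection

    def insert(self, post):
        # The column is called body_text, but the domain object says body.
        cursor = self.connection.execute(
            "INSERT INTO posts (title, body_text) VALUES (?, ?)",
            (post.title, post.body),
        )
        post.id = cursor.lastrowid
        return post

    def find(self, post_id):
        row = self.connection.execute(
            "SELECT id, title, body_text FROM posts WHERE id = ?", (post_id,)
        ).fetchone()
        if row is None:
            return None
        return Post(id=row[0], title=row[1], body=row[2])


connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, body_text TEXT)"
)
mapper = PostMapper(connection)
saved = mapper.insert(Post(title="Hello", body="First post"))
loaded = mapper.find(saved.id)
```

The cost is obvious: two classes and hand-written mapping where Active Record needs one nearly empty class. What you get in return is that the schema and the domain object can drift apart (here `body_text` vs. `body`) without either one dictating the other.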
- There are more than two or three developers on the project
- You have more than 6 entities in the domain
- The project has a timeline longer than 3 months
- The developers aren't intimately familiar with the domain
- The project is likely to grow in fits and starts
- The team members are more likely to come and go
These are not technical issues but they have technical IMPLICATIONS!
The practices prescribed by DDD are a big deal if you're working with a large complicated domain with lots of potential for change. If you're not, then you don't need DDD. Fowler's enterprise patterns are a big deal for the same reasons. If you know things are complicated, likely to change, and not possible for everyone on the team to grok completely, then you need to build abstraction into your code. And you need to try to be as explicit as you possibly can about what the code does and how it works. And you need to look for opportunities to prevent error and misunderstanding before it happens. These things will allow you to keep things clean, organized, and ultimately make your project successful when you're faced with "enterprise" challenges.
This is obvious. I'm sure you're sitting there (or standing) thinking, "duh!" or "when is this dude going to get to the point?" or "does this moron really think this is revolutionary?!"
My point is as simple as this. Rails is awesome. Simplicity is awesome. But as I sit here in my ivory tower looking out over the landscape I see lots of quiet, subtle backlash from people against the "enterprise-y" patterns in favor of the simplicity of Rails. This makes a lot of sense to me because, as we pointed out, who would WANT to deal with the complexity of enterprise problems and patterns? But it is easy to be tempted by the appeal of simple solutions to simple projects. And certainly we should always strive to find the simplest solution that could possibly work. But we cannot close our eyes to the complexities of the problem or the environment in which we are solving the problem. And we cannot allow ourselves to be boiled alive either.
So by all means, choose the right tool for the right job, but make sure you understand the job as well as you understand the tool.

