Tuesday, November 13, 2012

Encapsulation: What the devil is it?

I love the word 'Encapsulation.'  It's a big fancy word and I feel smart when I use it.  Unfortunately, I'm not really sure what it means, and neither is Wikipedia.  "Encapsulation is to hide the variables or something inside a class."  I lol'd when I read "or something," what a specific definition!  So, what the devil is it?

The most naive OO definition might be:
A language feature that bundles data and methods together.
You might extend that to say that it hides the data from public consumption, but that part muddies up the water, as the Wikipedia article demonstrates.  My favorite example of Encapsulation is a Queue class.  You get push, pop, and peek operations to call, but you don't know what data structure the class uses to implement those operations.  It could be an array, it could be a linked list, whatever.  In this we can easily see the beauty and the power of encapsulation: "data" and "methods" together.
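To make that concrete, here's a minimal Ruby sketch (a hypothetical class, not any standard library type) whose callers see only push, pop, and peek:

```ruby
# A minimal queue that encapsulates its backing store.  The
# @items array is a hidden implementation detail that could be
# swapped for a linked list without changing the public API.
class SimpleQueue
  def initialize
    @items = []          # the encapsulated data structure
  end

  def push(item)
    @items.push(item)
    self                 # return self so calls can chain
  end

  def pop
    @items.shift         # FIFO: remove from the front
  end

  def peek
    @items.first
  end
end

q = SimpleQueue.new
q.push("encapsulation").push("is")
q.peek   # => "encapsulation"
q.pop    # => "encapsulation"
q.pop    # => "is"
```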

But wait, what did I actually encapsulate in that queue?  Was it the "data"?  I pass my data into and out of the Queue, and the Queue hides its implementing data structure from me.  Maybe it's a Queue<string>, and I'm all: q.Push("encapsulation"); q.Push("is"); q.Push("about"); q.Push("data"); Assert.AreEqual("encapsulation", q.Pop());  For me the data are those strings, but those strings are clearly not what the Queue is encapsulating!

That word "data" in our definition is a tricky one.  It can be applied to too many things to be really useful.  But does replacing "data" with "data structure" in the definition fix the problem?
A language feature that bundles data structures and methods together.
It clears it up, but it introduces another problem.  For example, what if my object is a database gateway?  Certainly there's a data structure somewhere in that database, but my object isn't directly encapsulating that!  No, it's probably "encapsulating" ADO.NET procedural calls, or some other data access library.  The procedural calls are neither data nor data structure...  So could it be that thinking about data is completely misleading?
A language feature that bundles implementation details and methods together.
This is a rather large step though!  Instead of just talking about data, or data structures, this now includes just about anything in the definition of things that can be bundled with methods to achieve encapsulation!  Maybe there is some value in restricting what the word "encapsulation" applies to, but if there is, I doubt it's something that is going to prove useful for Software Engineers.  So while I admit this definition could be a perversion of the word "encapsulation," I find it more useful.
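To sketch the database gateway example from above in Ruby (UserGateway and FakeConnection are invented for illustration): what gets hidden is the procedural query API, not a data structure.

```ruby
# Hypothetical gateway: what it encapsulates is the procedural
# data-access calls, not a data structure.  The connection is
# injected, and callers never see the query API.
class UserGateway
  def initialize(connection)
    @connection = connection   # e.g. a DB driver; hidden detail
  end

  def user_name(id)
    # This raw call is the "implementation detail" being hidden.
    @connection.query("SELECT name FROM users WHERE id = ?", id)
  end
end

# A stand-in connection, just so the sketch runs without a database.
class FakeConnection
  def query(_sql, id)
    { 1 => "Ada", 2 => "Grace" }[id]
  end
end

gateway = UserGateway.new(FakeConnection.new)
gateway.user_name(1)  # => "Ada"
```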

The other definition Wikipedia gives for encapsulation, which I've neglected to tell you about until now, is "A language mechanism for restricting access to some of the object's components."  This is more similar to the definition I just ended on.  I take some issue with the word "restricting" and the word "components" is ambiguous enough to be a problem.  But I don't think it's a stretch to think "components" could include both data and dependencies.

So, perhaps we've arrived at a better understanding of encapsulation.  One that recognizes that data is not the all important concept.  The next step I'd like to take is to extend this slightly deeper understanding outside the realm of data structures and into more typical "business" scenarios.  That will be the next post.

Thursday, September 13, 2012

I'm Not Trendy

In October of 2011 we had a Burning River Developers meetup that I unfortunately missed (because I was on a plane returning from Europe!).  I learned from people who attended that during his presentation Dan Shultz said that I am "not trendy".  I guess he was saying something about how Knockout is not as trendy as Backbone, and that I'd appreciate that.

Here's the thing, he's right!
When I bought my mac, it was probably one of the few times I allowed the popularity and trendiness of something to sway my decision.  And I've given it lots of time and lots of patience, but last night I finally admitted that OSX is rubbish.

Why?
  • Everything about Finder is broken.  EVERYTHING.
  • Window management is garbage.  How do you minimize and restore without the mouse? (I know about command-H, it's no good either)
  • The only app I like is chrome.  Guess what, it runs on every OS!
  • mail.app is sluggish and ugly
  • iCal can't sync correctly with google
  • iPhoto doesn't scale and doesn't organize in a way I find useful
  • iTunes music file management is totally inflexible, to the point of unusable
  • The dock is dramatically inferior to Windows 7's task bar
  • Lion is slow to boot, and has worse battery life
  • I see no compelling reason to upgrade to Mountain Lion at all (except to remove the stupid skeuomorphic stitch graphic from iCal, which as mentioned above, I don't use anyway)

The only thing I like is the swipe motion to switch spaces.  But the only time I use that is when I'm running windows in a VM, so if I was in windows I WOULDN'T NEED IT.  The only other rare time I use it is to maximize a window, which just goes back to how crappy window management is on OSX.

I thought I might like that it was a unix, but the truth is it's a crappy incompatible unix (and it doesn't even have a decent package manager that doesn't break across upgrades).  And any time I've wanted to do something unix-y, it's just been a headache.  

The hardware is pretty though!  And the trackpad is the best I've ever used, so it's certainly got that.  Also the keyboard is pretty great.  So bully to the hardware.  But, I'm being honest here, that's all I can give it.

So why is the mac so popular right now?  Just because it's popular.

Even as I write this, I'm aware of how unpopular this opinion is.  But it's just my opinion, and you should be skeptical of it.  Especially because like Dan said, I'm not trendy.  

Everyone likes Dynamic Languages?  I prefer static and functional.
Everyone's excited about Node?  Looks like an immature waste of time to me.
Backbone is all the rage?  I'll stick with Knockout, thanks very much.
IPhone?  "Meh" at best, I'm happier with Android.
Pair programming?  Code reviews.
Startups want to cash out quick?  Nonsense, build a sustainable business that tackles hard problems and makes an actual difference!
Pop music? Techno? Dubstep?  It's jazz for me.  And even within jazz, I don't like bebop (arguably the most popular right now), so I'm totally fringe there too.

This is totally a rant, and if I have a point at all, it's this:  1) I'm not trendy 2) OSX is garbage.  And I guess my other point is just that it's OK to not get caught up in the trends.  You don't have to join the Lemmings, you're allowed to have your own taste and opinions!

I reserve the right to change my mind about all opinions contained in this blog post at any time without notice.

Thursday, August 30, 2012

Opinionated

o·pin·ion·at·ed
Conceitedly assertive and dogmatic in one's opinions.

con·ceit·ed
Excessively proud of oneself; vain.*

I'm not sure who first used the word "opinionated" to describe a software framework in a positive light, but I know the place I encounter it most often is in reference to Rails.  The good news is, I bet DHH fully understands the meaning of these words, and is still more than happy to identify by them.  At least he knows what he's doing.  But I still think this is a stupid thing to be proud of.

Unfortunately, writing "opinionated" code has since just become the thing to do.  And I don't think most people who have jumped on that bandwagon bothered to look it up in the dictionary first.  As a result, I now have the impression that when someone says their framework is opinionated what it really means is they're claiming it's general purpose, but it really only works in a very specific scenario, and they don't even understand that other scenarios exist.

Imagine you're walking down the aisles at Home Depot looking for a tool to help you complete a job.  Maybe you need a drill bit extension to reach into a tight corner.  You have a certain kind of drill that accepts standard shaped bits, and your corner has certain dimensions, and you need to set a certain diameter screw.  A helpful sales associate comes up and asks if he can help.  You give him a quick high level summary, and he smiles knowingly.

"What you need is the XYZ fixed flexible dongle attachment!  It's the only choice.  All the other options are total bullshit, I can't believe we even stock them."  He's kind of starting to foam at the mouth now...
"Believe me, this is the one."

OK, you think.  It costs 3x as much as the other items on the shelf, but this guy clearly knows what he's talking about.  You get it home and come to find, it doesn't fit your drill because it has a non-standard bit shape, it doesn't fit your bit because it's built for larger bits, and it's too short to reach into the corner anyway.  And what does the guy say when you take it back to return it?

"Oh, well, this is an opinionated drill bit extension, it's not meant for your job."

Code that does one well understood, well defined thing, is exactly what I want.  But misunderstanding that well defined thing and advertising the code as the solution for everything is stupid.  And being dogmatic and conceited about it isn't helping anyone.

* Definitions from Google

Sunday, August 19, 2012

Blogs are Little Islands

You are not blogging enough. You are pouring your words into increasingly closed and often walled gardens. You are giving control - and sometimes ownership - of your content to social media companies that will SURELY fail.  - Scott Hanselman, Your words are wasted
I enjoy blogging.  I've been doing it since April of 2007 (that's 5 years at the time of this writing!).  For me, it's a great way to work through problems and ideas.  It's kind of a "learning out loud" thing.  And lately, it's been just a way to have some fun with writing.  That's why I'm still doing it, but it's not why I originally started doing it.

I first got into blogging because I wanted to be a part of the community of tech people who were on the interwebs learning from each other and arguing with each other.  That didn't happen, because it turns out a BLOG is not a community.

Blogs are little islands, owned by little dictators.  They've all got large towers built right in the center with megaphones mounted on top, and they're shouting out to sea.

There's this weird aggregator of shouts out there somewhere, we call it Google.  It archives your shouts, so people searching for a solution to a problem can have a chance of finding the echo of something you yelled long ago.  Of course, that echo has been bouncing around for a while, and it's probably not terribly accurate any more.  Because of that, we don't shout solutions to problems any more, we do that on StackOverflow now.

But we're all still shouting, so it must be because we want someone to shout back.  But if you even hear my shout, and if you do bother to shout back, the chances I'll hear it are slim.

So instead, maybe you'll fly by my island and drop a leaflet on the beach.  I might pay attention to that, and if I do, I'll leave a leaflet for you on my beach in response.  But you'll never see it, it's my beach and you're not there.

If we're really going to talk, you'll have to drive your boat over to my island and stay awhile.  But what a big decision that is for you!  Why would you spend your tourist dollars on my island when there are so many other islands to choose from?  And some of them are much larger, and have many more tourists!

Every island starts out abandoned, with just a lone dictator.  If that dictator is willing to shill for tourists through aggressive marketing, he might attract a bit of a crowd.  But the dictator will still be the dictator and the tourists just tourists.  That's not a great format for interesting conversation...

The little islands model just isn't conducive to building community and having great conversations.  Twitter isn't either, but for different reasons.  And Facebook?  Well it's Facebook.  G+?  *cricket cricket*.  If there's a solution to this problem I don't know what it is.  But I'm pretty sure the solution isn't blogs.

Steve Jobs on Experience

"A lot of people in our industry haven’t had very diverse experiences. They don’t have enough dots to connect, and they end up with very linear solutions, without a broad perspective on the problem. The broader one’s understanding of the human experience, the better designs we will have." - Steve Jobs, Wired, February, 1996

Monday, August 6, 2012

Chores

I was thinking about chores today.  Maybe a strange thing to spend much time thinking about, sometimes your mind wanders to weird stuff when you're not paying attention.  Anyway, I was thinking about chores; specifically household chores.  You know, stuff like:
  1. Vacuuming
  2. Doing the dishes
  3. Unpacking the groceries
  4. Taking out the trash
  5. Putting stuff away
These chores kind of fall into a few different categories:
  1. Regularly recurring (trash)
  2. Sporadically recurring (groceries, dishes)
  3. Uncompelled recurring (vacuuming, putting stuff away)
With regularly recurring you have to take out the trash on trash day, every trash day, on the same day every week.  Sporadically recurring is, well, sporadic.  Some nights there aren't any dishes to do.  But when there are dishes to do, those dishes have to be cleaned.

But uncompelled recurring is different.  There's no fixed external requirement that forces you to vacuum the floors, or clean all the stuff off the coffee table.  You could do these things on a regular schedule, but that would simply be your option, it's not an innate requirement.  And unlike the sporadically recurring chores, the line at which the chore must be done (the dishes are dirty) is not as clear (the floor is dirty?).  How dirty does the floor have to be before I *have to* vacuum it?  How much stuff must be laying on the coffee table before I clear it off?

I'd like to illustrate another interesting thing about the uncompelled recurring category with a story.  When I was growing up, my dad used to harp on my brother and me about putting stuff away after we'd used it, especially tools.  This was one of those classic dad things: he NEVER put his tools away, but he'd be on our case to clean up our stuff.  One day we'd just finished some project around the house, and he was going into the whole "let's get this cleaned up" routine, but then he did something different than usual and explained why he was on our case about it.  He said that he personally had the bad habit of leaving stuff out, which not only meant stuff was cluttered but also meant he could never find anything when he needed it, and he hoped that he could instill in us a better habit, which he wished he had himself, of keeping everything in its right place, so we wouldn't have the same trouble.  And to this day, I'm pretty fastidious about putting stuff back where it belongs, especially tools.

I take a couple of things away from that.  One is, explaining your motivation can be a more persuasive and effective method than just telling people to do something.  But the one that's relevant to this discussion is that with uncompelled recurring chores, you don't have to wait for the chore to pile up and do it all at once, you can proactively do a little bit of the chore a lot.

Can you believe I just wrote a whole blog post about chores?  Ridiculous!  But my point is really simple, keeping code clean is a lot like keeping a house clean.  It's a chore, and different parts of it may fall into the same categories.  But I think it's clear that most code related chores are of the uncompelled recurring category.  That means there is no clear event at which the chore must be done (like trash day).  And there is no obvious state which forces your hand (like dirty dishes).  Which means it's all discipline.  But also means you can do a little bit all the time and stay fairly well on top of it.

To be honest, I think this is a better metaphor than "technical debt."  Which is really too bad, because I hate doing chores.

Monday, July 2, 2012

The Illusion Of Simplicity

If there is one thing I've learned in my 11 years (holy crap 11 years?!) of developing software professionally it's that nothing is simple, not even the simple things.

I've come to understand that a big part of the reason for this has to do with the way our brains work.  We're capable of holding different and inconsistent mental models of the world in our heads at the same time, switching back and forth between them, and we're not even aware of it.  This is why when you ask a client, "how do you do <thing>?", they say "we do x, y, z", but later you find out that what they actually do is more like "if a then x, y, z; if b then y, z; if c then x, z; if d then w".

Sometimes we think our job is to discover and implement this complexity.  But our job is actually more than that.  Yes, we need to discover and model all that complexity, but our most important job is to then hide it away behind a simple facade, maintaining our user's illusion of simplicity.

And lest you blame this on stupid users, we suffer from the same problem when implementing algorithms!  At least I do.  This is one of the things I struggled with in solving the Word Ladder problem.  I kept oversimplifying the problem, which showed up when my attempted solutions ran into some condition they didn't account for.  Actually, I was lucky those solutions didn't work.  Sometimes a solution that doesn't really understand the problem does work, but as the problem changes over time, that solution rapidly degrades.  It's the same issue: I allowed my illusion of simplicity to cloud the true depths of the problem.


Embrace the user's illusion of simplicity!  Fight your own!

Monday, June 25, 2012

Language Envy

At work, I build software in C#.  At home I play with languages like Ruby and F#.  I still believe C# is an amazing language.  It has seen enormous advances in .NET 3.5, 4, and the upcoming 4.5.  And I suspect a lot of the people who claim they don't like it probably haven't used it since 2.0.

But for how wonderful it is, there are still lots of great things about some of the other language paradigms out there, and I definitely have a bit of language envy for some of their features.  I'll list some of these features, in context with why I think they're useful.  And though I'm not a language designer, I'll also mention how I could see C# accomplishing some of this.  Eric Lippert eat your heart out!

Dynamic Envy:
Constructor Stubbing
I'm a big believer in TDD, which requires object mocking.  But mocking in a static language can be annoying because it requires:
  1. Defining an interface for every class that must be mocked
  2. Injecting an instance of the interface into the objects that use them
This is why IoC is so popular.  And while this is annoying (I've written enough about this in the past), it works.  One of the most painful things about this, at least for me, is that you can't call a constructor, which precludes simple code like: var thing = new Thing(otherThing);  Instead, a factory class has to be introduced: IThingFactory.New(otherThing) : IThing.  And you inject an instance of IThingFactory into the class that wanted to call Thing's constructor.  UGH!

But in a dynamic language, none of this is necessary.  Anything can be mocked, and the most paradigm shifting example of this is mocking the constructor of a class so it returns a stub!
MyClass.Stub(:new) {...}
Something like this can be accomplished in .NET through IL re-writing, as seen in the new Fakes (previously Moles) from MS in what they call Shims.  But I haven't used this yet, because when it was Moles, it required the code to be run in an "instrumented" process, which couldn't be accomplished with the NUnit GUI.  I think this is still true in Fakes.  But I hope this Fakes framework keeps getting some attention, because this is just what I've always wanted:  To be able to tell my runtime, "Instead of the class name "MyClass" actually meaning "MyClass", I want it to mean "MyStubClass" for the duration of this test."
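Here's a hand-rolled sketch of what constructor stubbing buys you in a dynamic language (mocking libraries like RSpec package this up as allow(Thing).to receive(:new); the Thing and StubThing classes here are invented for illustration):

```ruby
# Temporarily replace Thing.new so code under test receives a
# stub, then restore the real constructor afterwards.
class Thing
  def greet
    "real thing"
  end
end

class StubThing
  def greet
    "stub thing"
  end
end

def with_stubbed_new(klass, stub)
  original = klass.method(:new)
  klass.define_singleton_method(:new) { |*_args| stub }
  yield
ensure
  # Always put the real constructor back, even if the test raises.
  klass.define_singleton_method(:new, original)
end

with_stubbed_new(Thing, StubThing.new) do
  Thing.new.greet   # => "stub thing"
end
Thing.new.greet     # => "real thing"
```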

Partial "Compilation"
Sometimes when doing TDD or refactoring you'll want to change the public API of a class.  Maybe by changing a method name, or the parameters a method takes, etc.  If you have many calls to that method, this will cause lots of compiler errors in a static language.  And this makes it impossible to TDD out the API change without fixing all the calls first.

Dynamic languages don't suffer from this problem because they don't have a compilation step.  Only the files you load at execution time are interpreted, and only the lines that are actually executed will fail due to API signature issues.  This can be a blessing or a curse, but in the example I gave above it's a blessing.

I would love to be able to tell my compiler, "I'm only going to be executing the tests in this one file, so just compile that file and it's dependencies and leave everything else alone, K? Thx!"

Not only would that allow me to update my API calls one at a time, running tests along the way, but it might also speed up the code -> compile -> test loop!

Sentinel Values
Gary Bernhardt used this technique in the Sucks/Rocks series in Destroy All Software.  It's a technique for dealing with null, but it's not the same as the Null Object pattern.

As an example, let's say you were implementing a solution to the Word Ladder problem.  What should the "Ladder FindLadder(string startword, string endword)" method return when it doesn't find a ladder?  In C#, it would return null.  The only allowable values for an object of type "Ladder" are a Ladder instance or null.

Since a dynamic language infers types at runtime, the method doesn't have a typed return value, so you can return anything you want.  The Sentinel Value technique takes advantage of this and instead of returning nil, it returns an instance of a NoLadder class.  NoLadder is an empty class with no methods or fields or properties.  How is this different than returning nil?  It's different in the exception you'll get.  Instead of "NoMethodError: undefined method `first' for nil:NilClass" you'll get "NoMethodError: undefined method `first' for #<NoLadder:0x000001010652f0>".

That's awesome!  It says right there that your problem is you're holding a NoLadder.  And NoLadder only comes from one place in your code, so you know exactly and immediately what the problem is.  Contrast that with a null reference exception, which could come from anywhere.
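A quick sketch of the technique (find_ladder here is a stand-in that pretends the search failed, not a real implementation):

```ruby
# Return a named sentinel instead of nil when no ladder exists.
class NoLadder; end

def find_ladder(start_word, end_word)
  # ...search omitted; pretend nothing was found...
  NoLadder.new
end

result = find_ladder("nice", "mile")
begin
  result.first
rescue NoMethodError => e
  e.message   # mentions NoLadder, pointing straight at the cause
end
```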

In a static language we can approximate this with the Null Object pattern by creating a Ladder singleton instance called NoLadder.  But this is not the same thing.  The Null Object pattern usually returns an instance which wouldn't cause an exception, but instead would do nothing.  Personally, I've always found this a bit confusing and scary, especially if the object returned would normally have behavior.  The other major difference is that there may be times when the sentinel is very specific to a given function, and defining a null object on your class for just one little function isn't very cohesive.

In C#, null is not an object like it is in Ruby.  But if it WAS, maybe we could do something like:
public class NoLadder : Null { }
Then we could return "new NoLadder()" in place of null.  Crazy I know, but the ability to put a name on null would be huge!

Metaprogramming
Just look at ActiveRecord and you'll see the amazing power of Metaprogramming in Ruby.  C# has reflection, but it can't come close to Ruby's ability to generate types at runtime.  Ruby style metaprogramming can't exist in a static language, so I think what I really want instead is Macros.

Or if not full fledged Macros, then F#'s type providers.  If you haven't seen these yet, you should totally take a look.  They're amazing!  They generate types at compile time.  That may not seem too exciting, until you see the IDE integration...  They generate types AS YOU TYPE!  It's very cool, and you never need to re-generate a generated code file or anything, it's totally seamless.  Like ActiveRecord, but with compile time types!
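For a tiny taste of the kind of metaprogramming ActiveRecord is built on (the Record and Person classes are invented for illustration), here's Ruby generating real methods at runtime from a list of names:

```ruby
# define_method creates real methods at runtime; ActiveRecord
# does something similar using your table's column names.
class Record
  def self.attributes(*names)
    names.each do |name|
      define_method(name) { @data[name] }                   # reader
      define_method("#{name}=") { |value| @data[name] = value }  # writer
    end
  end

  def initialize
    @data = {}
  end
end

class Person < Record
  attributes :name, :email   # reader and writer methods appear
end

person = Person.new
person.name = "Ada"
person.name   # => "Ada"
```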

Functional Envy:
No Nulls
What's the most common exception you encounter in a static language like C# or Java?  NullReferenceException.

F# does have null, but only for interoperating with .NET.  F# code itself doesn't allow null; values must be present at all times.  This is accomplished with option types, which are similar to .NET's Nullable.  So with enough discipline, you could do this in C#.  But I think it would be awesome if you could turn on a compiler switch in C# to disallow null values.

I should point out that this is a much more strict form of the Sentinel Values mentioned above.  Sentinel Values still blow up when you hit them, they just make it easy to understand why.  No Null prevents the blowup entirely.  Between the two, I think I'd go for No Nulls because it completely eliminates an entire class of programmer error.  However, the cost is some more syntax noise.  Nullables aren't pretty:
var ladder = FindLadder("nice", "mile");
if (ladder.HasValue)
  PrintLadder(ladder.Value);
else
  PrintNoLadder();
Sentinels would be "nicer" and "more elegant" and more "aesthetically pleasing" in cases where the null is a rare degenerate case.

Also worth noting is that while the compiler doesn't exactly have that "no nulls" switch I mentioned, you can approximate this with Code Contracts and static checking.  I haven't tried this yet, but it looks pretty useful and it's on my list.

Immutability
After NullReferenceException, I'd wager the next most common programmer error stems from side effects: "How did the value of this variable change?!"  As I've been studying and practicing functional programming I've been stunned by how ingrained mutable side-effect programming is in my head.  Simply put, I think that way, and thinking any other way is very difficult.

I'm not convinced yet that immutability is better for all problems in all places, but I am convinced that it's a less bug prone way to develop.  So I wish C# had widely used immutable data structures.
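C# aside, Ruby at least lets you opt in with freeze (raising FrozenError on Ruby 2.5+); a small sketch of the failure mode it turns into a loud, local error:

```ruby
# A frozen value can't be mutated; you derive new values instead.
origin = { x: 0, y: 0 }.freeze

begin
  origin[:x] = 5             # the side effect is caught at its source
rescue FrozenError => e
  e.class                    # => FrozenError
end

moved = origin.merge(x: 5)   # build a new hash instead of mutating
moved   # => {:x=>5, :y=>0}
```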

Of course, what this really means is you're writing recursion instead of loops.  I find recursion to be more declarative, especially when combined with pattern matching, which I like.  But I also find it requires more thinking to understand, which makes it "harder".
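For example, summing a list recursively instead of with a loop; the "loop state" lives in the call rather than in a reassigned local:

```ruby
# No mutable accumulator: each call handles the head and
# delegates the rest of the list to a recursive call.
def sum(list)
  return 0 if list.empty?
  head, *tail = list
  head + sum(tail)
end

sum([1, 2, 3, 4])  # => 10
sum([])            # => 0
```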

Pattern Matching
I don't think I really need to say much here.  Pattern Matching is just completely and totally awesome.  It requires dramatically less code, is much easier to understand and read, and just generally rules.  However, it does require more advanced language-integrated data structures.
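Ruby has since grown structural pattern matching too (case/in, stable in Ruby 3.0+), which gives a flavor of what an F# match expression reads like:

```ruby
# Destructuring a list the way an F# match expression would:
# each `in` clause both tests the shape and binds names.
def describe(list)
  case list
  in []            then "empty"
  in [x]           then "just #{x}"
  in [head, *tail] then "head #{head}, #{tail.length} more"
  end
end

describe([])         # => "empty"
describe([7])        # => "just 7"
describe([1, 2, 3])  # => "head 1, 2 more"
```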

Advanced Integrated Data Structures
Can we get a freaking dictionary syntax?!

Admittedly, F# doesn't have one either, but F# DOES have a beautiful list and tuple syntax, which can be used to easily get you a map:
let d = [1, 'a'; 2, 'b'; 3, 'c'; 4, 'd']
let m = Map.ofSeq d
I desperately wish I had nice syntax for basic data types like this.  This is one of the things I love about powershell too!  And when you have pattern matching that also understands this syntax, that's an amazingly expressive and powerful mix!
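For what it's worth, Ruby's hash literal is roughly the syntax being wished for here, and (in Ruby 3.0+) it composes with pattern matching as well:

```ruby
# A first-class map literal, no ceremony required.
m = { 1 => "a", 2 => "b", 3 => "c", 4 => "d" }
m[3]   # => "c"

# Hash patterns destructure and bind in one step.
config = { name: "Ada", role: :admin }
case config
in { role: :admin, name: }
  "admin: #{name}"   # => "admin: Ada"
end
```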

So there's a sampling of some of my language envy.  I'm sure there are others I forgot to include, and I bet you have yours too, so leave 'em in the comments, or on your blog, or tweet at me!

Wednesday, June 20, 2012

Don't trust your instincts

Ours is a young discipline with no rules or laws except the ones we choose for ourselves.  We are still living in the wild wild west of software development.  There is no exam to pass to become a software developer.  There are no standard evaluation procedures, or checks and balances.  If you can make your software execute, you can install it, host it, and sell it.  This is one of the things I find very attractive about our industry, but also occasionally frustrating.

There have been many attempts to standardize engineering in the form of processes and methodologies.  But process is only a small -- and very uninteresting -- portion of building software.  Code itself is far more challenging, interesting, and diverse.  But it is very lacking in recognized rules or techniques or principles or even just ideas.  Certainly there are some code principles and practices, but how successful or accepted are they?

It's hard to get a true sense of the opinions and practices of our industry, but there is clearly a very vocal minority that eschews "software engineering" practices in favor of a loosely defined aesthetic.  I'll use "software engineering" as a label for structured principles, patterns, and practices.  For example, consider the Gang of Four's design patterns, or Bob Martin's SOLID principles.  But the vocal minority, which seems to me at least to be getting increasingly vocal these days, would argue these concepts (patterns and principles) are more harmful than helpful.  That a better approach is to simply take the time to feel the pain in your code, and adjust, rewrite, and refactor as needed.

A really solid example of this argument being made can be heard in this Ruby Rogues podcast interview with DHH.  If you stick with it, the conversation covers a lot of really interesting topics including how DHH applies this thinking to rails and basecamp, YAGNI, thoughts on education and the necessity of stubbing your toe to learn, and more (Thanks to Lee Muro for referring me to that podcast).

I agree that stubbing your toe is a good teacher, but I don't think it's the only way to learn.  I agree that abstract concepts are easy to over use and misapply, especially after first learning about them, but I don't think that's inevitable.  While I find the refactoring and continuous learning part of this attitude very pragmatic, there is one element I do disagree with: the idea that we don't need abstract rules and principles and guidance and science.  That all we need is our sense of aesthetic.  The idea that by simply looking at some code, maybe comparing it to a different version, you can derive an intuitive understanding of which code is better.

I don't buy this, because I don't think that's how humans work, as outlined by Malcolm Gladwell's book Blink and this article by Jonah Lehrer.  I recommend them both, but if you're short on time, just read the Jonah Lehrer article as it's short and the most directly relevant.

Blink is all about the influence our subconscious mind has on us.  We like to think that we are rational and in full conscious control of what we do and what we think.  But Blink has plenty of research to prove that this simply is not the case.  We depend on our subconscious to make snap decisions and influence our general mood and thoughts much more than we realize.  And Blink goes to great lengths to present the fact that this can be both very powerful and harmful.  Your mind is capable of "thin slicing" a situation, pulling out many relevant factors from all the thousands of details, and coming to a conclusion based on those details.  But, not surprisingly, you need both extensive practice AND exposure to all the needed factors for this to work.  And it's worth mentioning that even when it does work, your conscious mind may never understand what it was your unconscious did to come to its conclusion!

You might read that and think, "Experts can use their unconscious to recognize good and bad code, the vocal minority is right!"  I believe that is true, but only on a local level.  When you look at code, you are always drilled into the lowest level.  I think you could intuit a fair amount at this level, but it's the higher concepts that have the larger influence, and I'm not sure you can effectively thin slice that stuff.  Many of the concepts of good architecture are about higher level structure: low coupling, high cohesion, SRP, ISP, DRY.  But if I showed you one code file and asked you to tell me if the application suffered from coupling issues, you wouldn't be able to say.  And that's because you haven't been provided with enough information.  And without that information, how can you possibly thin slice your way to an intuitive understanding of good code?  I worry that a focus on "aesthetic" and "elegance" leans too heavily on this intuitive feel for code, and carries a serious risk of leading you down a path that feels smooth and easy, but ultimately leads straight into the woods.

But I would take this argument even further.  Jonah Lehrer's article tells a story of a psychology experiment that went something like this.  Study participants were shown two videos, each showing two different-sized balls, one larger than the other, falling toward the ground.  In one video the balls hit the ground at the same time, and in the other the larger ball hit the ground first.  The participants were asked which video was a more accurate representation of gravity.



And the answer is: the video where they hit the ground at the same time is the correct one.  This is not intuitive; most of us would expect the larger ball to hit first, so the way the world actually works comes as quite a surprise.  But where this gets interesting is in the second part of the study.  This time, the participants were all physics majors, who had studied this and learned the correct answer.  The participants' brains were monitored with an fMRI machine, and the researchers discovered that in the non-physics majors a certain part of the brain associated with detecting errors was lighting up, the "Oh-shit! circuit" as Jonah calls it.  When they saw the video of the balls hitting the ground at the same time, their brains raised the bullshit flag.  So what was different about the physics majors that allowed them to get the right answer?
But it turned out that something interesting was happening inside their brains that allowed them to hold this belief. When they saw the scientifically correct video, blood flow increased to a part of the brain called the dorsolateral prefrontal cortex, or D.L.P.F.C. The D.L.P.F.C. is located just behind the forehead and is one of the last brain areas to develop in young adults. It plays a crucial role in suppressing so-called unwanted representations, getting rid of those thoughts that aren’t helpful or useful.
This other section of the brain allows us to override our intuitive primal expectations, the Oh-shit! circuit, and replace them with learned ones.  But in order for this circuit to work, you must have studied and learned the material!  Which requires that there be something to learn!

The connection to the aesthetic, instinctive approach to software should be pretty clear.  If you shun what "science" our industry has to offer, however admittedly weak and young it may be, you're not training your brain to suppress the intuitive but worse-for-you-in-the-end code!

So I think it's important to be cautious when relying on your intuition and sense of aesthetic, especially in an industry as young as ours with so little widely accepted guidance.  We need to follow that pragmatic approach of continuing to learn, but at the same time we have to continue to question our intuition.  And just as important, we should take the science/engineering of our industry seriously, even while recognizing its limitations.


Software is hard, be careful how much you trust your instincts!

Monday, June 11, 2012

Word Ladder in F#

Anthony Coble sent me his solution to this in Haskell before his Haskell talk at Burning River Developers.  The problem was to find a Word Ladder between two words.  A word ladder is a chain of words, from a start word to an end word, in which each word in the chain is a real word and is one letter different from the word before it.  The goal is to find the shortest ladder between the start and end word.  It's a neat problem, made even neater by the fact that it was invented by Lewis Carroll!  So I couldn't resist giving it a try in F#.  Fortunately, I don't know Haskell, so I couldn't understand Anthony's solution and had to figure it out all on my own.

Speaking of which, this is a fun problem.  So if you want to give it a try, you should stop reading here and come back after you've solved it!

I found this problem to be surprisingly difficult.  I kept coming up with ideas, but inevitably they were oversimplified and wouldn't work.  I was visualizing it as a tree.


It's an interesting problem because it has three different cases to consider.  It's normal for a problem to have two cases, like the first item in the list, and every other item.  But at least with the way I approached it, this problem had three cases: the first node (which has no parent node), the nodes in the first level of the tree (which all have the same parent), and every level after that (where there are many different parents).  This kept causing me to come up with ideas that only worked for the first level, or ideas that worked for one of those levels and not the others...

I knew that I needed a breadth-first search, but I was really struggling with how to implement it while also keeping track of the path to each node.  Usually a breadth-first search effectively just loops over all the nodes in each level, and then calls the next level recursively.  But I need to know what the parent of each node is, and what that parent's parent is, and so on up to the root.  My solution to this was to represent each node as a tuple containing the list of its full path from the root to that node, and the word of that node.  This is what I passed through the recursive calls, so I always knew the full path to each node.  This was simple, but has the downside of wasting memory, since it duplicates a significant portion of the parent list on each node.

Another interesting element of this solution is that I prune words out of the tree that I've already seen, like the two red nodes in the picture above.  This means I don't have to worry about "infinite loops" (they're not really loops, but you get what I mean) in the graph, so I can search until I run out of nodes.  And it's safe because I need to find the shortest ladder, so if I've already encountered a word earlier in the graph, that same word can't be part of the solution later in the graph.
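To make that concrete, here's a rough sketch of the same approach in Python (my own throwaway names, not the code from the repo): a breadth-first search where each queued node carries its full path from the start word, with a seen-set pruning words already reached.

```python
from collections import deque
from string import ascii_lowercase

def find_ladder(start, end, words):
    """BFS for the shortest word ladder; each queue entry is a full path."""
    words = set(words) | {end}          # make sure the target counts as a word
    seen = {start}                      # prune words we've already reached
    queue = deque([[start]])            # a node is the path from the start word
    while queue:
        path = queue.popleft()
        word = path[-1]
        if word == end:
            return path                 # BFS means the first hit is shortest
        for i in range(len(word)):
            for c in ascii_lowercase:   # try every one-letter change
                child = word[:i] + c + word[i + 1:]
                if child in words and child not in seen:
                    seen.add(child)
                    queue.append(path + [child])
    return None                         # no ladder exists
```

For example, `find_ladder("cat", "dog", {"cot", "cog", "cat"})` walks cat, cot, cog, dog.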

The git repo with my solution is here: https://github.com/kberridge/word-ladder-kata-fs
And here's the code:


Here are some notes on some of the things I found interesting about this code:

  • The array slicing code in the findChildren function was fun.
  • And using the range operator in the generateCandidates ['a'..'z'] was fun too.
  • The findValidUniqueChildren function is an excellent example of something I've been struggling with in F#.  What parameters does this function take?  What does it return?  It's NOT easy to figure this out, is it?  This is also the first time I've used function composition for real!
  • Notice in the queuechildren method how I'm using the concatenate list operator: "let newSearchNodes = searchNodes @ childnodes"?  The Programming F# book says if you find yourself using this it probably means you're doing something in a non-idiomatic way...  I suppose I could have written buildNodes to append the nodes to a provided list, but that seemed awkward.
  • The match of findLadderWorker is pretty typical, but demonstrates another little pattern I keep finding.  For the second match, I created a function to call since it's longer than 1 line, so I had to make up a name.  I went with "testnode" which I don't really like, but I had to name it something!
Here's Anthony Coble's Haskell solution.  His is very different and shows a great example of a different way to approach the problem.


If you solve it, in any language, send me the link to your solution and I'll add it to the post!

Wednesday, June 6, 2012

Book: Object Thinking



Object Thinking by David West
My rating: 2 of 5 stars

There were two things I really enjoyed about this book.  The first was the discussion of different schools of thought in philosophy and how those ideas appear in software.  The second was the history sidebars that introduced different computer scientists and explained their contributions to the field.

The basic thrust of the book was simply that you should write your applications as a bunch of objects whose intercommunication results in the emergent behavior of your application.  And further, that your models should attempt to model the real world objects and concepts of your domain.

That's great and all, but the book provides no concrete examples.  None.  And it makes a huge number of assertions about how much better this approach is and how everything else is inferior, but with nothing to back those statements up.  Nothing.

So in the end, I'm left feeling like there are probably some good ideas in there, but I'm totally unconvinced that the author has ever written a real business application.  And further, I think he might be just a grumpy old dude who's sad that Smalltalk lost out to more mature and practical languages like C++ and Java.

View all my reviews

The primary things I found interesting and took away from this book are:

Hermeneutics
"According to the hermeneutic position, the meaning of a document—say a Unified Modeling Language (UML) class diagram—has semantic meaning only to those involved in its creation".  The author argues that XP methods are influenced by Hermeneutics and are therefore better suited to software creation than traditional software engineering formal methods.  "One of the most important implications was the denial of “intrinsic truth or meaning” in any artifact—whether it was a computer, a piece of software, or a simple statement in a natural language. This claim is also central to the school of thought that has been labeled postmodern. It is also one of the core claims of all the hermeneutic philosophers."

Traits of Object Culture
  • A commitment to disciplined informality rather than defined formality
  • Advocacy of a local rather than global focus
  • Production of minimum rather than maximum levels of design and process documentation
  • Collaborative rather than imperial management style
  • Commitment to design based on coordination and cooperation rather than control
  • Practitioners of rapid prototyping instead of structured development
  • Valuing the creative over the systematic
  • Driven by internal capabilities instead of conforming to external procedures
Given how little of the rest of the book I was able to buy into, I was surprised by how closely this list of culture traits aligns with my own ideals.

Emergent Behavior
"Traffic management is a purely emergent phenomenon arising from the independent and autonomous actions of a collectivity of simple objects—no controller needed."  This is really the core of what the entire book is arguing for.  That the behavior of the system should emerge from the communications between simple objects.  It's a very interesting concept.  But I'm not 100% sure it's one I'm ready to totally buy into.  He uses a model for an intersection with cars and a traffic light as an example.  The traffic light doesn't know anything about the cars.  It's just a glorified timer that notifies any subscribers by lighting different colored lights.  Cars, in turn, don't know anything about the other cars, or the other streets.  They just monitor the traffic light.  There are two huge benefits I see to this.  First, the loosely coupled nature of the design allows you to introduce new kinds of cars (trucks, motorcycles, even pedestrians!) without changing any of the other participating objects.  And second, it allows arbitrarily complicated intersections to be modeled without requiring any complex code.
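To make that decoupling concrete, here's a tiny sketch (hypothetical names, not code from the book) of a light that only notifies subscribers, and cars that only watch the light:

```python
class TrafficLight:
    """A glorified timer: knows nothing about cars, just notifies subscribers."""
    def __init__(self):
        self.subscribers = []
        self.color = "red"

    def subscribe(self, subscriber):
        self.subscribers.append(subscriber)

    def change(self, color):
        self.color = color
        for s in self.subscribers:       # broadcast; the light doesn't care who listens
            s.on_light_changed(color)

class Car:
    """Knows nothing about other cars or streets; only watches the light."""
    def __init__(self, light):
        self.moving = False
        light.subscribe(self)

    def on_light_changed(self, color):
        self.moving = (color == "green")
```

Adding trucks, motorcycles, or pedestrians just means another class with an on_light_changed method; no existing participant changes.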

But in the back of my head, I'm always a little bit nervous about this...  The fact that the behavior is emergent is a benefit, but also a drawback, because there is no one code file you can read that will describe the behavior.  You must figure it out by running simulations in your head of how all the participants interact.  There are certain problems where this clearly would not be acceptable, and Object Thinking does make this point: "Specific modules in business applications are appropriately designed with more formalism than most. One example is the module that calculates the balance of my bank account. A neural network, on the other hand, might be more hermeneutic and object like, in part because precision and accuracy are not expected of that kind of system."  So the bigger question for me, not addressed in the book, is what types of problems this emergent approach would be acceptable for.  Ultimately I suspect it's a razor's-edge issue: at some point the complexity of the solution may make the switch to emergent behavior result in simpler and more understandable code.

Monday, June 4, 2012

Finding Connected Graphs in F#

I've been learning F# and functional programming.  I first got interested in functional languages when Ben Lee introduced me to Erlang at CodeMash by doing the greed kata with me.  Then I bought Programming F# and started to learn the F# language.  As I was going through the book, I was playing with a little Team City/FogBugz integration script in F#.  Then Anthony Coble told me he wanted to do a talk at Burning River Developers on Falling In Love With Haskell.  His talk was great, and Haskell looks like a mind-blowing language.

So that is how I got interested in playing with this stuff.  In order to actually learn it, and not just read about it, I've been finding and inventing little exercise problems and solving them in F#.  F# is a multiparadigm language, but I've been focusing on the pure Functional portions of the language for now.  I thought I'd share these problems and my solutions on the blog.  The best thing that could come from this is if you, dear reader, would also solve these problems in the language of your choice and share back your solution.  Or if you know F# but don't want to actually fully implement a solution to these, I'd love feedback on how you might have solved the problems differently.

The first problem I want to share is a Graph parsing problem:
Given a graph G, return a list of all the connected graphs in G: [G1, G2, ...].
Here's an example input graph and the expected output:
The git repo with my solutions is here: https://github.com/kberridge/find-connected-graphs-kata-fs
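If you don't read F#, the core idea can be sketched in Python (illustrative names only, not the repo's code): walk outward from a node collecting everything reachable, then start a new component from any node not yet seen.

```python
def connected_graphs(edges):
    """Split an undirected graph, given as (a, b) edge pairs, into connected components."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    seen = set()

    def walk(node, component):
        # depth-first walk collecting everything reachable from node
        if node in seen:
            return component
        seen.add(node)
        component.add(node)
        for neighbor in adj[node]:
            walk(neighbor, component)
        return component

    # start a new component from every node we haven't reached yet
    return [walk(node, set()) for node in adj if node not in seen]
```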

And here's the code:

Some of the things I really enjoy about this code:

  • I love nesting simple little functions inside other functions to give names to boilerplate code (ex: isSeen and notSeenOnly)
  • The recursive walk method is stunningly elegant
  • "Map.ofList g" is a cool line of code.  g is a Graph record, but it's defined as a list, and so can be treated like a list.
  • List.collect combines all the returned lists into one list
  • It's also cool that the map variable is in scope inside of the walk function because it's closed over from the findConnectedGraph method.
  • The recursive list comprehension of findAllConnectedGraphsWorker totally blows my mind.  Using yield as expected, but then calling itself with yield! is crazy!
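That yield/yield! pair has a close Python analog in yield and yield from, if that helps demystify it; here's a generic illustration (not from the kata):

```python
def countdown(n):
    """Recursive generator: yield one value, then splice in the recursive call."""
    if n >= 0:
        yield n                       # like F#'s yield: emit a single item
        yield from countdown(n - 1)   # like F#'s yield!: splice in a whole sequence
```

So `list(countdown(3))` produces `[3, 2, 1, 0]`, with each recursive call's output flattened into one sequence.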
I'm sure there is a lot about this code that could be improved.  There are probably better algorithms too. I'd love to hear your ideas and read your implementations of this problem!

Wednesday, May 30, 2012

Book: Windows PowerShell in Action


Windows PowerShell in Action by Bruce Payette
My rating: 5 of 5 stars

One of the most enjoyable specific-technology-focused books I've ever read.  Usually books that teach you a language or a framework are pretty dry and uninspiring, but this one was great.  The examples used are good at illustrating the points without going overboard.  But by far my favorite parts were the little asides where the author explains difficult design decisions the PowerShell team had to make.


View all my reviews

Tuesday, May 29, 2012

Minor Issues: Query Results vs. Models

I want to take a look at a minor issue that crops up in a very common application structure where you have a list of data, possibly from a search, that the user selects from to view details.

There are some minor issues that must be addressed, and they all have to do with queries, especially when we're dealing with SQL.  There will be a query that returns the list by gathering all the data, maybe doing some formatting, and joining to all the relevant tables.  For example, if it's a list of books it will return the title, publish date, author (join to author table; format name), and genre (join to genre table).

Apart from listing the books, the app also needs to be able to add new books.  This will work as follows:
  1. A dialog pops up with all the fields to fill-in
  2. On save, if everything validates, the book is saved in the database
  3. The new book is added to the list with AJAX (did I mention it's a web app?)
Since I don't want to leave you hanging, here are the "minor issues" I'm going to look at:
  • Query performance (N+1 Select)/Query complexity
  • Formatting logic
  • Type conversion
To illustrate my points, I'll use the Active Record pattern.  Using the book example, a naive implementation of the query might look like this:
var books = Books.All();
foreach(var book in books) {
  // display the data by accessing it this way:
  book.Title
  book.PublishedDate.ToString("MM/dd/yyyy")
  book.Author.FormattedName
  book.Genre.Name
}
Some things to note about this code:
  • It suffers from the N+1 Select problem because for each book it does a query to lazy load the author and another query to lazy load the Genre (technically that's 2N+1 queries).
  • It formats the date with a .NET format string.
  • It formats the author name using the format logic built in to the Author class in the FormattedName property
The first is a serious issue that we *must* correct, but there isn't anything inherently wrong with the other two.  

Query performance/complexity
To fix the N+1 Select problem, eager loading could be applied.  Eager loading is an ORM feature that includes joins in your query and expands the results into referenced objects without a separate database call.  Entity Framework, for example, has a nice method called Include, so you could write .Include("Author").Include("Genre").  NHibernate allows you to define this as part of the mapping.

This solves the N+1 Select problem, and is generally good enough for a simple example.  But when the query is more complicated, using the ORM to generate the SQL can be troublesome.  And it's worth pointing out that written this way, the SQL will return all the fields from all the rows it joined to and selected from, even if only a small subset is needed.  This may or may not affect performance, but it will impact the way indexes are defined.

The N+1 Select problem can also be solved by not using Books.All(), and instead writing a SQL query to do the necessary joins and come back with only the required data.  There are two clear benefits to this:
  1. Using SQL directly means there are no limits on what features of the database can be used.  Plus, the query can be optimized however needed.
  2. Only the required data fields need to be selected, instead of all the fields.  And data from more than one table can be returned in one select without fancy eager loading features.
To represent the results, a Query Result class can be defined.  This class will be very similar to the AR models, but only contain properties for the returned fields.  
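To sketch what that looks like, here's a self-contained illustration using Python and sqlite3 (made-up schema and names, not code from a real project): one explicit joined query, and a result class holding only the returned fields.

```python
import sqlite3
from dataclasses import dataclass

@dataclass
class BookListResult:
    """Only the fields the list view needs -- not a full Book model."""
    title: str
    author_name: str
    genre: str

def fetch_book_list(conn):
    # One hand-written query with explicit joins: no lazy loading, no N+1.
    rows = conn.execute("""
        SELECT b.title,
               a.first_name || ' ' || a.last_name,
               g.name
        FROM books b
        JOIN authors a ON a.id = b.author_id
        JOIN genres  g ON g.id = b.genre_id
    """)
    return [BookListResult(*row) for row in rows]
```

The trade-off is exactly the one described above: full control over the SQL, at the cost of a second class that shadows the model.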

Formatting Logic
But this is where those two other bullet points from earlier come into play.  Remember how the date was formatted with a .NET format string?  In a custom query, this can easily be moved into the query result object.  It's the formatting of the author name that is going to cause some trouble.

Pretend there are three columns that represent name: FirstName, MiddleName, LastName.  There are three choices for how to format this into a single name display:
  1. Put the formatting logic in the select statement of the SQL query (duplicates the logic on Author)
  2. Put the formatting logic in a property of the query result object (duplicates the logic on Author)
  3. Refactor Author and call its method to format the name (awkward)
To explain, here's what Author might have looked like:
public class Author {
  ...
  public string FormattedName { get { return FirstName + " " + MiddleName + " " + LastName; } }
}
This formatting logic is coupled to the fields of the Author class, and so it can't be reused. To make it reusable, it could be refactored into a function that takes the fields as parameters. One way might look like:
public class Author {
  ...
  public string FormattedName { get { return FormatName(FirstName, MiddleName, LastName); } }
  public static string FormatName(string first, string middle, string last) {
    return first + " " + middle + " " + last;
  }
}
This is now in a format that could be used from within our query result object:
public class BookListResult {
  ...
  public string FormattedName { get { return Author.FormatName(FirstName, MiddleName, LastName); } }
}
Part of me loves this, and part of me hates it.

Type Conversion
The other issue that must be dealt with when using the Query Result approach involves the AJAX part of our scenario.  Remember how we wanted to add the book to the top of the list after the add?  Well, our view that renders the list item is going to be typed to expect a BookListResult, which is what the query returns.  However, after the Add, the code will have a Book instance, not a BookListResult.  So this requires a way to convert a Book into a BookListResult.  I usually do this by adding a constructor to BookListResult that accepts a Book, and that constructor then "dots through" the book collecting all the data it needs.
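That "dots through" conversion might look something like this (a Python sketch with hypothetical names; Python uses a classmethod where C# would use a second constructor):

```python
class BookListResult:
    def __init__(self, title, author_name, genre):
        self.title = title
        self.author_name = author_name
        self.genre = genre

    @classmethod
    def from_book(cls, book):
        # "dots through" a freshly saved Book, duplicating the query's
        # knowledge of which fields come from where
        return cls(
            title=book.title,
            author_name=book.author.formatted_name,
            genre=book.genre.name,
        )
```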

From a certain perspective, this can be viewed as duplicating the query logic because knowledge of what fields the QueryResult's data comes from appears in two places: once in terms of the physical SQL tables in the SQL query, and again in terms of the Active Record objects.

Yet somehow I still prefer the Custom Query approach to the eager loading approach...  I just like to have that absolute control over the SQL query.  The cost of the boilerplate code here is worth it to me if it means I can directly leverage the query features of my database (like row number, and full text, and CTEs and pivots, etc etc).

As in the last "Minor Issues" post (constructors and MVC controllers), I'd love to hear your thoughts or experiences with these patterns.

Thursday, May 24, 2012

Hg Bookmarks Made Me Sad

Branches
Hg's branches are "permanent and global", meaning they are fixed to the changeset, can't be deleted, and can't be left behind when you push changes on that branch.

This is in contrast to git's branches, which are temporary and are not part of the changesets.  I think of them as pointers.

It can be nice to have branch names on your commits, because it adds some meaningful context to the commits.  It makes understanding your history very easy.  The only downside that I am aware of is the potential for name collisions.  Someone might try to create a branch using a name that someone else had already used.  In which case you should really just use a different name...  If there are other downsides, I don't know what they are.

Workflow
However, it has always been the recommendation of the Mercurial team that branches be used for long lived branches, not short term topic branches.  Pre-2.0 they probably would have recommended using local clones, now they recommend using Bookmarks.  I've found local clones less than ideal for my workflow, which typically looks like this:
  1. Get latest changes
  2. Decide what task to work on next
  3. Create a branch-line to work on that task
  4. Hack hack hack
  5. If it takes awhile (>8hrs), merge default into the branch-line
  6. If I have to go home for the night and I'm not done, push branch to remote as backup
  7. Push branch-line to remote, have someone do a code review
  8. Land branch-line on default, close branch-line
Some things to note about this:
  • The branches are small and generally short lived, one per "topic"
  • I want to push them to the remote for backup purposes (in case my computer fries over night or something)
  • I want to push them to remote so others can collaborate
This is why named-branches are so much more convenient for me than local clones.

However, in a recent release of Mercurial, they added a new notification message when you create a branch which says: "(branches are permanent and global, did you want a bookmark?)"  So they couldn't be much more clear and in my face about the fact they think I should be using bookmarks for my workflow instead of named-branches.

Bookmarks
Bookmarks are basically Mercurial's version of git's temporary short-lived branches.  It means I'll lose the nice branch names on my commits in history.  But I won't have to worry about name conflicts.  This already doesn't seem like a worthwhile trade, but I'm willing to take the Mercurial devs' word for it and try it out.  Sadly I found them, in their current state (2.2.1), to be bug-prone and impractical.  For the remainder of this post, I'd like to explain what I don't like about them as they are now.  But since I don't want to have to explain the basics here, you should go read about them first: http://mercurial.selenic.com/wiki/Bookmarks.  I'd like to throw in one little caveat before I start, which is to say that it's totally possible I am misusing these things.  I sincerely hope that's the case and someone will point out a better way to me.  But I couldn't find any good real workflow examples of bookmarks, so I had to figure it out on my own.

Must create my own 'master' bookmark, everyone on the team must use this bookmark
When I create my first bookmark and commit on it, I've just created two heads on the default branch. I can easily find the head of my topic branch, it has a bookmark, but how do I find the head of the mainline?

Worse, say I publish the bookmark to the remote server and you do a pull -u.  You will be updated to my topic bookmark, because it's the tip.  That is NOT what either of us wanted.  I created a topic branch because I didn't want it to get in your way.  In fact, you shouldn't have to be aware of my branch at all!

So bookmarks are broken before we even get out of the gate.  The work-around is to create a 'master' bookmark that points at the mainline head.  Everyone on the team will have to be aware of this bookmark, and they'll have to be careful to always update to it.

Must merge 'master' with 'master@1' and bookmark -d 'master@1'
The next problem happens when you and I have both advanced the master bookmark.  In the simplest case, maybe we both just added a new changeset directly on master.  Let's say you push first, and I pull.  If we weren't using bookmarks, hg would notify me when I pulled that there were multiple heads on my branch and it would suggest I do a merge.  So I'd merge your update with my update and be back to only one head on the branch.

With bookmarks, it's more confusing.  Hg will notify me that it detected a divergent bookmark, and it will rename your master bookmark to master@1 and leave it where it was.  It will leave mine named master and leave it where it was.  Now I have to "hg merge master@1; hg bookmark -d master@1;"

As a side note here, I was curious how git handles this problem, since git's branches are implemented so similarly to hg's bookmarks.  The core difference is that git won't let you pull in divergent changes from a remote into your branch without doing a merge.  It's conceptually similar to renaming the bookmark to master@1, since what git technically does is pull the changes into a "remote tracking branch" (that's a simplification, but close enough), and then merge that remote tracking branch onto your branch.  But it has a totally different feel when you're actually using it.

Can't hg push, or it will push my changes without my bookmark
This is the most devastating issue.  If I have created a new topic bookmark and committed on it, and then I do "hg push", it's going to push my changes to the remote without my bookmark!  The bookmarks only get pushed when you explicitly push them with "hg push -B topic".  Which means if I'm using bookmarks, I can't ever use the hg push command without arguments, or I'm going to totally confuse everyone else on the team with all these anonymous heads.

It's true that as long as the team is using the master bookmark and their own topic bookmarks, they shouldn't really have any problems here...  But it's still totally confusing, and totally not what I wanted.

Suggestions
The Mercurial team feels very very very strongly about maintaining backwards compatibility.  So it's probably a pipe dream to hope that this might change.  But I have two suggestions on how these problems might be mitigated.  These suggestions probably suck, but here they are anyway.

Hg up should prefer heads without bookmarks
If I do a pull -u and it brings down a new head, but that head has a bookmark, hg up should update to the head WITHOUT the bookmark.  This would allow me to use bookmarks without them getting in the way of other members of the team.

I think it would also allow me to not have to create the 'master' bookmark.  When I wanted to land a topic bookmark, I would just do: "hg up default; hg merge topic; hg ci -m "merged topic";"  Since "default" is the branch name, hg would prefer the head without bookmarks, which would be the mainline.

Hg push should warn if pushing a head with a bookmark
This would be consistent with hg's treatment of branches.  When you hg push, if you have a new branch, it aborts and warns you that you're about to publish a new branch.  You have to do hg push --new-branch.  I think it should do the same thing for bookmarks.  This would prevent me from accidentally publishing my topic bookmarks.

I <3 Hg
I really like Mercurial.  Even in the hg vs. git battle, I tend to prefer hg.  I love how intuitive its commands are, I love how awesome its help is, I love its "everything is just a changeset in the DAG" model (vs. git's "you can only see one branch at a time, what's a DAG?" model).  And that's why bookmarks are making me sad.  Every time I create a branch, hg tells me I'm doing it wrong, but bookmarks are way too unfriendly right now (unless I'm missing something huge [it wouldn't be the first time]).

I still strongly recommend Hg.  If you're still using CVS, or Subversion, or heaven help you TFS, you should take a look at Mercurial.

And if you're a Mercurial expert (or a Mercurial developer!) please help me understand how to use bookmarks correctly!

PS.  I thought about drawing little graph pictures to help explain the issues I laid out here, but I don't have a decent drawing tool at my disposal, and I didn't think this rant really deserved any more time than I already put in.  Hopefully you were able to make sense of all these words.

Monday, May 21, 2012

Minor Issues: Constructors and MVC Controllers

Recently I've been getting into F# a little.  It's a really cool language which has been broadening my perspective on problem solving with code.  It's a .NET language, and is mostly compatible with C#, but it does do some things differently.  For example, it has more rigorous rules around constructors:
type Point(x : float, y : float) =

  member this.X = x
  member this.Y = y

  new() = new Point(0.0, 0.0)

  new(text : string) =
    let parts = text.Split([|','|])
    let x = Double.Parse(parts.[0])
    let y = Double.Parse(parts.[1])
    new Point(x, y)
It may not immediately jump out at you, but there are some really cool things here that C# doesn't do:
  1. This actually defines 3 constructors, the "main" constructor is implicitly defined to take in floats x and y
  2. Constructors can call other constructors!  Note the empty constructor, new().
  3. All constructors ARE REQUIRED to call the "main" constructor
I fell in love with this immediately.  This requirement forces me to be keenly aware of what my class's core data really is.  And it communicates that knowledge very clearly to any consumers of the class as well.  This was especially refreshing because I have found that since C# added object initializer syntax (new Point { X = 1.0, Y = 2.0 }) I've started writing far fewer constructors.  Constructors are boilerplate and annoying to type, so I largely stopped typing them.  But now that I have done that for a while and have a few real-world examples of classes without constructors, I find that I miss the constructors.  They communicate something nice about the core, most important data of the class that a whole lot of properties doesn't communicate.
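For comparison, Python can approximate this discipline (though it doesn't enforce it the way F# does) with classmethods that all delegate to the one main constructor; a quick sketch:

```python
class Point:
    def __init__(self, x: float, y: float):
        # the one "main" constructor: the class's core data is explicit here
        self.x = x
        self.y = y

    @classmethod
    def origin(cls):
        return cls(0.0, 0.0)       # like F#'s new(), delegates to the main constructor

    @classmethod
    def parse(cls, text: str):
        # like F#'s new(text : string): parse, then hand off to the main constructor
        x, y = (float(part) for part in text.split(","))
        return cls(x, y)
```

Every alternate construction path funnels through `__init__`, which is the property the F# rules guarantee.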

So, that sounds pretty straightforward: I should start writing constructors on my classes again.  And if I want to be really strict (which I kind of do), I shouldn't provide a default constructor either.  Then I'll be living in the rigorous world of class constructors, just like F#.

But this is where MVC Controllers finally make their first appearance in this post.  Because these controllers exert their own pressure on my classes and downright require a default (parameterless) constructor.  At least that's the case with the way my team writes our controllers.  Why?  Here's an example.

Let's talk about CRUD.  Typically there's a controller action we call "New", it returns an empty form so you can create a new record.  This form posts to a controller action we call "Create", which binds the form data to a model, and calls .Save().  We're using MVC's standard form builder helpers, which generate form field names and IDs based on the model expression you provide as a lambda.  This is how it knows how to bind the form back to your model on "Create".  But this means you have to pass an empty model out the door in the "New" to generate the empty form.  An empty model requires a default constructor!  So the code looks like this:
public ActionResult New()
{
  ViewData.Model = new Point();
  return View();
}

[HttpPost]
public ActionResult Create(Point point)
{
  point.Save();
  return View();
}
Obviously, real code is more complicated than that, but you get the idea.

And so I find myself with a minor issue on my hands.  On the one hand, I want to create F# inspired rigorous classes.  But on the other hand I want simple controllers that can just send the Model out the door to the view.  Alas, I can't have both, something has to give.

Obviously I could give up on the constructors.  Or, I could give up on passing my model directly to the view.  There are other approaches well documented in this View Model Patterns post.  The quick description is I could NOT pass my model, and instead pass a View Model that looks just like my model.  And then I'd have to map that View Model back onto my Model somehow...  But that comes with its own minor issues.
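To make the trade-off concrete, here's a minimal sketch of that View Model approach.  The names PointViewModel and ToModel are my own hypothetical choices, not anything from the patterns post:

```csharp
using System;

// The rigorous model: no default constructor, core data required up front.
public class Point
{
    public double X { get; private set; }
    public double Y { get; private set; }

    public Point(double x, double y)
    {
        X = x;
        Y = y;
    }
}

// The view model: default constructor and settable properties,
// so MVC's form helpers and model binder are happy.
public class PointViewModel
{
    public double X { get; set; }
    public double Y { get; set; }

    // The mapping step: the part that brings its own minor issues.
    public Point ToModel()
    {
        return new Point(X, Y);
    }
}
```

"New" would send a new PointViewModel() out the door, and "Create" would accept one back and call ToModel() before saving.  You keep the rigorous constructor, at the price of a parallel class and a mapping to maintain.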

So how about it?  How do you deal with this issue?

Monday, May 14, 2012

Powershell Listing SQL Table Columns

Powershell has an awesome utility called sqlps that both lets you execute SQL statements against a database and implements a file system provider over SQL.

One of the things I use this for all the time is to inspect the columns of a table.  Management Studio's tree view is terrible for this, especially compared to the flexibility of powershell which allows you to do things like:

  1. Sort the columns by name, or by data type
  2. Filter the columns by name, or by data type, or by nullable, etc
Here's a series of commands I use a lot that I thought was worth sharing:
  1. sqlps
  2. cd sql\localhost\default\databases\mydatabase\tables\schema.table\columns
  3. ls | where { $_.Name -notlike 'ColPrefix*' } | select Name, @{Name="Type"; Expression={"$($_.DataType.Name)($($_.DataType.MaximumLength))"}}, Nullable
That will display all the columns that DO NOT have a name starting with ColPrefix, showing each column's Name, Data Type (formatted like "nvarchar(255)"), and whether it allows nulls.

Enjoy!

Tuesday, May 8, 2012

Selfish Programmers: less flame-baity

Last post too flame-baity for you?  Fair enough!

It's far too easy to confuse "easy" with "simple."  Rich Hickey touches on this a bit in this presentation, which is very similar to another talk he gave that I blogged about earlier.  Almost everything he said in these talks was very thought-provoking for me.  But the one that really hit home the hardest was this concept of easy vs. simple.

The difference between easy and simple is rather hard to firmly pin down.  One way to think of it might be that easy means less effort.  Fewer keystrokes, fewer concepts.  The less I have to type, the easier it is to do something.  The less I have to know, the easier it is.  The fewer structures between me and what I'm trying to accomplish, the easier.

But easier doesn't necessarily mean simpler.  Hickey associates simpler with un-twisted.  So code that is DRY and SOLID would be simple, even if it requires more keystrokes, classes, and curly braces to write.

I find myself falling for this a lot.  Sometimes it might be simpler to do more work, but that's hard to see.  On the other hand, it's incredibly easy for me to judge how fun something will be for me to do, or how much tedious effort something will require.

The problem is that EASY is about me, where SIMPLE is about my code.  So the deck is stacked against us as software developers.  It's going to be difficult to separate what's easy for us from what's simple for our code and make the right design decision.

Being aware of this distinction is useful.  And I certainly wasn't as aware of it before watching Hickey's talk.  But it does raise an interesting question of how can we keep ourselves honest?  How can we notice when we're doing the easy thing instead of the simple thing?  While at the same time avoiding doing too much and over complicating?

Monday, May 7, 2012

Selfish Programmers

The biggest movement in software today is selfishness.

Ok, I've only been here for a short time, so what do I know, maybe it's always been like this.  And people being selfish doesn't really constitute "a movement" (though I wouldn't be surprised if some people would be willing to argue that our generation is a very selfish one, I'm not sure how you would prove that we're more selfish than previous generations were at our age...).

What DOES constitute a movement is the continuous push toward tools that make a programmer's job "easier."

Yeah, you got that right, I'm about to take a stance against making things easier.  Here are some examples of tech stuff that's supposed to "make things easier":
  • Visual Studio
    • WinForms/WPF designers
    • WCF
    • Solutions and project files
    • Entity Framework (database first)
    • Linq-to-Sql
  • Active Record
  • DataMapper (ORM)
  • Convention over configuration (I'm looking at you, Rails)
  • ASP.NET MVC's UpdateModel (and Validations)
  • All Ruby code ever written
This stuff is "easier" because it requires "less work".  Usually boring tedious work.  No one likes boring tedious work (except on Monday morning when you're really tired).  So naturally we try to get rid of that work.  There are different strategies for getting rid of it.  Microsoft likes to get rid of it by replacing it with drag and drop and magic tools that hide the details of what's going on so you don't have to learn anything.  Ruby on the other hand puts extreme focus on minimalism and pretty syntax, favoring as few keystrokes as possible.

But do we think about what that's costing us, or costing our future selves?  Nope!  We're selfish sons of bitches and all we care about is doing stuff that we enjoy and think is "elegant" with as few keystrokes and effort as possible!

We like drag and drop and magic tools because it saves all that time learning things and typing.  Unfortunately, it also dramatically reduces flexibility, so as soon as you step outside the boundary of the demo application, you find yourself stuck.  Now the complexity of your solution skyrockets as you hack around the limitations of the drag and drop and magic.

And we like minimalism, cause it feels like simplicity.  Our model not only looks exactly like the database, but it's automatically created by reflecting over the database, and then we send those models directly to the UI and mash them into our views!  IT'S SIMPLE!  Well, it's less code, but is it simple?  You've intimately twisted knowledge of your database through your entire code base, leaving you with the pressures of physical storage, UI layout, and controllers all tugging on the same objects.  Every time you remove a layer or a concept from your code to make it simpler, you run the risk of tangling concepts and paying for it in the long run (of course, every time you add a layer or a concept you run the risk of over abstracting; it's a balance).

In conclusion: stop being so selfish!  Sometimes the best way to do something isn't the most fun, or elegant, or easy way.  Toughen up!  Do the work!

Dogma-less TDD and Agile
TDD is about rapid feedback of the code to the developer.  Agile is about rapid feedback of the application to the development team.

Everything else is just BS.

Here are some of the things that fall into the BS category:
  • Up front design
  • "Architecture"
  • Project plans
  • Estimates/Story Points
  • Information Radiators
  • Team Velocity
  • Specifications
  • Code Reviews
  • QA
  • Approval processes
  • 100% Test Coverage
It's not that these things don't have a purpose, or aren't useful.  But they are all afflicted with varying degrees of BS (baseless guessing, built-in uncertainty, outright lying, and occasionally even complete denial of reality).

What most of these things have in common is team organization.  A one man team doesn't need this stuff.  But any more than one person, and you require some way of keeping everyone on the same page.  Especially if you are building software that not all of the teammates completely understand.  Without some kind of organization, people would be chasing their own ideas in all different directions.  And since they don't fully understand the "business," those ideas are likely to be wrong (or at least partly wrong).

Thus, teams need a certain amount of BS.  But I think it's important to remember the distinction.  The most important thing to delivering real value is feedback.  Feedback in code.  And feedback in features.  You need the BS, but apply it carefully, and try to keep the BS out of your feedback loops!