ArticleS.UncleBob.MakingMessesInRuby

Making Messes in Ruby

When you first learn Ruby you are thrilled with the power of the data types. All those cool arrays and hashes, and the fact that you can put anything in them including other arrays and hashes. You can create wonderfully complex structures loaded with all kinds of cool data.

For example, I've been working with the Harry Potter problem. To solve this you have to arrange and rearrange the groupings of books until you find the most advantageous grouping. So, for example, if you have five books named (appropriately enough) 1, 2, 3, 4, and 5, then you could group them like this: [[1,2],[3,4],[5]] or you could group them as [[1,2,3,4,5]], or as [[1],[2],[3],[4],[5]], or in any of a multitude of other ways. In every case please notice that the grouping is represented as an array of arrays.

This doesn't sound like a problem at first; and Ruby makes it very easy to do. However, it can lead to madness (MADNESS I TELL YOU!). Consider:
    def discountedPrice(bag)
      minPrice = nil
      minAllocation = nil
      forEachDiscountAllocation(bag) do |allocation|
        price = calculateDiscount(allocation)
        if (minPrice == nil || price < minPrice)
          minPrice = price
          minAllocation = allocation.dup
        end
      end
      minPrice
    end


Notice the allocation variable. Is it one of these arrays of arrays? That's the intent; but now look at how this affects the rest of the function. For example, the calculateDiscount(allocation) call; why isn't this allocation.calculateDiscount? Because allocation is an array of arrays, not an object. By using a primitive like this we have broken encapsulation, and coupled our module to the structure of the data.

In a statically typed language this happens too; but it's so explicit that we don't notice it as much. When you declare a variable in C++ to be int[][] allocation, you know it's an array of arrays (..er.. well, sort of..). Indeed, the compiler won't let you deal with it in any other way. But in Ruby this information is hidden, and the context doesn't supply it. And that's exactly the way we want it! That's information hiding.

So we'd rather not know that allocation was an array of arrays. Indeed, we'd rather it NOT be an array of arrays. We'd rather it be an object with methods that hide the internal structure.
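
Here is a rough sketch of what such an object might look like. The Allocation class and its method names are hypothetical, and the book price of 8 and the discount table are assumptions borrowed from the usual statement of the Harry Potter problem; none of them come from the function above.

```ruby
# Hypothetical sketch: an allocation that hides its array of arrays.
class Allocation
  BOOK_PRICE = 8.0                                # assumed price per book
  DISCOUNTS  = { 1 => 0.0, 2 => 0.05, 3 => 0.10,
                 4 => 0.20, 5 => 0.25 }.freeze    # assumed discount per group size

  def initialize(groups)
    # Copy the nested arrays so callers cannot reach into our internals.
    @groups = groups.map(&:dup)
  end

  # Total price of all groups, each with its own discount applied.
  def price
    @groups.sum { |group| group_price(group) }
  end

  private

  def group_price(group)
    group.size * BOOK_PRICE * (1.0 - DISCOUNTS.fetch(group.size))
  end
end

allocation = Allocation.new([[1, 2], [3, 4], [5]])
puts allocation.price
```

With something like this, the caller asks the allocation for its price and never needs to know how the groups are stored.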

For some reason I have found it hard to think this way in Ruby. The primitive types are just SO EASY to sling around. It's only after I've slung them that I realize I've built a house of cards. I find myself looking at a module and wondering just what a particular variable is supposed to contain. Is it an array? Or is it an array of arrays? Or is it an object? Urg...

I think there's a rule here, but I'm not quite sure I understand it fully. I think the rule says something like "Use objects instead of built-in types". That rule can't be correct, but it's something like that. Maybe it's "Hide anything more complex than a simple array." No, that's not quite right either.


 Fri, 1 Sep 2006 13:23:42, Jim Weirich, It is simple to refactor away from arrays
Bob, I think the rule you are looking for is "Express your Intent". If something is merely a linear ordering of items, then an Array is a good choice. If it is more than that, then define your own object.

However, it is not hard to refactor when your problem outgrows simple arrays. Your choices are:

(1) Add the needed methods to Array (OK for a small, self-contained script, but not a good choice for a general-purpose library).

(2) Create a subclass of Array (simple and adequate for everything but a few corner cases).

(3) Create your own Class and implement just the array methods you are using. Often it is enough to implement each and include Enumerable. Occasionally you might implement [] as well.
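
As a minimal sketch of option (3), assuming a hypothetical Allocation class that wraps the groups of books (the class name and the sample data are illustrative only):

```ruby
# Hypothetical sketch of option (3): our own class exposing only the
# array behaviour we actually use.
class Allocation
  include Enumerable   # supplies map, select, to_a, and friends on top of each

  def initialize(groups)
    @groups = groups
  end

  # Enumerable only needs each; yield one group of books at a time.
  def each(&block)
    return enum_for(:each) unless block_given?
    @groups.each(&block)
    self
  end

  # Occasionally useful: indexed access to a single group.
  def [](index)
    @groups[index]
  end
end

allocation = Allocation.new([[1, 2], [3, 4], [5]])
sizes = allocation.map(&:size)   # map comes from Enumerable
last_group = allocation[2]
```

Callers can still iterate and index, but the array itself stays private, so the representation can change later without touching them.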
 Fri, 1 Sep 2006 07:48:24, Bheeshmar, Is it more complex than it should be?
Since you don't seem to return the minAllocation, and the custom iterator appears to be iterating over allocations, couldn't the intent be expressed like this:

    prices = bag.map { |allocation| calculateDiscount(allocation) }
    prices.min

To address your other issue, the collection as abstraction:
    class Allocation < Array
      def initialize(array)
        super(array)
      end

      def calculate_discount
        calculateDiscount(self)
      end
    end

I've been interpreting the rule (vis-a-vis collections) as: If the collection represents a concept (an Allocation), then it's a type; if it represents values (prices), it's just a collection. It's close to Tim's suggestion.

Thanks for the great insights!
 Fri, 1 Sep 2006 04:01:35, Jason Gorman, Language features steer evolution?
I'm wondering how the design of a programming language like Ruby (or Java, or C#, etc.) encourages a tendency along certain axes of design quality. Does the lack of static typing, for example, make certain kinds of dependencies more likely? Of course, we don't HAVE to do it, but then should we be surprised to find that people living nearer fast food outlets are generally more overweight, even though they don't HAVE to eat it?
 Thu, 31 Aug 2006 23:16:40, Bruno Martínez,
I really like it the way it is now. There's no problem coupling the array representation with this function because they change at the same time.

I think there's too much encapsulation going on sometimes. If there's no invariant to protect, I prefer naked, simple types.
 Thu, 31 Aug 2006 22:07:13, Mike, Open Classes
What about utilizing open classes and adding methods to Array? I've found this to be an incredibly useful way of doing a variant of rapid prototyping.

In Java, for example, I would never consider using an array or list for anything but the most primitive uses.

With Ruby, I often start by using an array, add appropriate methods as I need them, then eventually, when it becomes clear that it should be an object instead of an array, I make a relatively painless transition.

This way, you get to defer the overhead of creating a new object till later in the process, when you know it will be needed.
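
A small sketch of that progression, with hypothetical names throughout (book_count and Allocation are made up for illustration):

```ruby
# Step 1 (prototype): reopen Array and add the method where the data lives.
class Array
  # Hypothetical helper: counts the books across all groups.
  def book_count
    sum(&:size)
  end
end

prototype_count = [[1, 2], [3, 4], [5]].book_count

# Step 2 (later): the same method moves into a class of its own
# with very little change.
class Allocation
  def initialize(groups)
    @groups = groups
  end

  def book_count
    @groups.sum(&:size)
  end
end

object_count = Allocation.new([[1, 2], [3, 4], [5]]).book_count
```

The method body barely changes between the two steps, which is what keeps the transition cheap.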
 Thu, 31 Aug 2006 18:26:10, Tim Ottinger, When it needs it?

One approximation may be "when it needs self-awareness and volition, it is an object." It at least misses the mark in an interesting way.

I have seen programs written in Python with the same problems or worse, written like a long pipeline of *nix filters. You had to be constantly researching the transformations of dumb data arrays. It doesn't have to be that way, but modern languages don't force you to abandon primitives as early.