ArticleS
.
UncleBob
.
RubyCombinations
Edit Page:
!title Ruby Combinations I could use a little help with something. Please read on... I taught myself Ruby about 4 years ago, just to see what the language was like. I used it for awhile and then changed focus on Fit, Fitnesse, and Java/.Net. Lately, however, with the rise of Rails, I have returned my focus on Ruby; and I have come to realize what a pleasant and productive language it is. The way to really learn something is through ''Effortful Study''. So I decided to work up a Kata for Ruby that I could practice over and over. I chose the [[Harry Potter kata][http://bossavit.com/cgi-bin/dojo.pl?HarryPotterKata]] from Laurent Bossavit's blog. Part of solving this problem involves an algorithm to generate all possible combinations of a set. I searched the libraries for a module to do combinations but only found a gem that did Permutations. So I decided to write my own. I figured writing a combination iterator test first would be interesting. Here are the tests written using rspec:{{{require 'spec' require 'Combinations' context "Simple Combinations" do specify "degenerate cases" do Combinations.get(0,0).should.equal [] Combinations.get(1,0).should.equal [] Combinations.get(0,1).should.equal [] end specify "nC1" do Combinations.get(1,1).should.equal [[0]] Combinations.get(2,1).should.equal [[0],[1]] end specify "nCn" do Combinations.get(2,2).should.equal [[0,1]] Combinations.get(3,3).should.equal [[0,1,2]] end specify "nCr" do Combinations.get(3,2).should.equal [[0,1],[0,2],[1,2]] Combinations.get(4,3).should.equal [[0,1,2],[0,1,3],[0,2,3],[1,2,3]] Combinations.get(5,3).should.equal [ [0,1,2],[0,1,3],[0,1,4],[0,2,3],[0,2,4],[0,3,4], [1,2,3],[1,2,4],[1,3,4], [2,3,4] ] end end}}} And here is the Combination module that passes those tests: {{{class Combinations def self.c(n,r,&proc) if (n>0 && r>0) combine([], 0, n, r, proc) end end def self.get(n,r) combinations = [] Combinations.c(n,r) {|c| combinations << c} combinations end private def self.combine(combination, s, n, r, proc) if (r == 0) proc.call(combination) else (s..(n-r)).each {|i| combine(combination + [i], i+1, n, r-1, proc)} end end end}}} I am pretty sure that this algorithm is correct and efficient. I was disturbed to find, however, that an earlier version was accurate but horribly inefficient. The difference was: {{{ (s...n).each {|i| combine(combination + [i], i+1, n, r-1, proc)} }}} What bothers me about this is that the tests did not discover the inefficiency. I discovered it later while doing some exploratory manual testing. I suppose I should put in a unit test that ensures a certain minimum efficiency. But I'll leave that for another blog. !3 The ''REAL'' topic of this blog is that I don't like this code. It's cryptic at best. And yet so far I haven't found a way to make it better. My goal is to find a way to present this code so that the algorithm leaps out at you in an unambiguous way. The algorithm I have is just too busy and cluttered to speak for itself. (The measure of this is the amount of time it take YOU to truly understand what it is doing and why it passes the tests.) Any suggestions? !commentForm -r #----- Blog Comment Marker (Please don't delete me) -----# !* Sat, 19 Aug 2006 17:57:44, Matteo Vaccari, cleaning up the "combinations" code These things are best worked out as recursive definitions. Let choose(n, k) be all the ways of choosing k different elements from the set (0...n). Then choose(3, 0) == [[]] choose(3, 1) == [[0], [1], [2]] choose(3, 2) == [[0,1], [0,2], [1,2]] etc. We also have choose(3, 4) == [] since there is no way of choosing 4 different elements from a set of just 3 elements. So let's write a recursive definition for choose(n, k); the base case is choose(0, 0) == [[]] choose(0, k) == [] if k > 0 this covers all cases when n == 0. Now let's look at, say, n==3. choose(3, 1) == [[0], [1], [2]] choose(3, 2) == [[0,1], [0,2], [1,2]] what happens when we get to n==4? choose(4, 2) == [[0,1], [0,2], [1,2], [0,3], [1,3], [2,3]] Cool! It looks like the first half elements are the same as choose(3,2) choose(4, 2) == choose(3, 2) + [[0,3], [1,3], [2,3]] The remaining elements are the same as choose(3,1) with the new element 3 appended. choose(4, 2) == choose(3, 2) + append_all(choose(3,1), 3) It turns out that this is a general pattern. The ways to choose k elements out of a set of (n+1) elements are: - all of the ways to choose k elements from a set of n elements, plus - all the ways to add the new element to the choices of k-1 old elements Test first! def test_base assert_equal [[]], choose(3, 0) assert_equal [], choose(0, 3) end def test_step # choose(1,1) == choose(0, 1) + append_all(choose(0, 0), 0) # == [] + append_all([[]], 0) # == [[0]] assert_equal [[0]], choose(1, 1) assert_equal [[0,1], [0,2], [1,2]], choose(3, 2) assert_equal [[0,1], [0,2], [1,2], [0,3], [1,3], [2,3]], choose(4, 2) assert_equal [[0,1,2,3]], choose(4, 4) end The code that passes the tests is def choose(n, k) return [[]] if n == 0 && k == 0 return [] if n == 0 && k > 0 return [[]] if n > 0 && k == 0 new_element = n-1 choose(n-1, k) + append_all(choose(n-1, k-1), new_element) end def append_all(lists, element) lists.map { |l| l << element } end Since we have a recursive call with k-1, we had to add a clause to specify what happens when k==0. This code is certainly concise. It is also clear, as long as you understand why the recursive definition makes sense. *! !* Sat, 19 Aug 2006 10:23:39, Dean Wampler, A more "literate" variation? Here's a variation of the original rspec test that attempts to wrap invocations of Combinatios.get() in a more literate format, using a global method: {{{ require 'spec' require 'Combinations' def get_combinations args Combinations.get args[:for_n_items], args[:sets_of] end context "Simple Combinations" do specify "degenerate cases" do get_combinations(:sets_of => 0, :for_n_items => 0).should.equal [] get_combinations(:sets_of => 0, :for_n_items => 1).should.equal [] get_combinations(:sets_of => 1, :for_n_items => 0).should.equal [] end specify "nC1" do get_combinations(:sets_of => 1, :for_n_items => 1).should.equal [[0]] get_combinations(:sets_of => 1, :for_n_items => 2).should.equal [[0],[1]] end specify "nCn" do get_combinations(:sets_of => 2, :for_n_items => 2).should.equal [[0,1]] get_combinations(:sets_of => 3, :for_n_items => 3).should.equal [[0,1,2]] end specify "nCr" do get_combinations(:sets_of => 2, :for_n_items => 3).should.equal [[0,1],[0,2],[1,2]] get_combinations(:sets_of => 3, :for_n_items => 4).should.equal [[0,1,2],[0,1,3],[0,2,3],[1,2,3]] get_combinations(:sets_of => 3, :for_n_items => 5).should.equal [ [0,1,2],[0,1,3],[0,1,4],[0,2,3],[0,2,4],[0,3,4], [1,2,3],[1,2,4],[1,3,4], [2,3,4] ] end end }}} It's a little verbose for frequent use, but it's optional and reads more easily for the first-time reader. *! !* Fri, 18 Aug 2006 14:17:31, , |John Roth said: ''A bit of analysis shows that total cost is minimized when each of the result packages contains the maximum number of items. The easy way to do this is to take one item from each non-empty stack and shove it into a package. Repeat until all stacks are empty. The result will contain 0 or more packages with all five books, zero or more packages with the same four books, zero or more packages with the same three books, etc.''| I'd love to see the proof of that. Indeed, I'm not quite sure I understand the algorithm you are getting at. One of Laurent's tests is 0,0,1,1,2,2,3,4 We could divide this into a one group of 5 and one group of three, or we could divide it into two groups of four. Which of those satifies your "each package continas the maximum number of items" criterion? The second grouping has a price of 6.4 whereas the first has a price of 6.45. In any case I have to confess that I have modfied Laurent's problem a bit to allow different prices for each book. So, regardless of the best algorithm for Laurent's problem, the problem I am shooting at DOES require a full combinatorial search. *! !* Fri, 18 Aug 2006 12:40:00, DaveHoover, twiddling I twiddled the code a bit, trying to remove some noise and ended up with this. The algorithm still doesn't leap out at me, though. {{{class Combinations class << self def get(n, r) return [] unless n > 0 and r > 0 combine(n, r) end private def combine(n, r, s=0, combination=[], collector=[]) if r.zero? collector << combination else for i in s..(n-r) combine(n, r-1, i+1, combination+[i], collector) end end collector end end end}}} *! !* Fri, 18 Aug 2006 06:02:57, JakeB, forgot the unit tests to the code below {{{ import static kata1.HarryPotterKata.*; import java.util.ArrayList; import java.util.Arrays; import junit.framework.TestCase; public class HarryPotterKataTests extends TestCase { protected static final int[] mediumItemList1 = new int[] { 0, 1, 1, 2, 3, 4 }; protected static final int[] complexItemList2 = new int[] { 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 3, 4, 4, 4, 4 }; protected static final int[] complexItemList1 = new int[] { 0, 0, 1, 1, 2, 2, 3, 4 }; public void testBasicCases() { assertEquals(0.0, totalPrice(new int[] {})); assertEquals(8.0, totalPrice(new int[] { 0 })); assertEquals(8.0, totalPrice(new int[] { 1 })); assertEquals(8.0, totalPrice(new int[] { 2 })); assertEquals(8.0, totalPrice(new int[] { 3 })); assertEquals(8.0, totalPrice(new int[] { 4 })); assertEquals(8.0 * 2, totalPrice(new int[] { 0, 0 })); assertEquals(8.0 * 3, totalPrice(new int[] { 1, 1, 1 })); } public void testSimpleDiscounts() { assertEquals(8 * 2 * 0.95, totalPrice(new int[] { 0, 1 })); assertEquals(8 * 3 * 0.9, totalPrice(new int[] { 0, 2, 4 })); assertEquals(8 * 4 * 0.8, totalPrice(new int[] { 0, 1, 2, 4 })); assertEquals(8 * 5 * 0.75, totalPrice(new int[] { 0, 1, 2, 3, 4 })); } public void testSeveralDiscountsCombined() { assertEquals(8 + (8 * 2 * 0.95), totalPrice(new int[] { 0, 0, 1 })); assertEquals(2 * (8 * 2 * 0.95), totalPrice(new int[] { 0, 0, 1, 1 })); assertEquals((8 * 4 * 0.8) + (8 * 2 * 0.95), totalPrice(new int[] { 0, 0, 1, 2, 2, 3 })); assertEquals(8 + (8 * 5 * 0.75), totalPrice(mediumItemList1)); } public void testViciousCases() { assertEquals(2 * (8 * 4 * 0.8), totalPrice(complexItemList1)); assertEquals(3 * (8 * 5 * 0.75) + 2 * (8 * 4 * 0.8), totalPrice(complexItemList2)); } public void testGetNumberOfDifferentVolumesOnList() { assertEquals(0, getNumberOfDifferentVolumesOn(new int[] {})); assertEquals(1, getNumberOfDifferentVolumesOn(new int[] { 0 })); assertEquals(2, getNumberOfDifferentVolumesOn(new int[] { 0, 1 })); assertEquals(2, getNumberOfDifferentVolumesOn(new int[] { 0, 0, 1 })); } public void testDiscountNumbers() { assertEquals(0.0 , getDiscountOn(0)); assertEquals(0.0 , getDiscountOn(1)); assertEquals(0.05 , getDiscountOn(2)); assertEquals(0.10 , getDiscountOn(3)); assertEquals(0.20 , getDiscountOn(4)); assertEquals(0.25 , getDiscountOn(5)); } public void testGetPartitionsOfItemListReturnsATruePartition() { int[][] partitions = getPartitionsOf(mediumItemList1); int[] joinedList = concat(partitions); assertEquals("length should have been preserved", mediumItemList1.length, joinedList.length); assertArraysEqual(getProfileOf(mediumItemList1), getProfileOf(joinedList)); } public void testGetPartitionsOfItemListReturnsShrinkingGroups() { int[][] partitions = getPartitionsOf(mediumItemList1); int sizeOfLastPartition = Integer.MAX_VALUE; for (int[] partition : partitions) { int sizeOfThisPartition = partition.length; assertTrue("size of partitions should be shrinking", sizeOfThisPartition < sizeOfLastPartition); sizeOfLastPartition = sizeOfThisPartition; } } public void testGetPartitionsOfItemListReturnsAtMostOneOfEachVolumeInEachPartition() { int[][] partitions = getPartitionsOf(mediumItemList1); for (int[] partition : partitions) { int[] partitionProfile = getProfileOf(partition); for (int volCount : partitionProfile) { assertTrue("at most one of each volume in each partition", volCount <= 1); } } } public void testGetOneOfEachVolumeFromTheNonZeroProfiles() { assertArraysEqual(new int[] {}, getOneOfEachVolumeFromTheNonZeroProfiles(new int[] {0,0,0,0,0})); assertArraysEqual(new int[] {1}, getOneOfEachVolumeFromTheNonZeroProfiles(new int[] {0,1,0,0,0})); assertArraysEqual(new int[] {1, 3}, getOneOfEachVolumeFromTheNonZeroProfiles(new int[] {0,1,0,2,0})); assertArraysEqual(new int[] {0, 1, 2, 3, 4}, getOneOfEachVolumeFromTheNonZeroProfiles(new int[] {4,1,5,2,2})); } public void testDecrementProfile() { assertArraysEqual(new int[] {0,0,0,0,0}, decrementProfile(new int[] {0,0,0,0,0})); assertArraysEqual(new int[] {0,0,0,0,0}, decrementProfile(new int[] {0,1,0,0,0})); assertArraysEqual(new int[] {0,1,0,0,0}, decrementProfile(new int[] {0,2,1,0,0})); assertArraysEqual(new int[] {1,1,2,0,0}, decrementProfile(new int[] {2,2,3,0,0})); } public void testGetItemListProfile() { assertArraysEqual(new int[] {0,0,0,0,0}, getProfileOf(new int[] {})); assertArraysEqual(new int[] {1,2,1,1,1}, getProfileOf(mediumItemList1)); } private void assertArraysEqual(int[] a1, int[] a2) { assertTrue(Arrays.equals(a1, a2)); } private int[] concat(int[][] partitions) { ArrayList<Integer> list = new ArrayList<Integer>(); for (int[] partition : partitions) { for (int item : partition) { list.add(item); } } int[] concatArray = new int[list.size()]; for (int i = 0; i < list.size(); i++) { concatArray[i] = list.get(i); } return concatArray; } } }}} *! !* Fri, 18 Aug 2006 05:42:02, JakeB, is John's algorithm correct? I have implemented the algorithm that John has suggested (code below), and it fails the vicious tests. For [ 0, 0, 1, 1, 2, 2, 3, 4 ], the cheapest partitioning is not {[0,1,2,3,4], [0,1,2]}, but in fact {[0,1,2,3],[0,1,2,4]}. {{{ import java.util.LinkedList; import java.util.List; public class HarryPotterKata { public static final int numberOfDifferentVolumes = 5; static int singleBookPrice = 8; public static double totalPrice(int[] itemList) { int[][] itemGroups = getPartitionsOf(itemList); double sum = 0; for (int[] itemGroup : itemGroups) { sum += calculateDiscountedPriceOn(itemGroup); } return sum; } protected static double calculateDiscountedPriceOn(int[] itemList) { int priceWithoutDiscounts = getPriceWithoutDiscounts(itemList); int numberOfDifferentVolumes = getNumberOfDifferentVolumesOn(itemList); double discountedPrice = getDiscountedPrice(priceWithoutDiscounts, numberOfDifferentVolumes); return discountedPrice; } public static int[][] getPartitionsOf(int[] itemList) { int[] profile = getProfileOf(itemList); int numberOfPartitions = max(profile); int[][] partitions = new int[numberOfPartitions][]; for (int i = 0; i < numberOfPartitions; i++) { partitions[i] = getOneOfEachVolumeFromTheNonZeroProfiles(profile); profile = decrementProfile(profile); } return partitions; } public static int[] decrementProfile(int[] profile) { for (int i = 0; i < profile.length; i++) { if (profile[i] > 0) profile[i]--; } return profile; } public static int[] getOneOfEachVolumeFromTheNonZeroProfiles(int[] profile) { List<Integer> list = new LinkedList<Integer>(); for (int i = 0; i < profile.length; i++) { if (profile[i] != 0) list.add(i); } int[] group = new int[list.size()]; for (int i = 0; i < group.length; i++) { group[i] = list.get(i); } return group; } private static int max(int[] profile) { int max = Integer.MIN_VALUE; for (int value : profile) { if (value > max) max = value; } return max; } protected static double getDiscountedPrice(int priceWithoutDiscounts, int numberOfDifferentVolumes) { if (numberOfDifferentVolumes >=2 && numberOfDifferentVolumes <=5) { double discount = getDiscountOn(numberOfDifferentVolumes); return priceWithoutDiscounts * (1 - discount); } else { return priceWithoutDiscounts; } } public static int getNumberOfDifferentVolumesOn(int[] itemList) { int[] itemListProfile = getProfileOf(itemList); int numberOfDifferentVolumesOnList = 0; for (int volumeNumber : itemListProfile) { if (volumeNumber != 0) numberOfDifferentVolumesOnList++; } return numberOfDifferentVolumesOnList; } protected static int getPriceWithoutDiscounts(int[] itemList) { return itemList.length * singleBookPrice; } public static double getDiscountOn(int numberOfDifferentVolumes) { switch (numberOfDifferentVolumes) { case 2 : return 0.05; case 3 : return 0.10; case 4 : return 0.20; case 5 : return 0.25; default: return 0.0; } } public static int[] getProfileOf(int[] itemList) { int[] itemListProfile = new int[numberOfDifferentVolumes]; for (int item : itemList) { itemListProfile[item]++; } return itemListProfile; } } }}} *! !* Thu, 17 Aug 2006 19:36:46, John Roth, You're right, it's not obvious. But before we get to that, I've got a different issue. I don't see what the combinations algorithm has to do with the problem! Let me restate Laurent's problem. You have five stacks each of which has zero or more items. The list price of each item is the same (8 euros). The problem is to produce "packages" (presumably wrapped in gaily colored paper) of items from the stacks, with each package containing at most one item from each stack, which minimize the total cost, after applying discounts. A bit of analysis shows that total cost is minimized when each of the result packages contains the maximum number of items. The easy way to do this is to take one item from each non-empty stack and shove it into a package. Repeat until all stacks are empty. The result will contain 0 or more packages with all five books, zero or more packages with the same four books, zero or more packages with the same three books, etc. There's no need to do an exhaustive search of all possible combinations. I'll take another look at what you're doing and see if I can figure it out, but the code doesn't seem to have anything to do with the stated problem. John Roth *! !* Thu, 17 Aug 2006 17:45:18, ${unclebob}, Private class methods. Ack. That's so wierd. I get the whole singleton class concept, but usages like that are still not obvious to me. *! !* Thu, 17 Aug 2006 17:42:52, ${unclebob}, BDD Kool-aid I find the BDD notation intriguing and want to get some experience with it. So far I'm not finding it all that different from test/unit; but that's a topic for a different blog. *! !* Thu, 17 Aug 2006 17:30:12, ChrisGardner, RSpec Not a suggestion for improvement, I'm afraid, but a question. I notice you are using rspec. Is this something you are experimenting with or have you drunk the BDD Kool-Aid? (not that there's anything wrong with that--just curious) *! !* Thu, 17 Aug 2006 16:07:00, DaveHoover, small tweak I'm glad that you're getting into Ruby and Rails, and I'm going to ponder the REAL topic of this blog post for a while. I have one minor correction for your solution in the meantime... private def self.combine(... This doesn't actually make that class method private. You can still call Combinations.combine from outside the class. If you want to make it private, you'll need to open the singleton class explicitly... class << self private def combine(... ... end end *!
Hints:
Use alt+s (Windows) or control+s (Mac OS X) to save your changes. Or, tab from the text area to the "Save" button!
Grab the lower-right corner of the text area to increase its size (works with some browsers).