Keeping Multiple Values for the Same Hash Key

Keeping Multiple Values for the Same Hash Key


You want to build a hash that might have duplicate values for some keys.


The simplest way is to create a hash that initializes missing values to empty arrays. You can then append items onto the automatically created arrays:

	hash = { |hash, key| hash[key] = [] }

	raw_data = [ [1, 'a'], [1, 'b'], [1, 'c'],
	             [2, 'a'], [2, ['b', 'c']],
	             [3, 'c'] ]
	raw_data.each { |x,y| hash[x] << y }
	# => {1=>["a", "b", "c"], 2=>["a", ["b", "c"]], 3=>["c"]}


A hash maps any given key to only one value, but that value can be an array. This is a common phenomenon when reading data structures from the outside world. For instance, a list of tasks with associated priorities may contain multiple items with the same priority. Simply reading the tasks into a hash keyed on priority would create key collisions, and obliterate all but one task with any given priority.

It's possible to subclass Hash to act like a normal hash until a key collision occurs, and then start keeping an array of values for the key that suffered the collision:

	class MultiValuedHash < Hash
	  def []=(key, value)
	    if has_key?(key)
	      super(key, [value, self[key]].flatten)

	hash =
	raw_data.each { |x,y| hash[x] = y }
	# => {1=>["c", "b", "a"], 2=>["b", "c", "a"], 3=>"c"}

This saves a little bit of memory, but it's harder to write code for this class than for one that always keeps values in an array. There's also no way of knowing whether a value [1,2,3] is a single array value or three numeric values.

See Also

  • Recipe 5.2, "Creating a Hash with a Default Value," explains the technique of the dynamic default value in more detail, and explains why you must initalize the empty list within a code blocknever within the arguments to

 Python   SQL   Java   php   Perl 
 game development   web development   internet   *nix   graphics   hardware 
 telecommunications   C++ 
 Flash   Active Directory   Windows