Computing Set Operations on Arrays
Problem
You want to find the union, intersection, difference, or Cartesian product of two arrays, or the complement of a single array with respect to some universe.
Solution
Array objects have overloaded arithmetic and logical operators to provide the three simplest
set operations:
#Union
[1,2,3]  [1,4,5] # => [1, 2, 3, 4, 5]
#Intersection
[1,2,3] & [1,4,5] # => [1]
#Difference
[1,2,3]  [1,4,5] # => [2, 3]
Set objects overload the same operators, as well as the exclusiveor operator (^).If you already have Arrays, though, it's more efficient to deconstruct the XOR operation into its three component operations.
require 'set'
a = [1,2,3]
b = [3,4,5]
a.to_set ^ b.to_set # => #<Set: {5, 1, 2, 4}>
(a  b)  (a & b) # => [1, 2, 4, 5]
Discussion
Set objects are intended to model mathematical sets: where arrays are ordered and can contain duplicate entries, Sets model an unordered collection of unique items. Set not only overrides operators for
set operations, it provides Englishlanguage aliases for the three most common operators:
Set#union,
Set#intersection, and
Set#difference. An array can only perform a set operation on another array, but a Set can perform a set operation on any Enumerable.
array = [1,2,3]
set = [3,4,5].to_s
array & set # => TypeError: can't convert Set into Array
set & array # => #<Set: {3}>
You might think that Set objects would be optimized for
set operations, but they're actually optimized for constanttime membership checks (internally, a Set is based on a hash). Set union is faster when the lefthand object is a Set object, but intersection and difference are significantly faster when both objects are arrays. It's not worth it to convert arrays into Sets just so you can say you performed set operations on Set objects.
The union and intersection set operations remove duplicate entries from arrays. The difference operation does not remove duplicate entries from an array except as part of a subtraction.
[3,3] & [3,3] # => [3]
[3,3]  [3,3] # => [3]
[1,2,3,3]  [1] # => [2, 3, 3]
[1,2,3,3]  [3] # => [1, 2]
[1,2,3,3]  [2,2,3] # => [1]
Complement
If you want the complement of an array with respect to some small universe, create that universe and use the difference operation:
u = [:red, :orange, :yellow, :green, :blue, :indigo, :violet]
a = [:red, :blue]
u  a # => [:orange, :yellow, :green, :indigo, :violet]
More often, the relevant universe is infinite (the
set of natural numbers)or extremely large (the set of threeletter strings). The best strategy here is to define a generator and use it to iterate through the complement. Be sure to break when you're done; you don't want to iterate over an infinite set.
def natural_numbers_except(exclude)
exclude_map = {}
exclude.each { x exclude_map[x] = true }
x = 1
while true
yield x unless exclude_map[x]
x = x.succ
end
end
natural_numbers_except([2,3,6,7]) do x
break if x > 10
puts x
end
# 1
# 4
# 5
# 8
# 9
# 10
Cartesian product
To get the Cartesian product of two arrays, write a nested iteration over both lists and append each pair of items to a new array. This code is attached to Enumerable so you can also use it with Sets or any other Enumerable.
module Enumberable
def cartesian(other)
res = []
each { x other.each { y res << [x, y] } }
return res
end
end
[1,2,3].cartesian(["a",5,6])
# => [[1, "a"], [1, 5], [1, 6],
# [2, "a"], [2, 5], [2, 6],
# [3, "a"], [3, 5], [3, 6]
This version uses Enumerable#inject to make the code more concise; however, the original version is more efficient.
module Enumerable
def cartesian(other)
inject([]) { res, x other.inject(res) { res, y res << [x,y] } }
end
end
See Also
See Recipe 2.5, "Generating Random Numbers," for an example (constructing a deck of cards from suits and ranks)that could benefit from a function to calculate the Cartesian product Recipe 2.10, "Multiplying Matrices"
