Recipe 13.2. Serializing Data with Marshal
Problem
You want to serialize a data structure to disk faster than YAML can do it. You don't care about the readability of the serialized data structure, or portability to other programming languages.
Solution
Use the Marshal module, built into Ruby. It works more or less like YAML, but it's much faster. The Marshal.dump method transforms a data structure into a binary string, which you can write to a file and reconstitute later with Marshal.load.
Marshal.dump(10) # => "\004\010i\017"
Marshal.dump('ten') # => "\004\010\"\010ten"
Marshal.dump('10') # => "\004\010\"\a10"
Marshal.load(Marshal.dump(%w{Brush up your Shakespeare}))
# => ["Brush", "up", "your", "Shakespeare"]
require 'set'
Marshal.load(Marshal.dump(Set.new([1, 2, 3])))
# => #<Set: {1, 2, 3}>
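The examples above keep the serialized string in memory. As a minimal sketch of the disk round trip, the following writes a structure to a temporary file and reads it back. The sample hash and the filename are hypothetical, chosen only for illustration; the file is opened in binary mode because the Marshal format is a byte stream, not text.

```ruby
require 'tmpdir'

# Hypothetical sample data; any marshalable structure would do.
inventory = { 'apples' => 3, 'pears' => [1, 2, 3], :checked => true }

restored = nil
Dir.mktmpdir do |dir|
  path = File.join(dir, 'inventory.marshal')   # hypothetical filename

  # Marshal.dump takes an optional IO as its second argument and
  # writes the serialized bytes straight to it.
  File.open(path, 'wb') { |f| Marshal.dump(inventory, f) }

  # Marshal.load accepts an IO as well as a String.
  restored = File.open(path, 'rb') { |f| Marshal.load(f) }
end

restored == inventory   # => true
```

Passing the file object to Marshal.dump avoids building the whole string in memory first, which may matter for large structures.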
Discussion
Marshal is what most programmers coming from other languages expect from a serializer: it's fast (much faster than YAML), and it produces unreadable blobs of binary data. It can serialize almost anything that YAML can (see Recipe 13.1 for examples), and it can also handle a few cases that YAML can't. For instance, you can use Marshal to serialize a reference to a class:
Marshal.dump(Set) # => "\004\010c\010Set"
Note that the serialized version of Set is little more than a reference to the class. Like YAML, Marshal depends on the presence of the original classes, and you can't deserialize a reference to a class you don't have. With YAML, you'll get an unresolved YAML::Object; with Marshal, you get an ArgumentError:
#!/usr/bin/ruby -w
Marshal.load("\004\010c\010Set")
# ArgumentError: undefined class/module Set
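If serialized data may outlive the code that produced it, it can help to rescue this error at load time. The sketch below simulates a missing class within a single process: Widget is a hypothetical class, created with const_set only so that it can be removed again before loading.

```ruby
# Hypothetical class, defined dynamically so we can take it away again.
Object.const_set(:Widget, Class.new)

# The dump holds a reference to the class by name, not its definition.
blob = Marshal.dump(Object.const_get(:Widget))

# Simulate a process that never loaded the class.
Object.send(:remove_const, :Widget)

result = begin
  Marshal.load(blob)
rescue ArgumentError => e
  # Marshal could not resolve the class name stored in the dump.
  "could not load: #{e.message}"
end
```

Here result ends up holding the fallback string rather than a class. In a real program, the rescue clause might require the missing library and retry the load.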
Like YAML, Marshal only serializes data structures. It can't serialize Ruby code (like Proc objects), or resources allocated by other processes (like filehandles or database connections). However, the two libraries differ in their error handling. YAML tends to serialize as much as it can: it can serialize a File object, but when you deserialize it, you get an object that doesn't point to any actual file.
Marshal just gives you an error when you try to serialize a file:
open('output', 'w') { |f| Marshal.dump(f) }
# TypeError: can't dump File
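The same error class turns up for Proc objects, and for any container that holds an unserializable object. The following sketch uses a hypothetical helper, dump_error, which reports the exception class raised by Marshal.dump, or nil if the dump succeeds. The exact error messages vary between Ruby versions, so only the class is shown.

```ruby
# Hypothetical helper: attempt a dump and report which exception
# class (if any) it raised.
def dump_error(obj)
  Marshal.dump(obj)
  nil
rescue TypeError => e
  e.class
end

dump_error([1, 'two', :three])      # => nil
dump_error(lambda { |x| x * 2 })    # => TypeError
dump_error($stdout)                 # => TypeError
dump_error({ :log => $stdout })     # => TypeError
```

Since a single filehandle buried in a large structure makes the whole dump fail, one common approach is to strip such objects out before serializing and reattach them after loading.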
See Also