Manipulating Files Through Streams

The .NET Framework includes a new object-oriented approach to reading and writing files: streams. The abstract Stream object, found at System.IO.Stream, defines a generic interface to a chunk of data. It doesn't matter where that data is: in a file, in a block of memory, or in a String variable. If you have a block of data that can be read or written one byte at a time, you can design a derived stream class to interact with it.

Stream Features

The basic features of a Stream object include the Read and Write methods that let you read or write bytes. As data is read from or written to a stream, the Stream object maintains a "current position" within the stream that you can adjust using the Seek method, or examine using the Position property. The Length property indicates the size of the readable data. The class also exposes variations of these basic features to allow as much flexibility as possible.
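Here is a minimal sketch of those basic members in action, using a MemoryStream as a convenient stand-in for any seekable stream. The module and variable names are made up for illustration. The CanSeek property it checks at the end is one of three capability properties (along with CanRead and CanWrite) that the Stream class provides.

```vbnet
Imports System.IO

Module StreamPositionDemo
   Sub Main()
      ' ----- A small in-memory stream serves as the test subject.
      Using sample As New MemoryStream()
         ' ----- Write five bytes. The position advances as we go.
         sample.Write(New Byte() {10, 20, 30, 40, 50}, 0, 5)
         Console.WriteLine("Length: {0}", sample.Length)
         Console.WriteLine("Position: {0}", sample.Position)

         ' ----- Jump back to the third byte and read it.
         sample.Seek(2, SeekOrigin.Begin)
         Console.WriteLine("Byte read: {0}", sample.ReadByte())
         Console.WriteLine("Position: {0}", sample.Position)

         ' ----- Ask the stream what it supports before
         '       relying on a feature.
         Console.WriteLine("CanSeek: {0}", sample.CanSeek)
      End Using
   End Sub
End Module
```

After the write, both Length and Position report 5. After the seek and the single-byte read, the byte value is 30 and the position is 3.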

Not every stream supports all features. Some streams are read-only, forward-only constructs that don't support writing or seeking. Other streams support all possible features. The features available to you depend on the type of stream you use. Because Stream itself is abstract, you must create an instance of one of its derived classes. .NET defines several useful streams ready for your use:

  • FileStream. The FileStream object lets you access the content of a file using the basic methods of the generic Stream class. FileStream objects support reading, writing, and seeking, although if you open a read-only file, you won't be able to write to it.

  • MemoryStream. A stream based on a block of raw memory. You can create a memory stream of any size, and use it to temporarily store and retrieve any data.

  • NetworkStream. This class abstracts data coming over a network socket. Whereas most of the derived stream classes reside in System.IO, this class sits in System.Net.Sockets.

  • BufferedStream. Adds buffering support to a stream to improve performance on streams with latency issues. You wrap a BufferedStream object around another stream to use it.

  • CryptoStream. This stream allows you to attach a cryptographic service provider to it, resulting in encrypted output from plain input, or vice versa. Chapter 11, "Security," includes examples that use this type of stream.

  • DeflateStream and GZipStream. These classes let you use a stream to compress or decompress data as it is processed, all using standard compression algorithms.

Streams are useful on their own, but you can also combine streams so that an incoming network stream can be immediately encrypted, compressed, and stored in a memory stream.
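As a rough illustration of combining streams, the following sketch wraps a GZipStream around a MemoryStream, so that data is compressed on its way into memory and decompressed on its way back out. It leaves out the network and encryption steps, and the CompressText and DecompressText names are hypothetical helpers, not part of the framework.

```vbnet
Imports System.IO
Imports System.IO.Compression
Imports System.Text

Module CombinedStreamDemo
   Function CompressText(ByVal message As String) As Byte()
      Dim rawBytes As Byte() = Encoding.UTF8.GetBytes(message)
      Using storage As New MemoryStream()
         ' ----- The GZipStream compresses whatever is written
         '       to it and passes the result to the memory stream.
         Using zipper As New GZipStream(storage, _
               CompressionMode.Compress)
            zipper.Write(rawBytes, 0, rawBytes.Length)
         End Using
         ' ----- Closing the GZipStream wrote the final block.
         '       ToArray still works on a closed MemoryStream.
         Return storage.ToArray()
      End Using
   End Function

   Function DecompressText(ByVal packed As Byte()) As String
      ' ----- Three layers: memory, decompression, text decoding.
      Using storage As New MemoryStream(packed)
         Using unzipper As New GZipStream(storage, _
               CompressionMode.Decompress)
            Using reader As New StreamReader(unzipper, Encoding.UTF8)
               Return reader.ReadToEnd()
            End Using
         End Using
      End Using
   End Function

   Sub Main()
      Dim original As String = New String("a"c, 1000)
      Dim packed As Byte() = CompressText(original)
      Console.WriteLine("Compressed {0} characters into {1} bytes.", _
         original.Length, packed.Length)
      Console.WriteLine("Round trip matches: {0}", _
         DecompressText(packed) = original)
   End Sub
End Module
```

Each layer only knows about the stream directly beneath it, which is what makes this kind of stacking possible.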

Using a Stream

Using a stream is simple; first you create it, and then you start reading and writing bytes left and right. Here's some sample code I wrote that moves data into and out of a memory stream. It's loosely based on the code you'll find in the MSDN documentation for the MemoryStream class.

' ----- The Stream, or There and Back Again.
Dim position As Integer
Dim memStream As IO.MemoryStream
Dim sourceChars As Byte()
Dim destBytes As Byte()
Dim destChars As Char()
Dim asUnicode As New System.Text.UnicodeEncoding()

' ----- Create a memory stream with room for 100 bytes.
memStream = New IO.MemoryStream(100)

' ----- Convert the text data to a byte array.
sourceChars = asUnicode.GetBytes( _
   "This is a test of the emergency programming system.")

Try
   ' ----- Store the byte-converted data in the stream.
   memStream.Write(sourceChars, 0, sourceChars.Length)

   ' ----- The position is at the end of the written data.
   '       To read it back, we must move the pointer to
   '       the start again.
   memStream.Seek(0, IO.SeekOrigin.Begin)

   ' ----- Read a chunk of the text/bytes at once. Array
   '       declarations specify the upper bound, not the
   '       element count, so subtract one.
   destBytes = New Byte(CInt(memStream.Length) - 1) {}
   position = memStream.Read(destBytes, 0, 25)

   ' ----- Get the remaining data one byte at a time,
   '       just for fun.
   While (position < memStream.Length)
      destBytes(position) = CByte(memStream.ReadByte())
      position += 1
   End While

   ' ----- Convert the byte array back to a set of characters.
   destChars = New Char(asUnicode.GetCharCount( _
      destBytes, 0, position) - 1) {}
   asUnicode.GetDecoder().GetChars(destBytes, 0, _
      position, destChars, 0)

   ' ----- Prove that the text is back.
   MsgBox(New String(destChars))
Finally
   ' ----- Release the stream whether or not an error occurred.
   memStream.Close()
End Try

The comments hopefully make the code clear. After creating a memory stream, I push a block of text into it, and then read it back out. (The text stays in the stream; reading it did not remove it.) Actually, the stream code is pretty simple. Most of the code deals with conversions between bytes and characters. If it looks overly involved, that's because it is.

Beyond Stream Bytes

For me, all that converting between bytes and characters is for the birds. When I write business applications, I typically deal in dates, numbers, and strings: customer names, order dates, payment amounts, and so on. I rarely have a need to work at the byte level. I sure wish there was a way to send this byte stuff down a programming stream of its own so I wouldn't have to see it anymore.

Lucky me! .NET makes some wishes come true. Although you can manipulate streams directly if you really want to or need to, the System.IO namespace also includes several classes that provide a more programmer-friendly buffer between you and the stream. These classes, implemented as distinct readers and writers of stream data, provide simplified methods of storing specific data types, and retrieving them back again.

The readers and writers are designed for single-direction start-to-finish processing of data. After creating or accessing a stream, you wrap that stream with either a reader or a writer, and begin traversing the extent of the stream from the beginning. You always have access to the underlying stream if you need more fine-tuned control at any point.

There are three main pairs of readers and writers:

  • BinaryReader and BinaryWriter. These classes make it easy to write and later read the core Visual Basic data types to and from a (generally) non-text stream. The BinaryWriter.Write method includes overloads for writing Bytes, Chars, signed and unsigned integers of various sizes, Booleans, Decimals and Doubles, Strings, and arrays and blocks of Bytes and Chars. Curiously missing is an overload for Date values.

    The BinaryReader counterpart includes separate Read methods for each of the writable data types. The ReadDouble method returns a Double value from the stream, and there are similar methods for the other data types.

  • StreamReader and StreamWriter. These classes are typically used to process line-based text files. The StreamReader class includes a ReadLine method that returns the next text line in the incoming stream as a standard String. The related StreamWriter.Write method includes overloads for most of the data types that BinaryWriter.Write accepts, although it writes each value as text, and it also has a version that lets you format a string for output. The reader includes features that let you read data one character at a time, a block at a time, or an entire file at a time.

  • StringReader and StringWriter. This pair of classes provides the same features as the StreamReader and StreamWriter, but uses a standard String instance for data storage instead of a true Stream.
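To make the binary pair from the list above more concrete, here is a small sketch that writes a few typed values to a memory stream and reads them back in the same order. The sample values and variable names are invented for illustration. Because BinaryWriter has no Date overload, this sketch assumes it is acceptable to store the date as the Long returned by Date.ToBinary and to rebuild it with Date.FromBinary. That is one possible workaround, not the only one.

```vbnet
Imports System.IO

Module BinaryRoundTripDemo
   Sub Main()
      Dim orderDate As Date = #6/15/2007#

      Using storage As New MemoryStream()
         ' ----- Write values of several types, one after another.
         Dim writer As New BinaryWriter(storage)
         writer.Write("Widget")               ' String
         writer.Write(12)                     ' Integer
         writer.Write(19.95D)                 ' Decimal
         writer.Write(orderDate.ToBinary())   ' Date, stored as a Long
         writer.Flush()

         ' ----- Rewind, and then read the values back in
         '       exactly the same order and with the same types.
         storage.Seek(0, SeekOrigin.Begin)
         Dim reader As New BinaryReader(storage)
         Dim itemName As String = reader.ReadString()
         Dim quantity As Integer = reader.ReadInt32()
         Dim price As Decimal = reader.ReadDecimal()
         Dim restored As Date = Date.FromBinary(reader.ReadInt64())

         Console.WriteLine("{0}, {1}, {2}", itemName, quantity, price)
         Console.WriteLine("Date survived: {0}", restored = orderDate)
      End Using
   End Sub
End Module
```

The stream itself stores no type information, so the reading code must know the order and type of every value that was written.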

There is one additional pair, TextReader and TextWriter, that provides the base class for the other non-binary readers and writers. You can't create instances of them directly, but they do let you treat the stream and string versions of the readers and writers generically.
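The following sketch shows what that generic treatment might look like. The CountNonBlankLines function is a hypothetical helper that accepts any TextReader, so the same code works whether the text comes from a String or from a stream.

```vbnet
Imports System.IO
Imports System.Text

Module TextReaderDemo
   ' ----- Counts the non-blank lines from any TextReader source.
   Function CountNonBlankLines(ByVal source As TextReader) As Integer
      Dim total As Integer = 0
      Dim oneLine As String = source.ReadLine()

      ' ----- ReadLine returns Nothing at the end of the data.
      Do While (oneLine IsNot Nothing)
         If (oneLine.Trim().Length > 0) Then total += 1
         oneLine = source.ReadLine()
      Loop
      Return total
   End Function

   Sub Main()
      Dim sampleText As String = "first" & Environment.NewLine & _
         Environment.NewLine & "second"

      ' ----- A StringReader works directly on the String.
      Using fromString As New StringReader(sampleText)
         Console.WriteLine(CountNonBlankLines(fromString))
      End Using

      ' ----- A StreamReader works on bytes held in a stream.
      Dim rawBytes As Byte() = Encoding.UTF8.GetBytes(sampleText)
      Using fromStream As New StreamReader(New MemoryStream(rawBytes))
         Console.WriteLine(CountNonBlankLines(fromStream))
      End Using
   End Sub
End Module
```

Both calls report two non-blank lines, because the function never needs to know which kind of reader it was given.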

With these new tools, it's easier to process non-Byte data through streams. Here's a rewrite of the simple memory stream code I wrote earlier, adjusted to use a StreamReader and StreamWriter.

' ----- The Stream, or There and Back Again.
Dim memStream As IO.MemoryStream
Dim forWriting As IO.StreamWriter
Dim forReading As IO.StreamReader
Dim finalMessage As String
Dim asUnicode As New System.Text.UnicodeEncoding()

' ----- Create a memory stream with room for 100 bytes.
memStream = New IO.MemoryStream(100)

Try
   ' ----- Wrap the stream with a writer.
   forWriting = New IO.StreamWriter(memStream, asUnicode)

   ' ----- Store the original data in the stream. The writer
   '       buffers its output, so flush it into the stream.
   forWriting.WriteLine( _
      "This is a test of the emergency programming system.")
   forWriting.Flush()

   ' ----- The position is at the end of the written data.
   '       To read it back, we must move the pointer to
   '       the start again.
   memStream.Seek(0, IO.SeekOrigin.Begin)

   ' ----- Create a reader to get the data back again.
   forReading = New IO.StreamReader(memStream, asUnicode)

   ' ----- Get the original string.
   finalMessage = forReading.ReadToEnd()

   ' ----- Prove that the text is back.
   MsgBox(finalMessage)
Finally
   ' ----- Release the stream whether or not an error occurred.
   memStream.Close()
End Try

That code sure is a lot nicer without all of that conversion code cluttering up the works. (It could be simplified even more by leaving out all of the optional Unicode encoding stuff.) Of course, everything is still being converted to bytes under the surface; the memory stream only knows about bytes. But StreamWriter and StreamReader take that burden away from us, performing all of the messy conversions on our behalf.

Reading a File via a Stream

Most Stream processing involves files, so let's use a StreamReader to process a text file. Although we already decided in Chapter 14, "Application Settings," that INI files are a thing of the past, it might be fun to write a routine that extracts a value from a legacy INI file. Consider a file containing this text.




Now there's something you don't see every day, and with good reason! Still, if we wanted to get the value for Key2 in section Section1 (the "jkl" value), we would have to fall back on the GetPrivateProfileString API call from those bad old pre-.NET programming days. Or, we could implement a StreamReader in a custom function all our own.

Public Function GetINIValue(ByVal sectionName As String, _
      ByVal keyName As String, ByVal iniFile As String) _
      As String
   ' ----- Given a section and key name for an INI file,
   '       return the matching value entry.
   Dim readINI As IO.StreamReader
   Dim oneLine As String
   Dim compare As String
   Dim found As Boolean

   On Error GoTo ErrorHandler

   ' ----- Open the file.
   If (My.Computer.FileSystem.FileExists(iniFile) = False) _
      Then Return ""
   readINI = New IO.StreamReader(iniFile)

   ' ----- Look for the matching section.
   found = False
   compare = "[" & Trim(UCase(sectionName)) & "]"
   Do While (readINI.EndOfStream = False)
      oneLine = readINI.ReadLine()
      If (Trim(UCase(oneLine)) = compare) Then
         ' ----- Found the matching section.
         found = True
         Exit Do
      End If
   Loop

   ' ----- Exit early if the section name was not found.
   If (found = False) Then
      readINI.Close()
      Return ""
   End If

   ' ----- Look for the matching key.
   compare = Trim(UCase(keyName))
   Do While (readINI.EndOfStream = False)
      ' ----- If we reach another section, then the
      '       key wasn't there.
      oneLine = Trim(readINI.ReadLine())
      If (Len(oneLine) = 0) Then Continue Do
      If (oneLine.Substring(0, 1) = "[") Then Exit Do

      ' ----- Ignore lines without an "=" sign.
      If (InStr(oneLine, "=") = 0) Then Continue Do
      ' ----- See if we found the key. By the way, I'm
      '       using Substring() instead of Left() so
      '       I don't have to worry about conflicts with
      '       Form.Left in case I drop this routine into
      '       a Form class.
      If (Trim(UCase(oneLine.Substring(0, _
            InStr(oneLine, "=") - 1))) = compare) Then
         ' ----- Found the matching key.
         readINI.Close()
         Return Trim(Mid(oneLine, InStr(oneLine, "=") + 1))
      End If
   Loop

   ' ----- If we got this far, then the key was missing.
   readINI.Close()
   Return ""

ErrorHandler:
   ' ----- Return an empty string on any error.
   On Error Resume Next
   If (readINI IsNot Nothing) Then readINI.Close()
   readINI = Nothing
   Return ""
End Function

This routine isn't an exact replacement for GetPrivateProfileString; it doesn't support a default return value, or perform file caching for speed. You could improve the routine with better error handling. But it does retrieve the value we seek, and it does it by reading the INI file one line at a time through a StreamReader.

MsgBox(GetINIValue("Section1", "Key2", iniFilePath))
   ' ----- Displays 'jkl'
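One way to approach the better error handling mentioned above is to let a Using block close the file on every exit path, and to replace On Error with a Try...Catch block. The sketch below is only one possible variation. FindINIValue is a hypothetical name, the temporary INI file is invented test data (only the Section1, Key2, and "jkl" entries come from the discussion above), and catching IOException alone is an assumption about which errors are worth swallowing.

```vbnet
Imports System.IO

Module INILookupDemo
   ' ----- Hypothetical variant of GetINIValue that relies on
   '       Using and Try...Catch instead of On Error.
   Function FindINIValue(ByVal sectionName As String, _
         ByVal keyName As String, ByVal iniFile As String) As String
      Dim inSection As Boolean = False
      Dim sectionTag As String = "[" & sectionName.Trim() & "]"

      Try
         ' ----- Using closes the file on every exit path.
         Using readINI As New StreamReader(iniFile)
            Dim oneLine As String = readINI.ReadLine()
            Do While (oneLine IsNot Nothing)
               oneLine = oneLine.Trim()
               If (oneLine.StartsWith("[")) Then
                  ' ----- A new section starts here. If we were
                  '       already in the target section, then
                  '       the key wasn't there.
                  If (inSection) Then Exit Do
                  inSection = String.Equals(oneLine, sectionTag, _
                     StringComparison.OrdinalIgnoreCase)
               ElseIf (inSection) Then
                  ' ----- Lines without a key before "=" are skipped.
                  Dim equalsAt As Integer = oneLine.IndexOf("="c)
                  If (equalsAt > 0 AndAlso String.Equals( _
                        oneLine.Substring(0, equalsAt).Trim(), _
                        keyName.Trim(), _
                        StringComparison.OrdinalIgnoreCase)) Then
                     Return oneLine.Substring(equalsAt + 1).Trim()
                  End If
               End If
               oneLine = readINI.ReadLine()
            Loop
         End Using
      Catch ex As IOException
         ' ----- Missing or unreadable file. Fall through and
         '       return an empty string.
      End Try
      Return ""
   End Function

   Sub Main()
      ' ----- Build a throwaway INI file with invented content.
      Dim iniFilePath As String = Path.GetTempFileName()
      File.WriteAllLines(iniFilePath, New String() { _
         "[Section0]", "Key2=xyz", "", _
         "[Section1]", "Key1=ghi", "Key2=jkl"})

      Console.WriteLine(FindINIValue("Section1", "Key2", iniFilePath))
      File.Delete(iniFilePath)
   End Sub
End Module
```

With the invented file shown in Main, the lookup prints "jkl". Errors other than IOException, such as an empty file path, still reach the caller.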
