May 11, 2011, 6:39 p.m.
posted by un
Enhancements and Alternatives
Perhaps I flatter myself, but I tend to think the utilities developed in this chapter are fairly comprehensive. However, there are always a few enhancements we can make.
Additional Data Types
The data types that come to my mind first are real legacy types associated with IBM mainframes. For example, if we had alphanumeric fields in EBCDIC rather than ASCII encoding, we would want to do EBCDIC to ASCII conversion. If we did this, we would probably also need to support the associated numeric data types such as zoned decimal and packed decimal. There might be a few wrinkles we would have to consider in regard to specifying and retrieving record tags. However, for the most part we again would need to code only the DataCell derived classes, change the RecordHandler's createDataCell method, and add the enumerations for the data types to the BBCommonFileDescription.xsd schema.
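To give a feel for what the new DataCell derived classes would have to do, here is a minimal sketch of decoding the two numeric types. It is not part of the chapter's code: the class name LegacyNumericDecoder and its method names are hypothetical, and it assumes the usual mainframe layouts (packed decimal with the sign in the low nibble of the last byte, EBCDIC zoned decimal with the sign in the high nibble of the last byte). The number of implied decimal places is passed in as scale, since neither format stores a decimal point.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

// Hypothetical helper for illustration only; not one of the chapter's classes.
class LegacyNumericDecoder {

    // Packed decimal: two digits per byte, except the last byte, whose low
    // nibble is the sign (0xD, or the less common 0xB, means negative).
    static BigDecimal unpackDecimal(byte[] field, int scale) {
        requireNonEmpty(field);
        StringBuilder digits = new StringBuilder();
        for (int i = 0; i < field.length; i++) {
            digits.append(digit((field[i] >> 4) & 0x0F));
            if (i < field.length - 1) {
                digits.append(digit(field[i] & 0x0F));
            }
        }
        int sign = field[field.length - 1] & 0x0F;
        return toDecimal(digits, sign, scale);
    }

    // Zoned decimal (EBCDIC assumed): one digit per byte in the low nibble;
    // the high nibble of the last byte carries the sign.
    static BigDecimal unzoneDecimal(byte[] field, int scale) {
        requireNonEmpty(field);
        StringBuilder digits = new StringBuilder();
        for (byte b : field) {
            digits.append(digit(b & 0x0F));
        }
        int sign = (field[field.length - 1] >> 4) & 0x0F;
        return toDecimal(digits, sign, scale);
    }

    // Builds the value from the digit string, applying sign and implied decimals.
    private static BigDecimal toDecimal(StringBuilder digits, int sign, int scale) {
        BigInteger unscaled = new BigInteger(digits.toString());
        if (sign == 0x0D || sign == 0x0B) {
            unscaled = unscaled.negate();
        }
        return new BigDecimal(unscaled, scale);
    }

    private static char digit(int nibble) {
        if (nibble > 9) {
            throw new IllegalArgumentException("Not a decimal digit: " + nibble);
        }
        return (char) ('0' + nibble);
    }

    private static void requireNonEmpty(byte[] field) {
        if (field == null || field.length == 0) {
            throw new IllegalArgumentException("Field must contain at least one byte");
        }
    }
}
```

For example, the packed bytes 0x12 0x34 0x5C with two implied decimal places decode to 123.45, and the zoned bytes 0xF1 0xF2 0xD3 with one implied decimal place decode to -12.3.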
CSV Record Formats
We discussed this a bit at the end of Chapter 7. If you are following the architecture by now, it should be fairly obvious that the FlatSourceConverter and FlatTargetConverter could easily use the CSVRecordWriter and CSVRecordReader instead of the flat file versions we implemented in this chapter. We would need to add or modify a few methods here and there in the CSV record handlers to properly handle record identifiers, but that is about all.
Rounding versus Truncation
I said earlier in the chapter that for numeric fields that allowed truncation of excess fractional digits we would truncate rather than round. However, I realize that there may be business cases for rounding, and doing the rounding in XSLT may not necessarily be the most elegant solution. To add rounding we would need to modify the prepareOutput methods of the DataCellReal and DataCellN classes. To avoid the nastiness of floating point arithmetic, I would probably split the source number into its integer and fractional digits. I would then treat the fractional digits as an integer, round up at the appropriate decimal position, truncate the excess digits, and concatenate the remaining fractional digits to the integer portion on the left of the decimal point.
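The digit-based approach might look something like the following sketch. It is only an illustration, not the actual prepareOutput code: the class DecimalStringRounder and the method roundHalfUp are hypothetical names, and I assume plain decimal strings such as "-12.3456" with no exponent. It varies the description slightly by treating the integer digits and the retained fractional digits together as one integer, so that a carry out of the fraction (1.995 becoming 2.00) needs no special handling.

```java
import java.math.BigInteger;

// Hypothetical helper for illustration only; not one of the chapter's classes.
class DecimalStringRounder {

    // Rounds a plain decimal string half up (by magnitude) to at most
    // maxFractionDigits fractional digits, using only digit arithmetic.
    static String roundHalfUp(String value, int maxFractionDigits) {
        boolean negative = value.startsWith("-");
        String unsigned = negative ? value.substring(1) : value;
        int point = unsigned.indexOf('.');
        String intPart = point < 0 ? unsigned : unsigned.substring(0, point);
        String fracPart = point < 0 ? "" : unsigned.substring(point + 1);
        if (intPart.isEmpty()) {
            intPart = "0";
        }
        if (fracPart.length() <= maxFractionDigits) {
            return value; // no excess digits, so nothing to round
        }

        String kept = fracPart.substring(0, maxFractionDigits);
        boolean roundUp = fracPart.charAt(maxFractionDigits) >= '5';

        // Integer digits plus kept fraction digits, handled as a single integer.
        BigInteger scaled = new BigInteger(intPart + kept);
        if (roundUp) {
            scaled = scaled.add(BigInteger.ONE);
        }

        // Restore leading zeros that the integer conversion dropped.
        String digits = scaled.toString();
        while (digits.length() <= maxFractionDigits) {
            digits = "0" + digits;
        }

        int split = digits.length() - maxFractionDigits;
        String result = maxFractionDigits == 0
                ? digits
                : digits.substring(0, split) + "." + digits.substring(split);

        // Avoid producing "-0.00" when a small negative value rounds to zero.
        return (negative && scaled.signum() != 0) ? "-" + result : result;
    }

    public static void main(String[] args) {
        System.out.println(roundHalfUp("1.005", 2));    // prints 1.01
        System.out.println(roundHalfUp("1.995", 2));    // prints 2.00
        System.out.println(roundHalfUp("-12.3456", 2)); // prints -12.35
    }
}
```

Note that 1.005 rounds to 1.01 here, which is exactly the kind of case where binary floating point tends to surprise, since 1.005 has no exact double representation.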
Many legacy applications have something similar to a GROUP field in COBOL, that is, a structure of two or more fields, grouped together, that may repeat. To support these with the current implementation, each member field of each group must be specifically defined in the file description documents. Doing this loses any notion of a group. Supporting this type of functionality is fairly low on my priority list, but it could be accomplished using a strategy similar to how we'll handle composite data structures in EDI in Chapter 9.
COBOL specifically supports a REDEFINES clause, similar to a union in C. I'm not sure of the advantages of supporting this feature, but I'm sure someone could think of some. For example, supporting REDEFINES might be attractive to people who want to develop file description documents directly from COBOL copybooks. I've not tested it, but I see no reason why the FlatToXML converter couldn't support redefined fields right now. The XMLToFlat converter might also support the feature, but if it did, it would, in essence, write out and overlay the redefined bytes once for each redefinition. I see no advantage in that!
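The asymmetry between reading and writing redefined fields can be seen in a small sketch. Everything in it is hypothetical (the RedefinesSketch class, the FieldDef structure, and the sample record); it does not reflect how the converters actually store field definitions. It simply assumes that a redefinition amounts to two or more field definitions covering the same offsets in a record.

```java
// Hypothetical illustration only; not how the chapter's converters are built.
class RedefinesSketch {

    // A field definition: name, zero-based offset, and length in characters.
    static final class FieldDef {
        final String name;
        final int offset;
        final int length;

        FieldDef(String name, int offset, int length) {
            this.name = name;
            this.offset = offset;
            this.length = length;
        }
    }

    // Reading: overlapping definitions are harmless, each just sees the same bytes.
    static String read(String record, FieldDef field) {
        return record.substring(field.offset, field.offset + field.length);
    }

    // Writing: each definition overlays whatever was written there before,
    // padding with spaces or truncating to the field length.
    static void write(char[] record, FieldDef field, String value) {
        for (int i = 0; i < field.length; i++) {
            record[field.offset + i] = i < value.length() ? value.charAt(i) : ' ';
        }
    }

    public static void main(String[] args) {
        FieldDef date = new FieldDef("OrderDate", 0, 8);
        FieldDef year = new FieldDef("OrderYear", 0, 4); // redefines part of OrderDate

        String source = "20110511ABC";
        System.out.println(read(source, date)); // prints 20110511
        System.out.println(read(source, year)); // prints 2011

        char[] target = "           ".toCharArray();
        write(target, date, "20110511");
        write(target, year, "1999"); // overlays the first four characters
        System.out.println(new String(target, 0, 8)); // prints 19990511
    }
}
```

On the read side both views of the bytes come out cleanly. On the write side the last definition written wins, which is the overlay behavior described above.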