Data Warehouse and BI Documentation

Data Warehouse and BI Documentation

We all seem to skimp on documentation in the run up to system deployment. It seems as though the business should be able to use the system without a ton of documentation. After all, we spent a lot of time and trouble organizing and naming things in a sensible way. The bad news here is that the team needs to do a lot of documentation of the system in order to offer a complete solution to the users. The good news is most of the documentation is really metadata dressed up in presentable clothes. If you’ve been capturing metadata all along, much of the job now is to create a nice front end for users to access that metadata. If you’ve been ignoring the metadata issue, you’ve got a lot of work ahead of you.

As we detail in Chapters 8 and 9, the BI portal is the organization’s single source for reporting and analysis and associated information. The main content of the BI portal will be the navigation hierarchy and the standard reports contained therein. Around the edges of the main BI portal page, users should find links to all the documentation and tools described here.

Core Descriptions

The first things to document are the data: the business process subject areas including facts and dimensions, and the tables, columns, calculations, and other rules that make up those subject areas. Standard reports and other BI applications should also be documented, though their documentation is often integrated with the reports themselves.

Business Process Dimensional Model Descriptions

The starting point for most BI documentation is the business process dimensional model. The DW/BI team must write a clear, succinct description of each dimensional model in the warehouse. This document will be the starting point for anyone who wants to understand what’s in the DW/BI system. If Orders was the initial row selected on the bus matrix, write a document that describes the Orders dimensional model. This document answers such questions as:

  • What’s the nature of the business process captured in this data?

  • What are the salient business rules?

  • What’s the grain of the fact table?

  • What date range is included in the fact table?

  • What data has been left out (and why)?

  • What dimensions participate in this business process? Many of the dimensions will need their own descriptive documents that this document can link to.

This document should have a few screen captures that show the target dimensional model in a graphical form (like a data model), some example values, and a few reports to demonstrate the kinds of business questions it can address.

Table and Column Descriptions

Once people have a general understanding of a particular schema, they need to be able to drill down into the details, table by table and column by column. This is where the descriptive metadata you captured when you were building the initial target model comes back into service. Figure illustrates this descriptive metadata. The table name, column names, descriptions, comments, and sample values would all be helpful to a user trying to understand the contents of the table in question. Chapter 13 describes a simple data model for pulling this descriptive metadata together in a form that can be the basis for a set of user reports.

If you’re using a third-party front-end tool, you’ll need to integrate your descriptive metadata into the tool’s metadata layer. This typically involves using the tool’s administrative client to map the table and column descriptions into the tool’s metadata structures. In order to take full advantage of the tool’s metadata, plan to create additional entries that define the business process dimensional model, role-playing tables, join paths, and perhaps even calculated fields.

Report Descriptions

Each report must have a base set of descriptive information as part of the standard template described in Chapter 8. Some of this information, like the report title and description, can be written out to the Reporting Services metadata structures when the reports are built or updated. Other information will need to be captured in the metadata repository described in Chapter 13. In particular, the navigation framework described in Chapter 9, and the assignment of individual reports to categories and groups, help people understand what information is available. These category assignments should follow the same organizing framework used to present the reports in the BI portal. In fact, this metadata could be used to dynamically create the portal interface.

Additional Documentation

Data and report documentation are certainly the most commonly used, but other documentation is also important. The most valuable additional documentation comes in the form of online tutorials, support guides, and a list of colleagues who use the system and may be able to help.

As we discuss later in this chapter, you should develop and deliver training to the business users. This training should be mandatory for business users who’ll be creating ad hoc queries, but it’s useful for everyone. Realistically, not all users will come to a class. Even if you do have 100 percent attendance, users can benefit from online tutorials and class materials. These may be as simple as an annotated version of the classroom materials made available on the web site. Some of the front end tool vendors offer specialized software to develop and deliver online training materials. These systems guide people through the materials, include integrated exercises and self-tests. They also allow you to track who has taken what training, how far they got, and how well they did. While this is more work, it may be worth it in a large, distributed organization.

A support guide will help your business users know whom to call when they have a problem. You should list the escalation hierarchy with contact names, emails, and phone numbers. You may get significant leverage out of publishing a list of frequently asked questions and answers. We discuss user support issues later in this chapter.

Knowing who else is trained can guide users to people close by who may be able to help them. Keep a current list of users available on the BI portal, with an indicator showing which users are designated analytic support people and which users have at least had ad hoc tool training. Include a simple report showing query activity by user in the last month or so, sorted from most to least active.

Additional Functions

The documentation described previously is critical for users to learn about the DW/BI system. Beyond the documentation itself, there are a few tools that make the documentation much more accessible to the users. Primarily, these are a metadata browser, a search function, and warehouse activity monitoring.

  • Metadata browser: The metadata browser is the reporting and navigation front end for the metadata repository. Microsoft has created standard reports to display and navigate the metadata from several of their main tools, including Integration Services and Reporting Services. Installing these and making them available to your users through a single navigation page is a strong first step in metadata browsing. However, this does not include the basic metadata that describes the contents of the DW/BI system: the business process dimensional models that people will see in the query tools. Create a few Reporting Services reports that allow users to browse this metadata as well. As we describe in Chapter 13, the metadata schema is easy to navigate and can be accessed with a few simple reports.

  • Search function: The ability to search the contents of the warehouse, and especially the report descriptions, is mandatory in today’s “Find” function search-driven world. This presents a bit of a challenge for us architecturally because some of the information users want to search is on the DW/BI web site, some of it is in database tables, and some might even be in XML documents scattered about the system.

  • Warehouse activity monitors: Power users and the DW/BI team always want to know what’s going on in the DW/BI system right now. You might hear this question in slightly different forms, like “Who’s on the system?” or “Why is the report so slow?” It’s fairly easy to develop a few reports that execute against the SQL Server system tables or Analysis Services database. You can even set up these reports to drill down into the individual user sessions and examine the underlying query. We’ve been surprised at how well some of our stronger users have been able to understand the issue of query workload by using these reports. Often, they take matters into their own hands when resources are constrained and call the offending user directly. We discuss activity monitoring in greater detail in Chapter 15.

Previous Section
Next Section

 Python   SQL   Java   php   Perl 
 game development   web development   internet   *nix   graphics   hardware 
 telecommunications   C++ 
 Flash   Active Directory   Windows