Removing Hidden Data and Personal Information from Microsoft Office Documents






Information that you mean to keep to yourself, or at least within your company’s network or among a group reviewing a document, can appear in a file in places that you might not check before you distribute the document, post it on a Web site, or attach it to an e-mail message that you send to a customer, a partner, or your executive team. This information can reveal details about your organization or about the document itself that you might not want to share. So, just as you proofread and review a document for accuracy before you share it, you need to take time to review your documents for hidden data or personal information that might be stored in the document itself or in the document’s properties.

This section explains how to use the Document Inspector, a feature in Word, Excel, and PowerPoint that can help you find and remove hidden data and personal information from files you create with these programs. The Document Inspector is organized into modules, each of which checks for one type of content. When you run the Document Inspector, you select one or more modules and check whether any information of that type is present in the file. Figure–2 lists the Document Inspector modules for each program.

Figure–2: The Document Inspector Modules
Word                                            Excel                                         PowerPoint
----------------------------------------------  --------------------------------------------  --------------------------
Comments, Revisions, Versions, and Annotations  Comments and Annotations                      Comments and Annotations
Document Properties and Personal Information    Document Properties and Personal Information  Document Properties and Personal Information
Custom XML Data                                 Custom XML Data                               Custom XML Data
Headers, Footers, and Watermarks                Headers and Footers                           Invisible On-Slide Content
Hidden Text                                     Hidden Rows and Columns                       Off-Slide Content
                                                Hidden Worksheets                             Presentation Notes
                                                Invisible Content

The following list describes in more detail the types of hidden data and personal information that the Document Inspector can find and remove:

  • Comments, revision marks from tracked changes, versions, and ink annotations   Items such as revision marks from tracked changes, comments, ink annotations, or document versions can let other people see the names of people who worked on your document, comments from reviewers, and changes that were made to your document. This is the type of information you might not want to circulate beyond the group of people you worked with on a document.

  • Document properties and personal information   Document properties (sometimes known as metadata) provide details about your document. Properties list details such as the document’s author, its subject, and its title. Document properties also reveal information such as the name of the person who saved the document most recently, which is information that Microsoft Office maintains about the document. A document might also contain information that can be used to identify you or someone who worked on the document, such as your name, address, and e-mail address. E-mail messages might include this information in the message header. A program such as Word could include it in a routing slip, a printer location (path), or in the file path used for publishing Web pages.

  • Headers, footers, and watermarks   Word documents and Excel workbooks can contain information in headers and footers. In addition, you might have added a watermark to your Word document.

  • Hidden text   Word documents can contain text that is formatted as hidden text. Hidden text is sometimes used for instructions or other types of comments intended to guide a user in preparing a document. This information should likely be removed before a document is circulated.

  • Hidden rows, columns, and worksheets   In an Excel workbook, rows, columns, and entire worksheets can be hidden. Hidden areas are often used to maintain formulas, some of which might be sensitive and proprietary. A good example would be a discount schedule that your company applies to orders of a certain dollar volume or quantity. If you distribute a copy of a workbook that contains hidden rows, columns, or worksheets, other people might display these rows, columns, or worksheets and view the data that they contain.

  • Invisible content   PowerPoint presentations and Excel workbooks can contain objects that are not visible because they are formatted as invisible. These objects may have been made invisible for good reason, but they remain in the file, where other people can display them.

  • Off-slide content   PowerPoint presentations can contain objects that are not immediately visible because they were dragged off the slide. This content might include text boxes, clip art, graphics, and tables that were once considered for a presentation but are not part of the finished product.

  • Presentation notes   The Notes section of a PowerPoint presentation can contain text that you might not want to share. Often, these notes are written by or for the individual who is making the presentation and are not considered part of the presentation itself.

  • Document server properties   If your document was saved to a location on a document management server, such as a Document Workspace site or a Windows SharePoint Services document library, the document might contain additional document properties or information related to this server location.

  • Custom XML data   Documents can contain custom XML data that is not visible in the document itself. The Document Inspector can find and remove this XML data.
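Several of the items in this list can also be spotted from outside the Office programs, because files saved in the Office Open XML formats (.docx, .xlsx, .pptx) are ZIP packages of XML parts. The following Python sketch is only an illustration and is not how the Document Inspector itself works. It assumes the part names that Office conventionally uses (for example, word/comments.xml and ppt/notesSlides/), and the function name find_hidden_data is invented for this example. It only reports what it finds and removes nothing.

```python
import zipfile
import xml.etree.ElementTree as ET

CORE_NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}
WORD_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"
SHEET_NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def find_hidden_data(package):
    """Report hidden data in a .docx, .xlsx, or .pptx file (path or binary file object)."""
    report = {}
    with zipfile.ZipFile(package) as zf:
        names = zf.namelist()

        # Document properties: author and last-saved-by are stored in docProps/core.xml.
        if "docProps/core.xml" in names:
            root = ET.fromstring(zf.read("docProps/core.xml"))
            props = {}
            for label, tag in (("author", "dc:creator"),
                               ("last_saved_by", "cp:lastModifiedBy")):
                node = root.find(tag, CORE_NS)
                if node is not None and node.text:
                    props[label] = node.text
            if props:
                report["document_properties"] = props

        # Comments, custom XML data, and presentation notes live in their own parts.
        # These prefixes are the conventional part names, which is an assumption.
        part_prefixes = {
            "comments": ("word/comments.xml", "xl/comments", "ppt/comments/"),
            "custom_xml_data": ("customXml/",),
            "presentation_notes": ("ppt/notesSlides/",),
        }
        for label, prefixes in part_prefixes.items():
            parts = [n for n in names if n.startswith(prefixes)]
            if parts:
                report[label] = parts

        # Word: count tracked insertions and deletions, and look for hidden text.
        # Simplification: any w:vanish element counts, even one switched off.
        if "word/document.xml" in names:
            body = ET.fromstring(zf.read("word/document.xml"))
            revisions = sum(1 for el in body.iter()
                            if el.tag in (WORD_NS + "ins", WORD_NS + "del"))
            if revisions:
                report["tracked_changes"] = revisions
            if body.find(".//" + WORD_NS + "vanish") is not None:
                report["hidden_text"] = True

        # Excel: hidden worksheets carry state="hidden" or state="veryHidden".
        if "xl/workbook.xml" in names:
            workbook = ET.fromstring(zf.read("xl/workbook.xml"))
            hidden = [s.get("name") for s in workbook.iter(SHEET_NS + "sheet")
                      if s.get("state") in ("hidden", "veryHidden")]
            if hidden:
                report["hidden_worksheets"] = hidden
    return report
```

Calling find_hidden_data on a file path returns a dictionary whose keys name the kinds of hidden data that were detected. An empty dictionary means only that this sketch found nothing, not that the file is clean.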

Note 

Developers can create customized Document Inspector modules to check for other hidden information in a document. If your organization has customized the Document Inspector by adding modules, you might be able to check your documents for additional types of information.
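The idea of adding a module can be pictured with a small plug-in sketch in Python. This is not the Office extensibility interface. The InspectorModule class, the run_inspection function, and the "PRJ-1234" code-name pattern are all hypothetical, and plain text stands in for a document. The sketch only shows the general shape: each module knows how to inspect for one kind of content and how to remove it.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class InspectorModule:
    """Hypothetical module: `inspect` lists findings, `fix` returns cleaned text."""
    name: str
    inspect: Callable[[str], List[str]]
    fix: Callable[[str], str]


# Example custom module: flags internal project code names such as "PRJ-1234".
# The pattern is invented for this illustration.
CODE_NAME = re.compile(r"\bPRJ-\d{4}\b")

code_name_module = InspectorModule(
    name="Internal code names",
    inspect=lambda text: CODE_NAME.findall(text),
    fix=lambda text: CODE_NAME.sub("[removed]", text),
)


def run_inspection(text: str, modules: Iterable[InspectorModule]) -> Dict[str, List[str]]:
    """Run only the selected modules and report what each one found."""
    return {m.name: m.inspect(text) for m in modules}
```

Keeping inspection separate from removal mirrors the way the Document Inspector reports its findings first and lets you decide afterward what to remove.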

The Document Inspector does its work in three steps. First, you select which modules you want to use. For example, you might want to check for comments and revisions but skip the Custom XML Data module because you know the document contains no custom XML. Second, after the inspection, you see a report showing what, if anything, the Document Inspector found. Third, you choose which information you want to remove from the document. It is a good idea to inspect a copy of the document you’re working on rather than the original, because you cannot always restore the information that the Document Inspector removes. After you’ve made a copy of the document, open the copy and follow these steps:

  1. Click the Microsoft Office Button, point to Prepare, and then click Inspect Document.

  2. In the Document Inspector dialog box, shown here for PowerPoint, select the options for the types of hidden and personal content that you want to check for.

    [Image: the Document Inspector dialog box in PowerPoint]
  3. Click Inspect.

  4. Review the results of the inspection in the Document Inspector dialog box, shown next.

    [Image: inspection results in the Document Inspector dialog box]
  5. Click Remove All next to the items for the types of hidden content that you want to remove from your document.
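The advice to work on a copy can be illustrated outside the Office programs as well. The Python sketch below is a minimal illustration, not a replacement for the Document Inspector. It assumes a file in an Office Open XML format (.docx, .xlsx, or .pptx), the function name strip_personal_properties is invented for this example, and the output has not been checked against every Office version. It writes a new file with the author and last-saved-by properties blanked and never modifies the original.

```python
import zipfile
import xml.etree.ElementTree as ET

CORE_PART = "docProps/core.xml"
CP_URI = "http://schemas.openxmlformats.org/package/2006/metadata/core-properties"
DC_URI = "http://purl.org/dc/elements/1.1/"
DCTERMS_URI = "http://purl.org/dc/terms/"
PERSONAL_TAGS = ("{%s}creator" % DC_URI, "{%s}lastModifiedBy" % CP_URI)

# Keep the usual prefixes when the part is rewritten. Attribute values in this
# part can refer to the "dcterms" prefix, so it should not be renamed.
ET.register_namespace("cp", CP_URI)
ET.register_namespace("dc", DC_URI)
ET.register_namespace("dcterms", DCTERMS_URI)


def strip_personal_properties(original, cleaned):
    """Write a copy of `original` to `cleaned` with author and last-saved-by blanked.

    Returns the property values that were removed. The original is only read.
    """
    removed = []
    with zipfile.ZipFile(original) as src, \
         zipfile.ZipFile(cleaned, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == CORE_PART:
                root = ET.fromstring(data)
                for el in root.iter():
                    if el.tag in PERSONAL_TAGS and el.text:
                        removed.append(el.text)
                        el.text = ""
                data = ET.tostring(root, encoding="utf-8", xml_declaration=True)
            # Every other part is copied through unchanged.
            dst.writestr(item.filename, data)
    return removed
```

Because the function writes to a second file, a mistaken removal costs nothing: the original still holds the information, which is the same reason for running the Document Inspector on a copy.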


