Getting Your Schema in Bulk from Existing XML Files


You have joined a project in which XML is used for data transmission, but the programmers who came before you did not use an XSD for one reason or another. You need to generate starter schema files for each of the XML samples.


Use the XmlSchemaInference class to infer a schema from each XML sample. The GenerateSchemaForDirectory method listed below enumerates all of the XML files in a given directory and processes each of them with the XmlSchemaInference.InferSchema method. Once the schemas for a file have been inferred, it iterates over the resulting collection and saves each schema to an XSD file using a FileStream.

Generating an XML Schema

using System.IO;
using System.Xml;
using System.Xml.Schema;

public static void GenerateSchemaForDirectory(string dir)
{
    // Make sure the directory exists.
    if (Directory.Exists(dir))
    {
        // Get the XML files in the directory.
        string[] files = Directory.GetFiles(dir, "*.xml");
        foreach (string file in files)
        {
            // Set up a reader for the file.
            using (XmlReader reader = XmlReader.Create(file))
            {
                XmlSchemaInference schemaInference = new XmlSchemaInference();

                // Infer the schema from the sample document.
                XmlSchemaSet schemaSet = schemaInference.InferSchema(reader);

                // Note: if a sample yields more than one schema (one per
                // namespace), they all map to this same path.
                foreach (XmlSchema schema in schemaSet.Schemas())
                {
                    // Make the schema file path next to the source file.
                    string schemaPath = Path.Combine(
                        Path.GetDirectoryName(file),
                        Path.GetFileNameWithoutExtension(file) + ".xsd");
                    // FileMode.Create truncates an existing file, so no
                    // stale bytes are left behind from an older, longer XSD.
                    using (FileStream fs =
                        new FileStream(schemaPath, FileMode.Create))
                    {
                        schema.Write(fs);
                    }
                }
            }
        }
    }
}

The GenerateSchemaForDirectory method can be called like this:

	// Get the directory two levels up from where we are running.
	DirectoryInfo di = new DirectoryInfo(@"..\..");
	string dir = di.FullName;
	// Generate the schema.
	GenerateSchemaForDirectory(dir);


Having an XSD for the XML files in an application allows for a number of things:

  1. Validation of XML presented to the system

  2. Documentation of the semantics of the data

  3. Programmatic discovery of the data structure through XML reading methods
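
As an illustration of the first point, validation, here is a minimal sketch of checking an XML document against an XSD with a validating XmlReader. The class name SchemaValidationSketch and its Validate method are hypothetical names introduced for this example; they are not part of the recipe above. The sketch takes the XML and XSD as strings so it can be tried without any files on disk.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

// Hypothetical helper, for illustration only.
public static class SchemaValidationSketch
{
    // Validates XML text against XSD text and returns the validation
    // errors that were reported. An empty list means no errors were raised.
    public static List<string> Validate(string xml, string xsd)
    {
        List<string> messages = new List<string>();

        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;

        using (XmlReader schemaReader = XmlReader.Create(new StringReader(xsd)))
        {
            // Passing null uses the targetNamespace declared in the schema.
            settings.Schemas.Add(null, schemaReader);
        }

        // Collect problems instead of throwing on the first one.
        settings.ValidationEventHandler +=
            (sender, e) => messages.Add(e.Severity + ": " + e.Message);

        using (XmlReader reader =
            XmlReader.Create(new StringReader(xml), settings))
        {
            // Validation happens as a side effect of reading the document.
            while (reader.Read()) { }
        }

        return messages;
    }
}
```

With default settings, only validation errors reach the handler; warnings (for example, an element with no matching schema) are reported only if XmlSchemaValidationFlags.ReportValidationWarnings is added to settings.ValidationFlags.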

Using the GenerateSchemaForDirectory method can jump-start the process of developing schemas for your XML, but each schema should be reviewed by the team member responsible for producing the XML. The review helps to ensure that the rules stated in the schema are correct and that additional items, such as schema default values and other relationships, are added. The generator sees only the sample documents, so any relationship that does not appear in the example XML files will be missing from the inferred schema.
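
One way to reduce what the generator misses is to feed it more than one sample of the same document type. XmlSchemaInference has an InferSchema overload that takes an existing XmlSchemaSet and refines it with each additional document. The following is a rough sketch of that idea, not a replacement for the review step; the class name SchemaRefinementSketch and the method InferFromSamples are hypothetical, and the samples are passed as strings to keep the example self-contained.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

// Hypothetical helper, for illustration only.
public static class SchemaRefinementSketch
{
    // Infers a schema from the first sample, then refines it with each
    // further sample so that items seen only in later documents are included.
    public static XmlSchemaSet InferFromSamples(IEnumerable<string> xmlSamples)
    {
        XmlSchemaInference inference = new XmlSchemaInference();
        XmlSchemaSet schemaSet = null;

        foreach (string xml in xmlSamples)
        {
            using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
            {
                schemaSet = (schemaSet == null)
                    ? inference.InferSchema(reader)              // first sample
                    : inference.InferSchema(reader, schemaSet);  // refine
            }
        }

        // Return an empty set rather than null when no samples were given.
        return schemaSet ?? new XmlSchemaSet();
    }
}
```

Each schema in the returned set can then be written out with XmlSchema.Write, as in GenerateSchemaForDirectory. Refinement only widens the schema to cover what the samples contain, so the result still needs a human review.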

See Also

See the "XmlSchemaInference Class" and "XML Schemas (XSD) Reference" topics in the MSDN documentation.
