Unlike more specialized document formats such as the Microsoft Office formats, each of which accommodates one specific type of document, XML (Extensible Markup Language) is a general standard for storing any kind of information. Making use of the extensible nature of XML, authors of XML documents are free to create new “breeds” of XML (document types) that store information in a structure of the author’s choice. This allows a potentially infinite number of very different XML document types, so XML documents cannot all be treated the same way automatically: each type needs to be processed differently for translation. Some of the file formats handled by memoQ (including InDesign INX) are actually XML document types themselves. The exact XML document type is normally described by a set of rules called a schema, a common form of which is a DTD file. A DTD file contains the complete description of a specific document type: it lays out all the elements that documents of that type may contain, as well as the structures in which these elements can be arranged.
To prepare an XML document for translation, translatable textual content must be separated from structure and non-translatable content. memoQ carries out this process each time an XML document is imported for translation, following the rules set out in the XML format configuration for the document type. To import a new type of XML document, the user must obtain or create a format configuration. DTD files or sample documents can help the user with this.
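The idea of separating translatable text from structure can be illustrated with a minimal Python sketch. This is not memoQ's implementation or its configuration format: the sample document, the tag names, and the TRANSLATABLE_TAGS set are all hypothetical and stand in for the rules that a format configuration would supply for one document type.

```python
import xml.etree.ElementTree as ET

# Hypothetical "format configuration": the elements whose text is translatable.
# In a real document type, these would be chosen by studying the DTD or samples.
TRANSLATABLE_TAGS = {"title", "para"}

# Hypothetical sample document of an invented document type.
SAMPLE = """<article id="a1">
  <title>Getting started</title>
  <meta><code>X-100</code></meta>
  <para>Press the power button.</para>
</article>"""


def extract_translatable(xml_text, translatable_tags):
    """Return (tag, text) pairs for elements marked as translatable."""
    root = ET.fromstring(xml_text)
    segments = []
    # iter() walks the element tree in document order.
    for elem in root.iter():
        if elem.tag in translatable_tags and elem.text and elem.text.strip():
            segments.append((elem.tag, elem.text.strip()))
    return segments


segments = extract_translatable(SAMPLE, TRANSLATABLE_TAGS)
for tag, text in segments:
    print(f"{tag}: {text}")
```

In this sketch the product code inside the code element and the id attribute are left out of the extracted segments because the assumed configuration does not list them as translatable. A different document type would need a different set of rules, which is why each new type requires its own format configuration.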
memoQ provides a customizable XML filter that can be used to import practically any XML document for translation. For new XML document types, a format configuration must be defined.