Import options for term bases
You can import term base contents from two file types:
· CSV or tab-delimited text, or
· TMX (Translation Memory Exchange).
Import options for CSV and tab-delimited text files
To import the contents of a CSV or text file into a term base correctly, you need to set several options so that memoQ has the information it needs about the format and structure of the file. In other words, you need to tell memoQ what the file looks like because the program cannot guess it. There are four aspects of the import to set; they are listed below.
Important: For a detailed description of each import option, see the Term base CSV import settings dialog topic.
· Encoding: Specifies the encoding that is used in the CSV or text file you are importing. You need to specify exactly the same encoding as in the file. memoQ’s default setting is UTF-8, a flavour of Unicode, which is suitable for most imports. However, you can choose from many other encodings if this is not suitable. While it is useful to know about the various encodings, you can manage without this knowledge: memoQ displays a preview of the file at the bottom of the Term base CSV import settings dialog where you can check if all characters are imported correctly.
· Delimiter: Specifies the delimiter character used to separate the table cells in the CSV file you are importing. You need to specify exactly the same delimiter character as used in the imported file. The default is the comma, but in many cases, CSV or tab-separated text files are saved with the tab character as the delimiter.
· Header row: Some CSV or tab-separated text files include a header row in the first line of the file. You can tell memoQ whether your import file has this row; if it does, memoQ will know how each table column is labelled.
· Field mapping: In memoQ, a term base entry has a fixed set of properties. These properties also have fixed names. On the other hand, the column headers in the CSV file can be different, or they can be missing if there is no header row in the file. In either case, you need to tell memoQ which of the CSV columns correspond to each property of a term base entry. For example, memoQ uses the English names of languages to label term fields (the fields where the actual terms are stored). If you receive a CSV file where the column headings are in Spanish, your file will look like this:
Inglés,Español
translation,traducción
translation memory,memoria de traducción
With this CSV file, you need to tell memoQ that the column called Inglés contains terms in English, and the column called Español has terms in Spanish.
If, however, the column headings match the internal property labels used by memoQ (for example, if the term columns are labelled with the English name of the language), memoQ will recognize the match and automatically map the column in the CSV file to the corresponding property of the term base entry. If the above example looks like this:
English,Spanish
translation,traducción
translation memory,memoria de traducción
memoQ will automatically recognize that both columns contain terms, one in English, and the other in Spanish.
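If you prefer to prepare the file before importing it, the column headings can be renamed outside memoQ so that they match the English language names. The following Python sketch is one way to do this; it is not part of memoQ, and the `HEADER_MAP` table and the `rename_headers` function are hypothetical names chosen for illustration.

```python
import csv
import io

# Hypothetical mapping from the headings found in the file
# to the English language names that memoQ uses for term fields.
HEADER_MAP = {"Inglés": "English", "Español": "Spanish"}


def rename_headers(csv_text, header_map, delimiter=","):
    """Return csv_text with the first-row labels replaced according to header_map."""
    rows = list(csv.reader(io.StringIO(csv_text), delimiter=delimiter))
    if not rows:
        return csv_text
    # Labels that are not in the map are left unchanged.
    rows[0] = [header_map.get(label.strip(), label) for label in rows[0]]
    out = io.StringIO()
    csv.writer(out, delimiter=delimiter, lineterminator="\n").writerows(rows)
    return out.getvalue()


sample = (
    "Inglés,Español\n"
    "translation,traducción\n"
    "translation memory,memoria de traducción\n"
)
converted = rename_headers(sample, HEADER_MAP)
print(converted)
```

After this step the first line of the file reads English,Spanish, so the columns no longer need to be mapped by hand.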
Note: If the CSV file you are importing was exported from another copy of memoQ, using the default export settings, you do not need to adjust the import options to import the CSV file correctly.
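For files that come from other sources, it can help to check the encoding and the delimiter before opening the import dialog. The Python sketch below is only an illustration of such a check and does not reflect how memoQ reads the file; `guess_delimiter` and `preview` are hypothetical helper names, and the delimiter guess simply counts candidate characters in the first line.

```python
import csv
import io


def guess_delimiter(first_line, candidates=("\t", ",", ";")):
    """Pick the candidate character that occurs most often in the first line."""
    return max(candidates, key=first_line.count)


def preview(raw_bytes, encoding="utf-8", max_rows=3):
    """Decode the bytes, guess the delimiter, and return (delimiter, first rows).

    Decoding may raise UnicodeDecodeError if the encoding guess is wrong.
    """
    text = raw_bytes.decode(encoding)
    lines = text.splitlines()
    if not lines:
        return None, []
    delimiter = guess_delimiter(lines[0])
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    return delimiter, rows[:max_rows]


raw = "English\tSpanish\ntranslation\ttraducción\n".encode("utf-8")
delimiter, rows = preview(raw)
print(repr(delimiter), rows)
```

If the accented characters in the printed rows look wrong, or decoding fails, the encoding guess does not match the file, which is the same thing the preview in the import dialog lets you check.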
Importing an Excel spreadsheet into a memoQ term base
When you have a glossary in the form of a Microsoft Excel workbook, here is what you can do to import it into a memoQ term base:
1. Make sure that your table has a header row, and that the term columns are labelled with the English name of each language.
2. Save the workbook (more precisely, the current worksheet) in the Unicode text format. This format is available in the Files of type drop-down list of the Save as dialog in Excel. The result is a tab-separated text file with the .txt extension, which is suitable for importing into memoQ. Note that Excel normally writes the Unicode text format in UTF-16 rather than UTF-8.
Note: For detailed instructions on performing the above steps, refer to Microsoft Excel help.
3. When you import this file into a memoQ term base, select the encoding that matches the file (check the preview in the Term base CSV import settings dialog: if the characters look wrong with UTF-8, choose a UTF-16 encoding instead), select the tab character as the delimiter, and indicate that your file has a header row.
Tip: If you have a glossary in the form of a table in a Word document or RTF file, select the entire table in Word (or in the word processor you are using), copy it, and paste it into a Microsoft Excel worksheet. Then follow the above steps to create the text file to import into memoQ.
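Depending on the Excel version and the format chosen, the saved text file may be encoded in UTF-16 rather than UTF-8. If you would rather import with the default UTF-8 setting, the file can be re-encoded first. The Python sketch below is one possible way to do this, assuming the file starts with a byte order mark when it is UTF-16; `to_utf8` is a hypothetical helper name.

```python
import codecs


def to_utf8(raw_bytes):
    """Re-encode text bytes as UTF-8, using the byte order mark to spot UTF-16."""
    if raw_bytes.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The "utf-16" codec reads the byte order mark and picks the byte order.
        text = raw_bytes.decode("utf-16")
    else:
        # Assumption: anything without a UTF-16 mark is UTF-8;
        # "utf-8-sig" drops an optional UTF-8 byte order mark.
        text = raw_bytes.decode("utf-8-sig")
    return text.encode("utf-8")


sample = "English\tSpanish\ntranslation\ttraducción\n"
utf16_bytes = sample.encode("utf-16")   # stands in for a file saved from Excel
utf8_bytes = to_utf8(utf16_bytes)
print(utf8_bytes.decode("utf-8"))
```

The tab delimiter and the header row are not affected by the conversion, so the remaining import options stay the same.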
Import options for TMX files
memoQ can import the contents of TMX files into term bases. While TMX files are traditionally used to store translation memory contents, some translation memories contain short phrases that also occur within running text. Typically, these are software commands or messages, which end up in a translation memory when the user interface of a software application is localized.
Afterwards, when the documentation of the same software application is translated, it is important to refer to each command and message as it is displayed in the program. The handiest tool for this is a term base because
a. memoQ highlights phrases in the source text when they are found in a term base,
b. memoQ’s QA module can check if the translations of the phrases (in the case of software localization, messages and commands) are correctly and consistently applied.
When you select a TMX file to import into a memoQ term base, you need to set the following options:
· Field mapping: TMX files usually contain source-language and target-language strings. You need to map the languages in the TMX file to the languages in the term base.
· You can also fill in some metadata properties such as the domain, subject, author and date/time information in each term base entry. You can instruct memoQ to take this information from the TMX file where it is present, or you can specify default values.
· You can also limit the number of entries that will be imported. This might be important because a TMX file can contain several hundred thousand or even millions of entries, and this number of entries might be inconvenient to manage in a single memoQ term base. memoQ’s term base engine was tested for up to one million entries per term base.
Important: For a detailed description of each import option, see the Term base TMX import settings dialog topic.
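To get an idea of what a TMX file would contribute to a term base, you can inspect it before importing. The Python sketch below reads the translation units, keeps only short source segments, and stops at a maximum number of entries. It is an illustration only and does not reflect how memoQ performs the import: `short_pairs` is a hypothetical helper, the word-count filter is an assumption made here (the import options above only limit the number of entries), and languages are compared by their primary code only (for example, en-US is treated as en).

```python
import xml.etree.ElementTree as ET

# Attribute key that ElementTree uses for xml:lang.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

SAMPLE_TMX = """<tmx version="1.4"><header/><body>
<tu><tuv xml:lang="en-US"><seg>Save As</seg></tuv>
    <tuv xml:lang="es-ES"><seg>Guardar como</seg></tuv></tu>
<tu><tuv xml:lang="en-US"><seg>The file could not be saved because the disk is full.</seg></tuv>
    <tuv xml:lang="es-ES"><seg>No se pudo guardar el archivo porque el disco está lleno.</seg></tuv></tu>
<tu><tuv xml:lang="en-US"><seg>Print</seg></tuv>
    <tuv xml:lang="es-ES"><seg>Imprimir</seg></tuv></tu>
</body></tmx>"""


def short_pairs(tmx_text, src_lang, tgt_lang, max_words=4, limit=1000):
    """Return up to `limit` (source, target) pairs whose source has at most max_words words."""
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):
        segments = {}
        for tuv in tu.findall("tuv"):
            # Older TMX files use a plain "lang" attribute instead of xml:lang.
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also collects text inside inline markup elements.
                segments[lang.split("-")[0]] = "".join(seg.itertext()).strip()
        src, tgt = segments.get(src_lang), segments.get(tgt_lang)
        if src and tgt and len(src.split()) <= max_words:
            pairs.append((src, tgt))
            if len(pairs) >= limit:
                break
    return pairs


pairs = short_pairs(SAMPLE_TMX, "en", "es")
print(pairs)
```

In this sample, the long error message is skipped and the two short commands are kept; lowering `limit` cuts the list off earlier, which mirrors the entry limit described above.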