Statistics

Using the Statistics dialog, you can calculate counts and run an analysis on the active project at any time. Although the dialog is always the same, the preset values depend on the location from which you open it.

 

Invoking

You can invoke the Statistics dialog by choosing Statistics in the Translations pane of Project home, in the Translation documents dialog of the New memoQ project wizard, or in the Operations menu, or by clicking the Statistics icon on the toolbar.

Options

Select scope area:

· Project: Select this radio button to analyse the whole project.
· Selected documents: Select this radio button to analyse only the documents you selected in the Translations pane of Project home.
· Open documents: Select this radio button to analyse only those documents that are currently open.
· Active document: Select this radio button to analyse only the active document.
· From cursor: Select this radio button to analyse only the part of the active document from the cursor downwards. This option is only available if you open the dialog while a translation document is open and active in memoQ.
· Selection: Select this radio button to analyse only the selected segments of the active document. This option is only available if you open the dialog while a translation document is open and active in memoQ.
· Work on views: Select this checkbox to process the views too.
· Show results for each file: Check this check box to get detailed information on every single file, not only aggregate information.
 

Counts area:

· Show counts: Check this check box to display the number of source segments, source words and source characters, and the percentage based on the source word count, for all translatable segments, for repetitions and, if the check box labeled Include locked rows is checked, for all locked segments within the given scope.

Note: If the check boxes labeled Show counts and Status report are both checked, you will also see the number of source segments, source words and source characters, and the percentage based on the source word count, for segments not started, segments pre-translated, segments where fragment search yielded a result, segments edited, and segments confirmed within the given scope.

· Include target counts: Check this check box to also show the number of words and characters on the target side. The results then contain a Target words and a Target chars column.
· Status report: Check this check box to also display the number of source segments, source words and source characters and the source-wordcount based percentage for segments not started, segments pre-translated, segments where fragment search yielded a result, segments edited, and segments confirmed within the given scope.

Note: If the check box labeled Show counts is checked and the check box labeled Status report is unchecked, you will only see the number of source segments, source words and source characters and the source-wordcount based percentage for all translatable segments, repetition and, if the check box labeled Include locked rows is checked, all locked segments within the given scope.

Word counts area:

· memoQ: Check this check box to display memoQ word counts.

Note: In memoQ, similarly to Microsoft® Excel®, every string or character that stands between whitespace characters is counted as a word. Therefore, in memoQ mode a number always counts as a single word, and a hyphenated word such as in-bound also counts as a single word.
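The whitespace rule described in the note can be illustrated with a minimal Python sketch. The function name count_words_whitespace is hypothetical, and the sketch only mirrors the rule as stated here; it is not memoQ's actual counting code, which may treat tags and special characters differently.

```python
def count_words_whitespace(text: str) -> int:
    """Count every whitespace-delimited token as one word."""
    # str.split() with no arguments splits on any run of whitespace
    # and ignores leading and trailing whitespace.
    return len(text.split())


sample = "The in-bound flight leaves at 10 on 2024-05-01"
# The hyphenated word and both numbers each count as a single word.
print(count_words_whitespace(sample))  # 8
```

Under this rule, a count changes only when whitespace is added or removed, not when punctuation or hyphens are.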

· TRADOS-like: Check this check box to display Trados-like word counts. SDL Trados® is another CAT tool on the market that handles word counts differently.

Note: In Trados®, numbers are only counted as words when they are within a segment, and a number of other rules apply. Segmentation is also a factor in the Trados® word count, i.e. you can get a different word count depending on whether the same text appears in one line or in two. Trados® segmentation rules are not public, so there is usually a small discrepancy between the word counts of Trados® and Trados-mode memoQ. In most cases, this discrepancy does not exceed 1.5%. We suggest that you only use Trados-like word counts if your client explicitly requires you to do so.

Analysis area:

· Project TMs and corpora: Check this check box to compare the text against the translation memories and LiveDocs corpora of the project.
· Details by source: Check this check box to see the contribution of each translation memory or LiveDocs corpus of the project to the translation.

Note: Checking this check box can produce very long lists of statistical data.

· Homogeneity: Homogeneity analysis compares each segment against the other segments within the selected scope. This is one of memoQ’s power features. Check this check box to emulate building a translation memory during translation, and to see the savings that result from the internal similarities within the project. With homogeneity, you can see the benefit of your own future contribution to the translation memory, that is, of the segments you will add while translating. You can also estimate the effort the translation will require much more accurately than without homogeneity. If you use the analysis to prepare a quotation, always look at the aggregate results, because they reflect the real productivity gain from using memoQ.
· Create project TM: Check this check box if you want memoQ to collect all segments from the translation memories and LiveDocs corpora that give a result during translation, no matter the quality. Project TMs are subsets of large TMs that are relevant to translation projects.
· Include locked rows: Check this check box to display the number of source segments, source words and source characters and the source-wordcount based percentage for the locked segments within the given scope, if any.
· Include spaces in the character count: memoQ can count characters either with or without white space. Tick this check box if you want to see the character count with spaces; untick it if you want to see it without the spaces.
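The idea behind the Homogeneity option above, emulating a translation memory that grows as you translate, can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name homogeneity_matches is hypothetical, and the similarity score comes from Python's difflib.SequenceMatcher, which stands in for memoQ's own match-rate calculation (not described in this topic).

```python
from difflib import SequenceMatcher


def homogeneity_matches(segments):
    """For each segment, return the best similarity (0-100) to any earlier segment."""
    seen = []      # plays the role of the translation memory being built
    results = []
    for seg in segments:
        best = 0
        for earlier in seen:
            # Stand-in similarity measure; memoQ's real scoring differs.
            score = round(SequenceMatcher(None, earlier, seg).ratio() * 100)
            best = max(best, score)
        results.append(best)
        seen.append(seg)  # the segment is "translated" and joins the memory
    return results


segments = [
    "Press the green button to start.",
    "Press the red button to stop.",
    "Press the green button to start.",
]
print(homogeneity_matches(segments))
```

The first segment has nothing to match against, the second finds a partial match in the first, and the third is an exact repeat of the first. These internal similarities are the savings that a homogeneity analysis makes visible.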

 
Buttons:

· Calculate: Click this button to analyse the texts with the specified settings and once the analysis is done, show the statistics in the results pane. Depending on the number and size of resources, the calculation of statistics can take several minutes.
· Export: Click this button to invoke the Export statistics result dialog, where you can save the result in one of two common file formats, HTML or comma-separated values (CSV), so that you can load it into another application such as Microsoft® Excel® or send it to your client.

The Export statistics result dialog


· HTML (Reflecting displayed results): Select this radio button to save the displayed statistics as an HTML file.
· CSV (Reflecting displayed results): Select this radio button to save the displayed statistics as a CSV file.
· CSV (Per-file, TRADOS-compatible): Select this radio button to save the statistics as a CSV file that contains the relevant data in a one-file-per-row layout, following Trados® logic.
· CSV (Per-file, All information): Select this radio button to save the statistics as a CSV file that contains the relevant data in a one-file-per-row layout, following memoQ logic.

 

Interpreting the results

The output of statistics consists of a Counts area and one or more Analysis areas, depending on the number of translation memories registered in your project and on whether the Show results for each file or the Details by source option is checked.

Scope: This field indicates the scope of the analysis selected in the Select scope area.

Resources: This field indicates the resources against which the results were calculated. Here you find the name of a translation memory, or Homogeneity for a homogeneity analysis. If it’s an aggregate results table, you see the caption Every TM or Every TM, Homogeneity.

Type column

· All: This row indicates the number of all source segments, source words and source characters and the source-wordcount based percentage within the selected scope.
· Repetition: This row indicates the number of repeated source segments, source words and source characters and the source-wordcount based percentage within the selected scope.
 

Note: memoQ’s statistics engine always calculates repetitions within the selected scope. For example, if you have two documents in your project, and each of them contains the same segment exactly once, the statistics calculated for the project scope will show 1 segment as a repetition. If you calculate statistics separately for the two documents, neither will include this segment in the repetitions count.

This difference may be significant if you plan to split a large project between different translators, because the overall statistics for the complete project may show a considerably higher rate of repetitions than the different subsets of documents do on their own.
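The scope effect described in this note can be reproduced with a small Python sketch. The function name count_repetitions and the sample documents are hypothetical, and the sketch treats only exact duplicates as repetitions; it illustrates the counting principle, not memoQ's implementation.

```python
from collections import Counter


def count_repetitions(documents):
    """Count segments that repeat an earlier segment within the given scope."""
    # The scope is the list of documents passed in; every occurrence of a
    # segment after its first one counts as a repetition.
    counts = Counter(seg for doc in documents for seg in doc)
    return sum(n - 1 for n in counts.values())


doc_a = ["Click OK.", "Save the file."]
doc_b = ["Open the file.", "Click OK."]

print(count_repetitions([doc_a, doc_b]))  # project scope: 1
print(count_repetitions([doc_a]))         # first document alone: 0
print(count_repetitions([doc_b]))         # second document alone: 0
```

The shared segment counts as a repetition only when both documents are in the same scope, which is why per-translator subsets can show fewer repetitions than the whole project.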

· Not started: This row indicates the number of not yet edited source segments, source words and source characters and the source-wordcount based percentage within the selected scope.
· Pre-translated: This row indicates the number of pre-translated source segments, source words and source characters and the source-wordcount based percentage within the selected scope.
· Fragments: This row indicates the number of source segments, source words and source characters and the source-wordcount based percentage within the selected scope, where the translation of a long source segment was put together from translations of its smaller parts, especially during pre-translation.
· Edited: This row indicates the number of edited source segments, source words and source characters and the source-wordcount based percentage within the selected scope.
· Confirmed: This row indicates the number of confirmed source segments, source words and source characters and the source-wordcount based percentage within the selected scope.
· Proofread: This row indicates the number of proofread source segments, source words and source characters and the source-wordcount based percentage within the selected scope.
· Locked: This row indicates the number of locked source segments, source words and source characters and the source-wordcount based percentage within the selected scope.
· Percentage ranges: These rows indicate the number of source segments, source words and source characters, and the percentage based on the source word count, within the selected scope, for which the resource named in the Resources field yields a translation memory match within the given percentage range.

Note: For example, if you see 5 after 75-84% when the resource is Every TM and the scope is Project, it means that if you use the combination of all translation memories in the project, you will get a 75-84% hit for five segments in the project.
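The way segments end up in a percentage-range row can be sketched in Python as follows. Only the 75-84% range is named in this topic; the other bands in BANDS, the function names, and the sample match rates are assumptions made for illustration and may not match the rows memoQ actually displays.

```python
from collections import Counter

# Hypothetical bands (low, high, label); only 75-84% is taken from the text above.
BANDS = [
    (100, 100, "100%"),
    (95, 99, "95-99%"),
    (85, 94, "85-94%"),
    (75, 84, "75-84%"),
    (50, 74, "50-74%"),
]


def band_for(match_rate):
    """Return the label of the band that contains the given match rate."""
    for low, high, label in BANDS:
        if low <= match_rate <= high:
            return label
    return "No match"


def tally(match_rates):
    """Count how many segments fall into each band."""
    return Counter(band_for(rate) for rate in match_rates)


# Best match rate per segment against all TMs (made-up values).
rates = [100, 80, 76, 84, 78, 75, 40, 97]
print(tally(rates)["75-84%"])  # 5
```

With these sample values, five segments have a best match between 75% and 84%, which corresponds to the 5 in the example above.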

Segments column: This column indicates the number of source segments, specified by the Type column, within the selected scope.

Source words column: This column indicates the number of source words, specified by the Type column, within the selected scope.

Source chars column: This column indicates the number of source characters, specified by the Type column, within the selected scope. Character counts include whitespace if the check box labeled Include spaces in the character count is checked, but they do not include uninterpreted formatting tags.
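The difference between counting characters with and without spaces (the Include spaces in the character count option) can be shown with a minimal Python sketch. The function name count_chars is hypothetical, and the sketch ignores formatting tags altogether; it is not memoQ's own counting routine.

```python
def count_chars(text: str, include_spaces: bool = True) -> int:
    """Count characters, optionally leaving out all whitespace characters."""
    if include_spaces:
        return len(text)
    return sum(1 for ch in text if not ch.isspace())


segment = "Save the file."
print(count_chars(segment))                        # 14
print(count_chars(segment, include_spaces=False))  # 12
```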

Percent column: This column indicates the ratio of the number of source words in the row, as specified by the Type column, to the number of all source words within the selected scope, expressed as a percentage. The sum of all percentages may not be precisely 100% because of rounding.
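The following Python sketch shows how a percentage based on source word count can be computed and why rounded percentages need not add up to exactly 100%. The function name percent_column, the row labels, and rounding to whole percents are assumptions for illustration; the rows are hypothetical and partition the scope.

```python
def percent_column(word_counts):
    """Return each row's share of all source words, rounded to a whole percent."""
    total = sum(word_counts.values())
    if total == 0:
        # Nothing to analyse: avoid division by zero.
        return {row: 0 for row in word_counts}
    return {row: round(100 * words / total) for row, words in word_counts.items()}


# Three hypothetical rows that together cover every source word in the scope.
rows = {"Not started": 100, "Edited": 100, "Confirmed": 100}
percentages = percent_column(rows)
print(percentages)                # each row is rounded to 33
print(sum(percentages.values()))  # 99, not 100
```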

Target words column: This column indicates the number of target words, specified by the Type column, within the selected scope. This column appears only if the check box labeled Include target counts is checked.

Target chars column: This column indicates the number of target characters, specified by the Type column, within the selected scope. This column appears only if the check box labeled Include target counts is checked.

Navigation

Click Close to leave the dialog.
