ISO/DIS 18968:2025(en)
ISO/TC 37/WG 12
Secretariat: SAC
Date: 2025-07-23
Translation-oriented writing — Text production and text evaluation
© ISO 2025
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents
4 Recommendations for specialized content intended for translation
4.2.2 Formatting in terms of text flow and segmentation
4.2.3 Formatting graphic layout of texts
4.2.4 Formatting text content and references
4.3.3 Uncommon or unknown abbreviations
4.3.4 Orthographic variants
4.3.6 Compound terms and constructions
4.3.7 Assuring correct use of terminology
4.4 Grammar, syntax and style
4.4.3 Word choice and word formation
4.4.4 Unambiguous references
4.4.5 Style and reader engagement
4.4.6 Gender-sensitive language
4.5 Presentation of content
5 Recommendations for the handover from text production to translation
5.4 Contextual information, reference material and terminology aids
5.5 Locales, document templates and styles
5.6 Lists, indices and glossaries
5.7 Reviewing source language content prior to providing it to the translation service provider
6 Translation-oriented texts: Evaluation
Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies (ISO member bodies). The work of preparing International Standards is normally carried out through ISO technical committees. Each member body interested in a subject for which a technical committee has been established has the right to be represented on that committee. International organizations, governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types of ISO documents should be noted. This document was drafted in accordance with the editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. ISO shall not be held responsible for identifying any or all such patent rights. Details of any patent rights identified during the development of the document will be in the Introduction and/or on the ISO list of patent declarations received (see www.iso.org/patents).
Any trade name used in this document is information given for the convenience of users and does not constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions related to conformity assessment, as well as information about ISO’s adherence to the World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT), see www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 37, Language and terminology.
Any feedback or questions on this document should be directed to the user’s national standards body. A complete listing of these bodies can be found at www.iso.org/members.html.
Introduction
0.1 Overview
Because goods and services today are often marketed worldwide, specialized content must be available in the languages used in each market. To this end, content is translated by means of human translation (with or without computer-aided translation (CAT) tools) or machine translation (MT, with or without subsequent editing by a human translator, a process known as post-editing). In this context, translation-oriented writing can cut down on the number of queries from translators, reduce costs, avoid translation errors, and realize shorter delivery times, especially when translation technology is used.
Technical documentation in particular (e.g. product manuals, online help, safety data sheets or assembly instructions) as a form of specialized content is intended to ensure that products are used safely, efficiently and effectively. Standards such as IEC/IEEE 82079‑1 define the requirements for technical documentation. In-house style guides supplement these requirements by codifying enterprise-specific writing conventions.
This document provides recommendations for authoring or editing specialized content in the source language intended for translation into one or more target languages and for the evaluation of translation-oriented texts. Some clauses can, however, also be useful to non-specialized texts intended for translation.
Attention is drawn to the possibility that some of the elements of this document may be subject to patent rights other than those in the patent database. ISO shall not be held responsible for identifying any or all such patent rights.
0.2 Notations
This document uses structured examples to illustrate textual features to avoid because they can cause problems in translation. The examples in this document have been chosen for illustrative purposes and may not be equally applicable to every target language. Translation of the standard into other languages can necessitate the selection of other examples to illustrate the issues discussed.
The examples are each laid out and formatted to best support the recommendations in question. Due to the variety of issues covered, the layout varies throughout the document. However, an attempt has been made to use as consistent a layout as possible.
— Every example contains at least source language content and an explanation. The explanation points out any issues with the source language content and their possible effect on translation. Target language examples are provided only when they provide added value for the explanation and understanding of the issue at hand. The explanation gives information on errors in the target language content that would otherwise not be understood due to readers’ lack of proficiency in the target language.
— Thick lines are used to separate information types.
— The symbol (+) is used for positive examples and the symbol (–) is used for negative examples.
— Numbers in parentheses such as (1), (2), (3) indicate possible variants of target language content.
— Characters in parentheses are only used to specify the example type (positive, negative, variant). They do not represent a part of the source or target language content displayed.
— No specific symbol is used for neutral examples.
— The source language of examples is always English; therefore, the source language is not indicated.
— The target languages of examples vary, but are mainly German, French and Italian. The target language is indicated in square brackets in the column or line marking the target language content. Italian target language content, for example, is marked with [it] in accordance with ISO 639.
EXAMPLE 1
Source language content | Target language content [it] |
(–) A negative example in English to highlight source language content issues. | (–) A negative example (in Italian) to highlight possible translation errors resulting from the issues in the source language content. |
(+) A positive example in English of translation-oriented writing avoiding the issues of the negative example. | (+) A positive example (in Italian) containing no translation error. |
Explanation: An explanation pointing out the issues with the source language content and their possible effect on translation. | |
EXAMPLE 2
Source language content | Target language content [fr] |
(–) A negative example in English to highlight source language content issues. | (1) One possible translation (in French) to highlight possible translation variants. (2) Another possible translation (in French) to highlight possible translation variants. |
(+) A positive example in English of translation-oriented writing avoiding the issues of the negative example. | (+) A positive example (in French), containing unambiguous target language content. |
Explanation: An explanation pointing out the issues with the source language content and their possible effect on translation. | |
Examples in clause 4.2 are more complex than other examples, since they display the effects of formatting in the text editor on translation with CAT tools. In addition, segments can be displayed for the target language content if segmentation is relevant.
EXAMPLE 3
Source language content in text editor | (–) A negative example in English to highlight source language content formatting errors as they would appear in a text editor. (+) A positive example in English, containing no formatting errors. |
|---|---|
Source language content in TM system | Source language content examples as they would appear in a TM system during translation. (–) A negative example to highlight the issues CAT tools have with incorrect formatting. (+) A positive example to show how correct formatting avoids negative effects on translation using CAT tools. |
Target language content in TM system [de] | (–) A negative example (in German) to highlight possible translation errors in the target language content in the TM system resulting from the formatting errors in the source language content. (+) A positive example (in German) of correct target language content in the TM system resulting from correctly formatted source language content. |
Target language content in text editor [de] | (–) A negative example (in German) to highlight possible translation errors in the target language content in the text editor resulting from the formatting errors in the source language content. (+) A positive example (in German) of correct target language content in the text editor resulting from correctly formatted source language content. |
Explanation | An explanation pointing out the formatting issues with the source language content and their possible effect on translation with CAT tools. |
Translation-oriented writing — Text production and text evaluation
1 Scope
This document provides recommendations for authoring, editing and evaluating specialized source language content intended for translation. In addition, this document addresses handover recommendations in connection with the production and translation of specialized content. This document is meant as practical guidance and recognizes that not all clauses are equally applicable to every use case.
This document is intended for authors and editors of specialized content intended for translation. It also enables translators and translation service providers to assess the suitability of specialized content for translation. This document can also be used by tool providers, for example to develop and improve automatic source language testing and verification procedures.
This document does not provide recommendations for authoring, editing and evaluating fictional, journalistic, advertising and other non-specialized content.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content constitutes requirements of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.
ISO 20539, Translation, interpreting and related technology — Vocabulary
3 Terms and definitions
For the purposes of this document, the terms and definitions given in ISO 20539 (https://www.iso.org/obp/ui/en/#iso:std:iso:20539) and the following apply.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
3.1
automatic line break
software feature that automatically adapts the line break of digital texts (3.13) to the layout and creates an approximately uniform line length throughout the text
3.2
exchange format
machine-readable format for representing information that is intended to facilitate exchange of the information between different applications
[SOURCE: ISO 25964-1:2011, 2.19; modified: Note 1 to entry deleted]
3.3
match value
match, expressed as a percentage, between a segment in the source text (3.11) and the corresponding segment found in a translation memory or other sources
3.4
hard break
hard line break
hard return
keyboard shortcut interpreted by text editors (3.14) as the end of a paragraph
3.5
language checker
language checking tool
software that supports users in creating orthographically and grammatically correct, consistent, comprehensible and corporate-language-compliant content
3.6
language identifier
language symbol
string of characters assigned to an individual language or a language group for the purpose of identifying it unequivocally
Note 1 to entry: In the ISO 639 language code, the string of characters consists of a string of letters.
EXAMPLE: The individual language “Dutch” is assigned the two-letter language identifier “nl”, a three-letter identifier “nld” for use in the field of terminology and other language applications, and another three-letter identifier “dut” for use in the field of librarianship and documentation. The individual language “Polish” is assigned the two-letter language identifier “pl” and the three-letter identifier “pol”. The language group “Khoisan languages” is assigned the three-letter language identifier “khi”.
[SOURCE: ISO 639:2023, 3.7.10; modified: Note 2 to entry deleted]
3.7
segment end
segment boundary
the punctuation or other content used to identify the end of a segment
3.8
segment pair
translation unit
TU
segment of source language content matched with its corresponding translated content
3.9
segmentation
process of splitting a text into segments
3.10
soft break
soft return
keyboard shortcut which forces a new line, but neither interrupts the paragraph formatting nor the segmentation (3.9)
3.11
source text
text (3.13) to be translated into one or more target languages
3.12
target text
translated text (3.13) written in the intended target language
3.13
text
content in written form
[SOURCE: ISO 17100:2015, 2.3.4]
3.14
text editor
software that enables a user to create and revise text (3.13)
[SOURCE: ISO/IEC 2382:2015, 2126196; modified: Notes 1 and 2 to entry deleted]
3.15
termbase
terminology database
terminological database
database comprising a terminological data collection
[SOURCE: ISO 26162-3:2023, 3.11]
3.16
TM system
translation memory system
CAT tool that uses a translation memory
3.17
translation-oriented writing
writing for translation
text (3.13) production resulting in content that lends itself to translation
3.18
translation project specifications
set of agreed upon and defined requirements for producing translation output
Note 1 to entry: ISO 17100:2015, Annex B, lists a set of sample project specifications.
Note 2 to entry: For detailed information on developing project specifications, see ISO 11669:2024 Clauses 4 and 5 and Annexes B and C.
[SOURCE: ISO 5060:2024, 3.3.2; modified: Note 2 to entry added]
3.19
XLIFF
XML Localization Interchange File Format
XML file format for the interchange of translation data, localization data, or both
Note 1 to entry: For more information on XLIFF, see ISO 21720.
4 Recommendations for specialized content intended for translation
4.1 General
Specialized content must fulfil its intended communicative function by having a specific purpose and addressing one or more target audiences.
Specialized content is characterized by certain features such as:
— credibility and relevance;
— appropriateness and correctness;
— formal grammatical cohesion and functional semantic coherence of meaning to ensure that the content is comprehensible in a given language;
— correct, consistent and domain-appropriate terminology;
— due consideration of the specific communicative situation between author and target audience;
— relationship with other specialized content;
— clear document structure.
Additional consideration should be given to any specialized content intended for translation. On the formal side, the fact that specialized content will be processed using a computer is critical. Prior to translation, specialized source language content is usually imported into translation memory systems (TM systems). See 4.2.1 for additional detail. On the content side, both country-specific and culture-specific elements should be avoided, as these kinds of elements limit the international applicability of specialized content. See 4.2 to 4.5 for additional detail.
4.2 Formatting
4.2.1 General
Most specialized source language content is translated using CAT tools. TM systems divide texts into individual segments, matching source and target segment pairs and storing them for future retrieval in a translation memory (TM). In order to prevent incorrect segmentation or the need for manual interference in the TM system’s pre-set segmentation, such as merging and splitting of segments, it is imperative that texts be formatted correctly. For example, incorrectly placed line breaks and the like can result in incorrect segmentation of the source text. Incorrect segmentation of the source text will cause the formation of inappropriate segment pairs in the TM and will therefore render them unusable for future translation. If such inappropriate segment pairs are accepted without validation, they can even cause translation errors. To avoid unpredictable and irregular formatting results, the built-in text editor formatting and style functions should be used as opposed to manual formatting. Failure to use built-in style features can result in unnecessary corrective work, which in turn demands extra time and effort, and entails extra cost, cutting into overall translation process efficiency.
The text editor should be set to display paragraph marks and formatting symbols so that the formatting is visible. When these marks and symbols are visible, unnecessary, incorrect or inconsistent formatting can be identified and removed prior to translation.
When a source text is drafted, it should also be borne in mind that the text can expand or contract in length when translated into other languages and that extra space should therefore be included in the layout, where necessary. Because of the expansion and contraction factor, breaks, tabs or other manual interventions intended to optimise the source-language layout can also be rendered in completely different positions in the target language.
4.2.2 Formatting in terms of text flow and segmentation
Hard breaks
Recommendation: Hard breaks should not be manually inserted within sentences.
Breaks are used to influence where the line ends. Software applications are designed to automatically break the line as close to the margin as possible at or toward the end of the defined line. Line breaks can include a blank space (e.g. space character), an existing hyphen, manually predetermined hyphenation (soft hyphen, zero-width space), or predetermined, automatic hyphenation.
When automatic line breaks are used, the text is displayed across several lines in the text editor, but will not be split into several segments by the TM system. However, manually inserted breaks, especially hard breaks, can be interpreted by segmentation utilities as segment breaks, which has a negative impact on segmentation in CAT tools and can result in translation errors.
A “hard break” or “hard return” (symbol: ¶) is an end-of-line marker interpreted by text editors as the end of a paragraph.
A hard break always marks a segment boundary for automatic processes. Inappropriately placed hard breaks cause incorrect segment pairs to be formed in the TM system and require unnecessary corrective work, such as merging segments manually. Especially where the sentence structure in the target language requires a different word order than in the source language, a hard break can cause the source language and target language content to misalign if the segmentation is not corrected manually.
EXAMPLE 1
Source language content in text editor | (–) Large distances and thick walls reduce the coverage of the radio¶ signal. | ||
|---|---|---|---|
Source language content in TM system | Segment 1 | Large distances and thick walls reduce the coverage of the radio | |
Segment 2 | signal. | ||
Target language content in TM system [de] | Segment 1 | Große Entfernungen und dicke Wände reduzieren die Reichweite des | |
Segment 2 | Funksignals. | ||
Source language content in text editor | (+) Large distances and thick walls reduce the coverage of the radio signal. | ||
Source language content in TM system | Segment 1 | Large distances and thick walls reduce the coverage of the radio signal. | |
Target language content in TM system [de] | Segment 1 | Große Entfernungen und dicke Wände reduzieren die Reichweite des Funksignals. | |
Explanation | The two segments created by the hard break contain two incomplete text fragments: the first segment is lacking an end-of-sentence character, and the second segment starts with a lowercase letter and contains no verb. — This makes it more difficult (for humans and computers) to interpret the source language content as one message in one sentence. — This suggests an inappropriate segmentation for CAT tools. — This divides the multi-word term “radio signal” into two separate components. This division can prevent language technology processes from working correctly, such as machine translation or terminology look-up during translation, term extraction or named entity recognition. The sentence without hard breaks conveys one message, produces no segmentation errors and displays the multi-word term correctly. | ||
EXAMPLE 2
Source language content in text editor | (–) Switch off the warning tone using¶ the alarm button. | ||
|---|---|---|---|
Source language content in TM system | Segment 1 | Switch off the warning tone using | |
Segment 2 | the alarm button. | ||
Target language content in TM system [de] |
| Variant (1) | Variant (2) |
Segment 1 | Den Warnton ausschalten unter Verwendung | Den Warnton mit | |
Segment 2 | der Alarmtaste. | der Alarmtaste ausschalten. | |
Explanation | In this case, the translator has no good option to translate the source text split into two segments into the target language. Variant (1): Variant (2): | ||
EXAMPLE 3
Source language content in text editor | (–) Switch off the warning tone using¶ the alarm button. | After 10 seconds the display is switched off.¶ | |
|---|---|---|---|
Source language content in TM system | Segment 1 | Switch off the warning tone using | |
Segment 2 | After 10 seconds the display is switched off. | ||
Segment 3 | the alarm button. | ||
Explanation | The order of segments in the TM system is not always the same as in the text editor. Finding the matching segments takes time and is a potential source of errors. | ||
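The effect of a hard break on segmentation can be sketched in a few lines of Python. The segmenter below is a hypothetical simplification and not the rule set of any particular TM system: it assumes that a newline character stands in for the hard break (¶) and that a segment ends at a hard break or after a full stop, exclamation mark, question mark or colon followed by white space.

```python
import re

# Assumption: segments end after . ! ? or : when followed by white space.
SEGMENT_END = re.compile(r"(?<=[.!?:])\s+")


def segment(text: str) -> list[str]:
    """Split text at hard breaks first, then at segment-end punctuation."""
    segments = []
    for paragraph in text.split("\n"):  # "\n" stands in for a hard break
        for part in SEGMENT_END.split(paragraph.strip()):
            if part:
                segments.append(part)
    return segments


# Sentences taken from EXAMPLE 1 of this subclause.
with_hard_break = (
    "Large distances and thick walls reduce the coverage of the radio\nsignal."
)
without_hard_break = (
    "Large distances and thick walls reduce the coverage of the radio signal."
)

print(segment(with_hard_break))     # two incomplete fragments
print(segment(without_hard_break))  # one complete sentence
```

With the hard break, the sketch returns two fragments and separates the multi-word term “radio signal”. Without it, the sentence stays in one segment.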
Soft breaks
Recommendations:
— Soft breaks should be avoided in the middle of a sentence if they serve no purpose other than for layout.
— Soft breaks should be avoided at the end of a sentence.
A “soft break” or “soft return” (symbol: ↵) forces a new line, but does not interrupt paragraph formatting or segmentation. Soft breaks are therefore preferable to a hard break within a sentence.
However, the soft break is also stored in the TM and thus reduces the match value of the segment. The probability that the sentence will occur in exactly the same way, i.e. with the line break in exactly the same place in another text, is low, and an adjustment by the translator will therefore be necessary.
Because the target text and source text can vary in length, it is often necessary to remove or move the soft breaks in the target language content to keep the layout as it is in the source language content.
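The drop in match value caused by a stored soft break can be illustrated with a rough similarity measure. The sketch below is an assumption for illustration only: it uses Python's standard `difflib.SequenceMatcher` as a stand-in for a TM system's matching algorithm, which is usually proprietary and scores differently, and it represents the soft break with the Unicode line separator U+2028.

```python
from difflib import SequenceMatcher

SOFT_BREAK = "\u2028"  # assumption: stand-in character for a soft break


def match_value(new_segment: str, stored_segment: str) -> float:
    """Return an illustrative similarity percentage between two segments."""
    return 100 * SequenceMatcher(None, new_segment, stored_segment).ratio()


new_segment = "Switch off the warning tone using the alarm button."
stored_clean = "Switch off the warning tone using the alarm button."
stored_with_break = (
    "Switch off the warning tone using" + SOFT_BREAK + "the alarm button."
)

print(match_value(new_segment, stored_clean))       # identical segments
print(match_value(new_segment, stored_with_break))  # below an exact match
```

Even though the wording is identical, the stored soft break prevents an exact match, so the translator has to check and adjust the segment.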
EXAMPLE 1
Since soft breaks do not interrupt segmentation, they should not be used to separate entire source language content into different lines. Instead, a hard break should be used.
EXAMPLE 2
Source language content in text editor | (–) E-mail: info@example.com↵ Office hours: Monday to Friday from 8:00 a.m. to 6:30 p.m. | |
|---|---|---|
Source language content in TM system | Segment 1 | E-mail: |
Segment 2 | info@example.com↵Office hours: | |
Segment 3 | Monday to Friday from 8:00 a.m. to 6:30 p.m. | |
Source language content in text editor | (+) E-mail: info@example.com¶ Office hours: Monday to Friday from 8:00 a.m. to 6:30 p.m. | |
Source language content in TM system | Segment 1 | E-mail: |
Segment 2 | info@example.com | |
Segment 3 | Office hours: | |
Segment 4 | Monday to Friday from 8:00 a.m. to 6:30 p.m. | |
Explanation | In the negative example, the soft return after the e-mail address causes parts of the two lines to appear in one segment because there is no segment end marker (e.g. full stop) following the e-mail address. In the TM system, both the e-mail address and all content before the next segment end marker (colon) of the next line appear in one segment. This reduces the value of the segment for later reuse. If both lines are separated by a hard break as in the positive example, then the segments are separated correctly with respect to their status as information units. | |
Manual hyphenation
Recommendations:
— Manually inserting hyphens in order to influence layout should be avoided. Instead, the automatic hyphenation feature should be used.
— Non-breaking hyphens should be used instead of normal hyphens in order to prevent an unwanted line break after a hyphen.
Hyphens are used to join words and to separate syllables of a single word. “Automatic hyphenation” is a software feature that automatically hyphenates words at the end of lines to improve the document’s appearance.
When automatic hyphenation is used, the hyphens shown in the text editor at the end of the line do not appear in the TM system or in the TM as hyphens. However, manually inserting hyphens to influence hyphenation can lead to issues in the TM system.
Additional hyphens manually inserted for layout purposes, e.g. to change where the line breaks automatically, do not influence segmentation, but will be imported into the TM system as regular characters.
EXAMPLE 1
Source language content in text editor | (–) The economics and history departments at the University are offering an interdisci- plinary seminar on Asia. |
|---|---|
Source language content in TM system | The economics and history departments at the University are offering an interdisci-plinary seminar on Asia. |
Target language content in TM system [fr] | Les départements d’économie et d’histoire de l’Université proposent un séminaire interdisciplinaire sur l’Asie. |
Explanation | The hyphen inserted manually for layout reasons is displayed as a hyphen in the middle of the word “interdisciplinary” in the source segment. It does not affect the target language content, but if the source text is later updated and the hyphen deleted, the TM match will be reduced to a fuzzy match in a subsequent translation. Also, if “interdisciplinary” and its translations are saved in the termbase, term recognition will likely not work for “interdisci-plinary”. This can lead to incorrect or inconsistent use of terminology in the target language content. |
In order to keep the source text within the TM clean and reusable, the automatic hyphenation feature of the text editor should be used instead of manual hyphenation. Automatic hyphens will be displayed in the text editor, but will not be imported into the TM system as regular characters.
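The term recognition problem described in EXAMPLE 1 can be reproduced with a minimal lookup. The one-entry termbase and the `find_terms` helper below are hypothetical and far simpler than the term recognition in a real TM system. They only show why a manually inserted hyphen, imported as a regular character, stops an exact lookup from finding the term.

```python
import re

# Hypothetical one-entry termbase (English term -> French equivalent).
TERMBASE = {"interdisciplinary": "interdisciplinaire"}


def find_terms(segment: str) -> list[str]:
    """Return the termbase entries that occur as whole words in the segment."""
    words = set(re.findall(r"[\w-]+", segment.lower()))
    return [term for term in TERMBASE if term in words]


manual_hyphen = (
    "The economics and history departments at the University are offering "
    "an interdisci-plinary seminar on Asia."
)
automatic_hyphenation = (
    "The economics and history departments at the University are offering "
    "an interdisciplinary seminar on Asia."
)

print(find_terms(manual_hyphen))          # term is not recognized
print(find_terms(automatic_hyphenation))  # term is recognized
```

Because automatic hyphens are not imported as characters, the second segment reaches the lookup unchanged and the term is found.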
For certain hyphenated words or units of words separated by hyphens, it will make sense to avoid automatic line breaks. Disabling automatic line breaks can facilitate reading fluency and prevent related units from being placed on different lines.
Where a line break after a hyphen is unwanted, a so-called “non-breaking hyphen” or “no-break hyphen” can be inserted instead of a normal hyphen. Non-breaking hyphens should be used where the hyphenated part of the character string carried forward to the next line would be very short (e.g. single letters, abbreviations or numbers) or within certain formulas, such as those used in chemistry.
Non-breaking hyphens are usually displayed as such in the TM system and can be used also in the target language content, if applicable.
EXAMPLE 2
Source language content in text editor | (–) The teacher has provided the students with information sheets on how to calculate the k- factor. (+) The teacher has provided the students with information sheets on how to calculate the k‑factor. |
|---|---|
Source language content in TM system | The teacher has provided the students with information sheets on how to calculate the k-factor. The teacher has provided the students with information sheets on how to calculate the k[-]factor. |
Target language content in text editor [de] | (–) Die Lehrkraft hat den Studierenden Informationsblätter bereitgestellt, mit denen sie den k-Faktor berechnen können. (+) Die Lehrkraft hat den Studierenden Informationsblätter bereitgestellt, mit denen sie den k‑Faktor berechnen können. |
Explanation | The non-breaking hyphen in the word “k-factor” prevents the automatic line break after “k-”. Instead, the whole expression is moved to the next line. The same applies to the target language content if the non-breaking hyphen is used in the translation as well. |
Lists
Recommendation: Lists in the middle of a sentence should be avoided.
Where a sentence begins before and continues after a list, segments in many languages fail to align properly and misaligned segment pairs of no use for future translations will be stored in the TM. Misaligned segment pairs in the TM can lead to translation errors when reused without checking.
EXAMPLE
Source language content in text editor | (–) When mounting the holding plate — bolts, — fasteners, and — glue must be used. | |
|---|---|---|
Source language content in TM system | Segment 1 | When mounting the holding plate |
Segment 2 | bolts, | |
Segment 3 | fasteners, and | |
Segment 4 | glue | |
Segment 5 | must be used. | |
Target language content in TM system [fr] | Segment 1 | Lors du montage de la plaque de maintien, |
Segment 2 | des boulons, | |
Segment 3 | des fixations, et | |
Segment 4 | de la colle | |
Segment 5 | doivent être utilisés. | |
Explanation | In some languages, the target language content can be structured in the same way. However, in this French example, the verb “doivent” must be plural, because the list contains several items. The segment pair stored in the TM is “must be used – doivent être utilisés”. “must” can be used with singular and plural parts of speech, but “doivent” can only be used in a plural context. Additionally, “utilisés” will likely have to be inflected depending on the parts of speech to which it refers. | |
For more information and another example on lists inside sentences, see 4.4.2.5.
Tabs
Recommendation: Tabs should be avoided within sentences or lines of text when inserted for layout purposes. Instead, built-in text editor formatting features (columns, tables, indents, numbering, bullets, etc.) should be used.
Tabs inserted manually for layout purposes, e.g. to change where the line breaks automatically, do not influence segmentation, but will be imported into the TM system.
Also, because source text and target text can vary in length, the tabs can appear in a completely different position within the target text and fail to fulfil their intended purpose.
EXAMPLE
Source language content in text editor | (–) When changing tyres, make sure that your vehicle is on a level, stable surface, → → → use the jack correctly and tighten the wheel nuts securely. |
|---|---|
Source language content in TM system | When changing tyres, make sure that your vehicle is on a level, stable surface, → → → use the jack correctly and tighten the wheel nuts securely. |
Target language content in TM system [de] | Achten Sie beim Reifenwechsel darauf, dass Ihr Fahrzeug auf einer ebenen, stabilen Fläche steht, → → → verwenden Sie den Wagenheber richtig und ziehen Sie die Radmuttern fest an. |
Target language content in text editor [de] | (–) Achten Sie beim Reifenwechsel darauf, dass Ihr Fahrzeug auf einer ebenen, stabilen Fläche steht, → → → verwenden Sie den Wagenheber richtig und ziehen Sie die Radmuttern fest an. |
Explanation | The tabs in the source language content in the text editor help to increase readability by moving the next task to a new line. However, they will cause unwanted gaps in the target language content in the text editor due to the different text lengths of German and English content. |
Spaces
Recommendations:
— Spaces should not be inserted for layout purposes. Instead, built-in text editor formatting features (columns, tables, indents, numbering, bullets, etc.) should be used.
— Non-breaking spaces should be used instead of normal spaces in order to prevent unwanted line breaks after a space.
— Additional spaces should be avoided following full stops (within acronyms and initialisms or dates).
Additional spaces manually inserted for layout purposes, e.g. to change where the line breaks automatically, do not influence segmentation, but will be imported into the TM system as a regular character.
Also, source text and target text can vary in length, or the order of components in content can move around due to different text conventions in the target text. Therefore, the spaces can appear in a completely different position within the target text and fail to fulfil their intended purpose.
EXAMPLE 1
Source language content in text editor | (–) When changing tyres, make sure that your vehicle is on a level, stable surface,························ |
|---|---|
Source language content in TM system | When changing tyres, make sure that your vehicle is on a level, stable surface,························use the jack correctly and tighten the wheel nuts securely. |
Target language content in TM system [fr] | Lorsque vous changez les pneus, assurez-vous que votre véhicule se trouve sur une surface plane et stable, ························utilisez correctement le cric et serrez bien les écrous de roue. |
Target language content in text editor | (–) Lorsque vous changez les pneus, assurez-vous que votre véhicule se trouve sur une surface plane et stable, ························utilisez correctement le cric et serrez bien les écrous de roue. |
Explanation | The spaces in the source language content in the text editor were perhaps intended to increase the readability by moving the next task to a new line. However, they will cause unwanted gaps in the target language content in the text editor due to the different text lengths for French and English. |
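
The tab and space recommendations above can be supported by a simple automated check before handover. The following is a minimal sketch, assuming that any tab, or any run of two or more whitespace characters, between two visible characters counts as manual layout. The function name and the rule itself are illustrative assumptions, not part of any particular text editor or TM system.

```python
import re

# Assumption of this sketch: a tab, or two or more spaces/tabs in a row,
# placed between visible characters is manual layout, not content.
LAYOUT_WHITESPACE = re.compile(r"(?<=\S)(?:[ \t]{2,}|\t)(?=\S)")

def find_layout_whitespace(line: str) -> list[tuple[int, str]]:
    """Return (position, whitespace run) for each suspicious run in a line."""
    return [(m.start(), m.group()) for m in LAYOUT_WHITESPACE.finditer(line)]

if __name__ == "__main__":
    sample = "stable surface," + " " * 6 + "use the jack correctly"
    for position, run in find_layout_whitespace(sample):
        print(f"layout whitespace at column {position}: {len(run)} characters")
```

A check of this kind only flags candidates; whether a flagged run is really layout remains a decision for the author.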
For certain lexical units separated by spaces, it will make sense to avoid automatic line breaks. Disabling automatic line breaks can facilitate reading fluency and prevent related units from being placed on different lines.
Where a line break after a space is unwanted, a so-called “non-breaking space” or “hard space” can be inserted instead of a normal space. Non-breaking spaces should be used when the text fragments separated by a space belong together and, for better readability or understanding, should not be separated across two lines, e.g. between a title and a personal name (Ms Meyer, Dr Miller), between numeric values and units (44 mm, 5 %, 66 min), with special characters (3 + 5, ≤ 3), with specifications of time and dates (14th century, 30 June 2020), with document names or product names (ISO 17100) and in other cases (Version 3, Fig. 23, p. 18).
Non-breaking spaces are usually displayed as such in the TM system and can be used also in the target language content, if applicable.
EXAMPLE 2
Source language content in text editor | (–) The professor has provided the students with detailed information on how King Louis XIV died. (+) The professor has provided the students with detailed information on how King Louis°XIV died. |
|---|---|
Source language content in TM system | The professor has provided the students with detailed information on how King Louis XIV died. The professor has provided the students with detailed information on how King Louis[°]XIV died. |
Target language content in text editor [fr] | (–) Le professeur a fourni aux étudiants des informations détaillées sur la mort du roi Louis XIV. (+) Le professeur a fourni aux étudiants des informations détaillées sur la mort du roi Louis°XIV. |
Explanation | The non-breaking space in the name “Louis XIV” prevents the automatic line break after “Louis”. Instead, the whole proper name is moved to the next line. The same applies to the target language content if the non-breaking space is used in the target language as well. |
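
Some of the cases listed above follow regular patterns, so a non-breaking space can be inserted programmatically. The following sketch assumes Unicode text and uses U+00A0 (NO-BREAK SPACE). The two patterns cover only a few illustrative units and titles and are assumptions of this example, not an exhaustive rule set.

```python
import re

NBSP = "\u00a0"  # Unicode NO-BREAK SPACE

# Illustrative patterns only: digit + unit, and title + capitalised name.
UNIT = re.compile(r"(\d) (mm|cm|min|%)(?!\w)")
TITLE = re.compile(r"\b(Ms|Dr) (?=[A-Z])")

def protect_spaces(text: str) -> str:
    """Replace the normal space with a non-breaking space in known patterns."""
    text = UNIT.sub(lambda m: m.group(1) + NBSP + m.group(2), text)
    return TITLE.sub(lambda m: m.group(1) + NBSP, text)

if __name__ == "__main__":
    print(repr(protect_spaces("Dr Miller measured 44 mm in 66 min.")))
```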
Especially when a date or other content (acronyms, initialisms, numbers, etc.) contains full stops, no additional space should be inserted after the full stop since it will lead to unwanted segmentation.
EXAMPLE 3
Source language content in text editor | (–) Since 01.11. 2022, the API supports OAuth 2. 0 for authentication. | |
|---|---|---|
Source language content in TM system | Segment 1 | Since 01.11. |
Segment 2 | 2022, the API supports OAuth 2. | |
Segment 3 | 0 for authentication. | |
Source language content in text editor | (+) Since 01.11.2022, the API supports OAuth 2.0 for authentication. | |
Source language content in TM system | Segment 1 | Since 01.11.2022, the API supports OAuth 2.0 for authentication. |
Explanation | If a full stop is used in combination with text, it is not interpreted as an end of sentence (as in www.iso.org, 20,000.00 €, etc.). The full stops within the date “01.11. 2022” and the version “2. 0” are interpreted by the TM system as an end of sentence, because they are followed by a space. In the positive example, both the date and the version are written without spaces interrupting them and no unwanted segmentation occurs. | |
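
The segmentation behaviour described in this example can be reproduced with a deliberately naive rule: split after every full stop that is followed by whitespace. This is a simplified sketch, not the rule set of any specific TM system; real tools use configurable rules with exceptions.

```python
import re

def segment(text: str) -> list[str]:
    """Split after every full stop that is followed by whitespace."""
    return [part for part in re.split(r"(?<=\.)\s+", text) if part]

if __name__ == "__main__":
    negative = "Since 01.11. 2022, the API supports OAuth 2. 0 for authentication."
    positive = "Since 01.11.2022, the API supports OAuth 2.0 for authentication."
    print(segment(negative))  # three segments
    print(segment(positive))  # one segment
```

With this rule, the full stops inside “01.11.2022” and “2.0” are harmless because no space follows them.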
Not all languages use single space characters. For example, in the writing structure and form of Persian graphemes, spacing between words, whether simple, derivative or compound, is crucial for avoiding misinterpretation and semantic ambiguity and for providing morphological clarity. In Persian, one of the right-to-left languages, the half-space character plays a significant role in word structures where full space characters would lead to inappropriate interpretations.
Generally, the most common reasons for using the half-space character in the Persian writing system are as follows:
a) maintaining lexical coherence in affixations;
b) maintaining lexical coherence in inverted genitive compounds;
c) maintaining lexical coherence in adjective compounds;
d) substituting for a hyphen;
e) preserving the aesthetic and formal aspects of the written form.
EXAMPLE 4
Source language content (en) | Target language content [fa] |
(+) insensitivity | (+) بیحساسیتی (–)بی حساسیتی |
(+) data warehouse | (+) دادهانبار (–) داده انبار |
(+) wide body (aircraft) | (+)(هواگرد) پهنپیکر (–)(هواگرد) پهن پیکر |
(+) three-point landing | (+) نشست سهنقطه (–)نشست سه نقطه |
(+) preconditioning | (+)پیششرطیشدگی (–)پیششرطیشدگی/ پیششرطی شدگی |
Explanation: The half space is more prominent when it is a meaning-distinguishing element. In some cases, using a full space instead does not change the meaning but is inaccurate in the written form. | |
Full stops following abbreviations
Recommendation: Abbreviations using full stops should be avoided.
A full stop is used to mark the end of a sentence, at the end of word abbreviations, in acronyms and initialisms or in a series as an ellipsis. Full stops can be interpreted as punctuation marks and lead to unwanted segmentation when translation is performed using CAT tools.
Using unusual abbreviations or spelling abbreviations incorrectly can cause a sentence to be divided into several segments. This may require manual editing after translation or merging of segments during translation.
Established abbreviations (etc., e.g., i.e.) are an exception, as these are usually automatically recognised by text editors and the segmentation rules used in TM systems.
Abbreviated words can also be misinterpreted and mistranslated.
EXAMPLE
Source language content in text editor | (–) The spinner is constructed from softer grades of alumin. to reduce the tendency to crack in service. | |
|---|---|---|
Source language content in TM system | Segment 1 | The spinner is constructed from softer grades of alumin. |
Segment 2 | to reduce the tendency to crack in service. | |
Target language content in TM system [ja] | Segment 1 | スピナーは、使用中にクラックが発生しやすくなるのを抑えるため、 |
Segment 2 | より柔軟性の高いアルミニウムで構成されている。 | |
Source language content in text editor | (+) The spinner is constructed from softer grades of aluminium to reduce the tendency to crack in service. | |
Source language content in TM system | Segment 1 | The spinner is constructed from softer grades of aluminium to reduce the tendency to crack in service. |
Target language content in TM system [ja] | Segment 1 | スピナーは、使用中にクラックが発生しやすくなるのを抑えるため、より柔軟性の高いアルミニウムで構成されている。 |
Explanation | The full stop in the word “alumin.” is interpreted by the CAT tool as an end of sentence and therefore as a segment end. In the negative example, the sentence that should be translated and stored in the TM as a single sentence is split into two segments. In some languages such as Japanese, the sentence structure is different, so the split produces incorrect segment pairs that should not be stored in the TM for reuse. | |
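
The difference between established and ad hoc abbreviations can be illustrated with a small segmenter that keeps an exception list. This is a sketch under stated assumptions: the exception list contains only the three abbreviations named above, and tokens are split on whitespace. Actual tools ship longer, language-specific lists.

```python
# Illustrative exception list; real segmentation rules are more extensive.
KNOWN_ABBREVIATIONS = frozenset({"etc.", "e.g.", "i.e."})

def segment(text: str, exceptions=KNOWN_ABBREVIATIONS) -> list[str]:
    """End a segment at each token ending in a full stop, unless excepted."""
    segments, current = [], []
    for token in text.split():
        current.append(token)
        if token.endswith(".") and token not in exceptions:
            segments.append(" ".join(current))
            current = []
    if current:  # trailing text without a final full stop
        segments.append(" ".join(current))
    return segments

if __name__ == "__main__":
    print(segment("Use bolts, e.g. M8, to fix the plate."))
    print(segment("The spinner is constructed from softer grades of alumin. "
                  "to reduce the tendency to crack in service."))
```

Because “alumin.” is not on the list, the second sentence is split in two, while “e.g.” leaves the first sentence intact.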
For more information and other examples on abbreviations, see 4.3.3 and 4.4.3.1.
4.2.3 Formatting graphic layout of texts
General
Manual formatting for text layout purposes should be avoided because it often leads to unnecessary and expensive rework in the target text. As a rule, manual formatting must be undone and the text readjusted.
Symbols
Recommendation: Symbols should be inserted in line with the text. They should not hover over several spaces or tabs.
Symbols are often integrated into continuous text, e.g. to clarify instructions or explain buttons. Since source text and target text can vary in length, a symbol hovering over spaces or tabs can appear in a completely different position in the target text and fail to fulfil its intended purpose.
EXAMPLE
Source language content in text editor | (–) Click·on··<graphic> (+) Click·on· <graphic> |
|---|---|
Source language content in TM system | (–) Click·on ············ to·go to the·next·page. (+) Click·on ·<Symbol> to·go to the·next·page. |
Target language content in TM system [ja] | (–) ············をクリックすると次のページに進みます。 (+) <Symbol> をクリックすると次のページに進みます。 |
Target language content in text editor [ja] | (–) ············をクリッ<graphic> (+) <graphic> |
Explanation: In the negative example, the symbol is not formatted to be in line with the text. It is therefore not imported into the CAT tool as an element that can be placed properly in the target language content. If the symbol is hovering over spaces (·) or tabs (→), it is not imported and stays at the very same place in the file. It will therefore very likely end up hovering at an odd spot in the target language content due to different text length or sentence structure. | |
Indents
Recommendation: Text should not be indented using tabs or spaces. Instead, built-in text editor features (table tools, indents, column layout, right/left alignment, etc.) should be used.
In text formatting, either tabs or spaces are often used to indent text or to create tabular structures and columns. However, source text and target text can vary in length, so the tabs and spaces often fail to fulfil their intended purpose in the target text.
EXAMPLE
Source language content in text editor | (–) Measurements of the box: Length····3 000 cm Width·····200 cm Height····4 cm | |
|---|---|---|
Source language content in TM system | Segment 1 | Measurements of the box: |
Segment 2 | Length····3 000 cm | |
Segment 3 | Width·····200 cm | |
Segment 4 | Height····4 cm | |
Target language content in TM system [fr] | Segment 1 | Dimensions de la boîte: |
Segment 2 | Longueur····3 000 cm | |
Segment 3 | Largeur·····200 cm | |
Segment 4 | Hauteur····4 cm | |
Target language content in text editor [fr] | (–) Dimensions de la boîte: Longueur····3 000 cm Largeur·····200 cm Hauteur····4 cm | |
Explanation | In this example, instead of using column or table layout, the numbers and units are indented in order for the numbers to be vertically aligned with each other. However, when the same number of spaces is used in the translation, the layout will not work anymore in the target language content in the text editor due to different text length. Also, the matches would have a higher value for reuse if, for example, “Length” and “3 000 cm” were individual segments. | |
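
The misalignment in this example can be verified by computing the column at which each value starts. The sketch below takes the labels and the number of typed spaces from the example and reuses the same padding for the French labels. The function name is a hypothetical helper.

```python
# (label, number of typed spaces, value) as in the source language content
SOURCE_ROWS = [("Length", 4, "3 000 cm"), ("Width", 5, "200 cm"), ("Height", 4, "4 cm")]
TARGET_LABELS = ["Longueur", "Largeur", "Hauteur"]

def value_columns(rows):
    """Column at which each value starts when padding is typed as spaces."""
    return [len(label) + pad for label, pad, _ in rows]

# The translation keeps the typed spaces but the labels change in length.
TARGET_ROWS = [(label, pad, value)
               for label, (_, pad, value) in zip(TARGET_LABELS, SOURCE_ROWS)]

if __name__ == "__main__":
    print(value_columns(SOURCE_ROWS))  # identical columns: aligned
    print(value_columns(TARGET_ROWS))  # differing columns: misaligned
```

A table or column feature avoids the problem because the alignment is stored as structure, not as characters that travel with the text.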
Tables
Recommendation: Fixed row heights should be avoided.
Where tables are used, care should be taken not to set fixed row heights, because fixed row heights will cause sentences that run longer in the target text than in the source text to be hidden by the table frame.
EXAMPLE
Source language content | Target language content [de] |
(–) Artificial neural networks are a fundamental component of deep learning, mimicking the human brain to process complex data. | (–) Künstliche neuronale Netzwerke sind ein grundlegender Bestandteil des Deep Learning und ahmen das menschliche Gehirn nach, um komplexe Daten zu verarbeiten. |
Explanation: The row height of the table is fixed and cuts off the text in the target language content, which is longer than the source language content. In this example, it is visible to readers that the text is partially cut off and an editor should intervene. If more text were to be cut off below, important information would not be displayed. Since translation is usually performed using CAT tools, this layout issue will not be noticed unless the target language content is thoroughly checked in the text editor. | |
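
Because such layout problems are invisible in the CAT tool, a rough length comparison can help to decide which table cells or text boxes to inspect in the text editor. The following sketch compares character counts only. The threshold of 1.1 is an arbitrary assumption, and character count is only a proxy for rendered width.

```python
def expansion_ratio(source: str, target: str) -> float:
    """Length of the target text relative to the source text, in characters."""
    return len(target) / len(source)

def needs_layout_check(source: str, target: str, threshold: float = 1.1) -> bool:
    """Flag a segment pair whose target is noticeably longer than its source."""
    return expansion_ratio(source, target) > threshold

if __name__ == "__main__":
    en = ("Artificial neural networks are a fundamental component of deep "
          "learning, mimicking the human brain to process complex data.")
    de = ("Künstliche neuronale Netzwerke sind ein grundlegender Bestandteil "
          "des Deep Learning und ahmen das menschliche Gehirn nach, um "
          "komplexe Daten zu verarbeiten.")
    print(round(expansion_ratio(en, de), 2), needs_layout_check(en, de))
```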
Text boxes
Recommendation: Text boxes should be configured to resize automatically to match the length of the text.
Where text boxes are used (e.g. in graphics), care should be taken to ensure they are integrated into the graphics or text and configured in such a way that longer target text is not cut off at the edges of the text box.
EXAMPLE
Source language content | Target language content [fr] |
Explanation: The text box does not resize automatically and cuts off the text in the target language content, which is longer than the source language content. In this example, it is visible to readers that the text is partially cut off and an editor should intervene. If more text were to be cut off below, important information would not be displayed. Since translation is usually performed using CAT tools, this layout issue will not be noticed unless the target language content is thoroughly checked in the text editor. | |
Using international character sets
Recommendations:
— International character sets should be used that support all target languages.
— The operating system should be set to accommodate the appropriate font to display the character set.
Wherever possible, international character sets (e.g. Unicode, see ISO/IEC 10646) and encodings (e.g. UTF‑8) should be used in specialized content intended for translation. During the production stage of a document intended for translation, care should be taken to ensure that the character sets used support all the target languages planned, e.g. to avoid the use of different fonts for different target contents. Where target languages are not supported in the font used in the source content, diacritical characters (e.g. ń, ľ, ř, ǿ) or the entire writing system of the target language (e.g. Arabic, Chinese, Greek, Persian) either cannot be recognised or can fail to display correctly. For some target languages, it can be necessary to change the character set or encoding individually after translation (e.g. UTF-16 for big-endian Chinese).
EXAMPLE
Source language content | Target language content [sk] |
Diacritical marks and fonts play an important role in the rendering of texts. | (–) Diakritick☐ znamienka a typy p☐sma zohr☐vaj☐ pri zobrazovan☐ textov v☐znamn☐ ☐lohu. (+) Diakritické znamienka a typy písma zohrávajú pri zobrazovaní textov významnú úlohu. |
Explanation: The character set chosen in the negative example (possibly a specially designed corporate font) does not support special characters used in other languages. As a result, for some letters or all letters, only placeholders are displayed in the text editor. | |
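
Whether an encoding can represent the characters of a planned target language can be checked in advance. The sketch below tests the diacritical characters mentioned above against a legacy single-byte encoding and against UTF-8. It checks the encoding only. Font coverage, which is the cause of the placeholders in the example, has to be verified separately. The function name is a hypothetical helper.

```python
def unsupported_characters(text: str, encoding: str) -> list[str]:
    """Return the distinct characters that the encoding cannot represent."""
    missing = []
    for character in text:
        try:
            character.encode(encoding)
        except UnicodeEncodeError:
            if character not in missing:
                missing.append(character)
    return missing

if __name__ == "__main__":
    sample = "ń ľ ř ǿ"
    print(unsupported_characters(sample, "latin-1"))
    print(unsupported_characters(sample, "utf-8"))
```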
4.2.4 Formatting text content and references
Multilingual source texts
Recommendation: Multilingual source texts should be avoided.
Sometimes source texts contain text elements in different languages. Avoiding such multilingualism in source texts is highly advisable, because it interferes with the translation process in several ways:
a) Additional time and effort for quoting
All elements irrelevant to the translation should be hidden or deleted from the source file, in order to determine the actual amount of text intended for translation and costs. These measures result in additional preparatory work. If this preparatory work has to be performed in a file format from a software application to which the translation service provider has no access, the preparatory work is limited to the functions the TM systems offer after the file has been imported into the TM system. This leads to further additional time and effort post translation, e.g. reinserting into the target text any deleted text elements that were not represented in the main source language.
b) TM impurity
Segment pairs that contain content in languages that do not correspond to the TM language pair are automatically stored in the TM. Thus, multilingual source language content can give rise to impurity in the TM. Such TM content reduces the match value of segments in other source language content composed in only one language.
Some CAT tools have functionalities to deal with this issue, but the problem requires additional work prior to or during the translation process, and should thus be avoided.
c) Unsuitability for machine translation
Multilingual source texts are not suitable for machine translation, because machine translation systems may not recognise which source language is relevant to the translation, even where the language pair has been set correctly.
Multilingual source texts should be avoided, because the translation service provider will be required either to review each multilingual source text segment to establish which text fragment is relevant to the translation or to modify any text fragments erroneously translated by the MT system.
EXAMPLE
Source language content [de/en] | Target language content [fr/en] |
(–) Dokumenten-Nr./Document No.: Ersteller/Created by: Datum/Date: | (–) N° de document/Document No : Créateur/Créé par : Datum/Date : |
Explanation: In this example, the German parts of the bilingual German/English source text are to be replaced by French, but the MT system was not able to recognize correctly the content relevant for translation. The first line is correct, but in the second line everything was translated (instead of only the German part) and in the third line, the German part is untranslated. | |
Cross-references
Recommendation: Cross-references should not be inserted manually. Instead, the built-in text editor formatting features should be used.
Cross-references relate to other passages in a text, e.g. headings, footnotes, quotations, index entries and other field codes. A footnote or endnote can be inserted as a cross-reference, or cross-references can also be introduced by the words “see”, “compare” or the abbreviation “cf.”.
For cross-references in specialized content, the built-in text editor formatting features should be used so that the cross-references are automatically numbered in the target text and so that indices and lists are updated automatically rather than manually. Automatic numbering is especially relevant in translation, because the same content will expand or contract in different languages and because identical pagination (cross-references to page numbers) cannot be retained readily across languages.
If cross-references are made to headings or other parts of the text, it is critical that the target language content be consistent by using cross-references that are automatically updated.
EXAMPLE
Source language content in text editor | (–) <title_ref1> Installing the control cabinet</title_ref1> […] Follow the instructions in the Section "Installing the control cabinet" on p. 6. | |
|---|---|---|
Source language content in TM system | Segment 1 | Installing the control cabinet |
Segment 2 | Follow the instructions in the Section "Installing the control cabinet" on p. 6. | |
Target language content in TM system [de] | Segment 1 | Installation des Steuerschranks |
Segment 2 | Beachten Sie die Hinweise im Kapitel „Schaltschrank montieren“ auf S. 6. | |
Source language content in text editor | (+) <title_ref1> Installing the control cabinet</title_ref1> […] Follow the instructions in the Section "<ref [title_ref1]/>" on p. <ref [page_ref1]/>. | |
Source language content in TM system | Segment 1 | Installing the control cabinet |
Segment 2 | Follow the instructions in the Section "[placeable1]" on p. [placeable2]. | |
Target language content in TM system [de] | Segment 1 | Installation des Steuerschranks |
Segment 2 | Beachten Sie die Hinweise im Kapitel „[placeable1]“ auf S. [placeable2]. | |
Explanation | In the negative example, a sentence in a different part of the document refers to the section title “Installing the control cabinet” and its page number. The cross-references are, however, not made using the built-in text editor features and are therefore displayed as plain text in the TM system. As a result, in the target language content the section title in the cross-reference is translated differently from the section title itself. Furthermore, in the target text the section will likely no longer be on page 6, but on page 7 due to different text length. This makes it more difficult for readers to find the correct section. In the positive example, the section title and page number are cross-referenced using the built-in text editor features and displayed in the TM system as placeables. As a result, the section title and the page number will automatically be correct wherever they are cross-referenced in the target text. | |
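
The effect of placeables can be sketched as a lookup that is resolved only after translation. In the following example, the placeholder syntax `{title:…}` and `{page:…}`, the registry `HEADINGS` and the German page number 7 are assumptions made for illustration. A real text editor stores this information in its own field codes.

```python
import re

# Hypothetical registry: reference key -> (heading text, page) per language.
HEADINGS = {
    "en": {"title_ref1": ("Installing the control cabinet", 6)},
    "de": {"title_ref1": ("Installation des Steuerschranks", 7)},
}

def resolve(template: str, lang: str) -> str:
    """Fill title and page placeholders from the registry of one language."""
    def replace(match):
        kind, key = match.group(1), match.group(2)
        title, page = HEADINGS[lang][key]
        return title if kind == "title" else str(page)
    return re.sub(r"\{(title|page):(\w+)\}", replace, template)

if __name__ == "__main__":
    segment_de = ('Beachten Sie die Hinweise im Kapitel '
                  '„{title:title_ref1}“ auf S. {page:title_ref1}.')
    print(resolve(segment_de, "de"))
```

The translated segment never contains the heading text itself, so the reference cannot drift away from the translated heading.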
4.3 Terminology
4.3.1 General
Terminology is a critical component of specialized knowledge and therefore a critical element in specialized content, which serves to convey information and knowledge. Consequently, terminology should be consistent and unambiguous. Using synonyms causes confusion and queries; using homonyms or polysemes can cause misunderstandings. The terminology used in specialized content is also subject to further requirements, which are described in relevant standards such as ISO 704: terms should be linguistically correct, precise, transparent, appropriate, neutral and concise and should avoid excessive recourse to foreign language elements.
When specialized content is drafted, due consideration should be given to these terminological requirements so that the specialized content fulfils its communicative function and is comprehensible. Where texts are intended for translation into several languages, using correct and consistent terminology is of particular importance. Using terminology consistently when working with TM systems also ensures better fuzzy matches.
4.3.2 Synonymy
Recommendation: Synonyms should be avoided.
Using synonymous terms in the source text not only impairs comprehension of the source text content, which is required for any translation, but can also result in different (synonymous) terms in the target language, which can compromise terminological clarity.
When TM systems are used, synonymous terms often lead to a poorer hit rate in the (terminology) search in segments that have already been translated and entail, by extension, additional time and effort.
Machine translation can also produce poorer or even incorrect results if the MT system has not been trained for the same synonymous terms and the same frequencies.
Large translation projects often employ terminology extraction programs to identify new domain-specific terms and to determine their target-language equivalents. The frequent use of synonyms in these cases complicates the terminology management process.
EXAMPLE
Source language content | Target language content [it] |
(–) Section 2 shows you how to create and edit concept entries. A terminological entry consists of specific units of information, such as terms, definitions, and contexts. | (–) Il capitolo 2 mostra come creare e modificare le schede concettuali. Una scheda terminologica è costituita da unità specifiche di informazioni, come termini, definizioni e contesti. |
(+) Section 2 shows you how to create and edit terminological entries. A terminological entry consists of specific units of information, such as terms, definitions, and contexts. | (+) Il capitolo 2 mostra come creare e modificare le schede terminologiche. Una scheda terminologica è costituita da unità specifiche di informazioni, come termini, definizioni e contesti. |
Explanation: The terms “concept entry” and “terminological entry” are synonyms. However, both an MT engine and a human translator may not recognize this. Consequently, they might choose translations that best convey the individual meaning of each term into the target language, potentially selecting terms that are less common in the specific context. In this example, “scheda concettuale” is a precise translation of “concept entry” but is uncommon in the domain of terminology science. This inconsistency negatively affects the quality of the TM and of all future translations that use these terms. | |
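
Synonyms of this kind can be caught in the source text before translation if the termbase marks one term as preferred. The following sketch uses a two-entry mapping built from the example above. The mapping, the listing of plural forms as separate keys and the function name are assumptions of this illustration.

```python
# Hypothetical termbase extract: deprecated synonym -> preferred term.
PREFERRED = {
    "concept entry": "terminological entry",
    "concept entries": "terminological entries",
}

def find_deprecated_terms(text: str) -> list[tuple[str, str]]:
    """Return (deprecated term, preferred term) for each synonym found."""
    lowered = text.lower()
    return [(deprecated, preferred)
            for deprecated, preferred in PREFERRED.items()
            if deprecated in lowered]

if __name__ == "__main__":
    sentence = "Section 2 shows you how to create and edit concept entries."
    for deprecated, preferred in find_deprecated_terms(sentence):
        print(f'replace "{deprecated}" with "{preferred}"')
```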
4.3.3 Uncommon or unknown abbreviations
Recommendation: Uncommon or unknown abbreviations should be avoided.
Abbreviations should be used in texts only if they are commonly used or are familiar to the experts or the intended readership. Where abbreviations are common or familiar, using abbreviations is not inappropriate, because reading comprehension will not be impaired, and corresponding equivalents will be known in the target language.
Often, however, the long forms of words or phrases are spelled out only when they first occur in a text. They are then assigned new, ad hoc abbreviations, which are systematically used in the remainder of the text in place of the long form. If the entirety of a section of specialized content is to be translated into another language, using such abbreviations can be unproblematic; however, if only parts of the entire text, e.g. samples excerpted from a content management system, are exported for translation, the reference to the long form can inadvertently be omitted. This omitted reference can result in time-consuming research effort and in discrepancies between source and target language content or between the old parts of the target language content that were not exported and the newly translated parts.
Especially when MT systems are used, unknown or uncommon abbreviations are often mistranslated or not translated at all.
EXAMPLE
Source language content | Target language content [it] |
(–) The TT and ST can vary in length. | (–) La TT e la ST possono avere lunghezze diverse. |
(+) The target text and the source text can vary in length. | (+) Il testo di arrivo e il testo di partenza possono avere lunghezze diverse. |
Explanation: Depending on the availability of contextual information, the abbreviations “TT” and “ST” can be unknown or unclear. Consequently, they can be translated incorrectly, especially when MT systems are used. In this example, they stand for “target text” and “source text” and the correct Italian equivalents for “TT” and “ST” are “TA” and “TP”. Using the full forms as in the positive example avoids mistranslations. | |
For more information and other examples on abbreviations, see 4.2.2 (Full stops following abbreviations) and 4.4.3.1.
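
An excerpt exported for translation can be checked for abbreviations whose long form is missing. The sketch below assumes that abbreviations consist of two or more capital letters and that a definition has the form “long form (ABBR)”. Both conventions are assumptions of this example and will not fit every house style.

```python
import re

def undefined_abbreviations(text: str) -> list[str]:
    """Upper-case abbreviations that are never introduced as '... (ABBR)'."""
    candidates = re.findall(r"\b[A-Z]{2,}\b", text)
    defined = set(re.findall(r"\(([A-Z]{2,})\)", text))
    result = []
    for candidate in candidates:
        if candidate not in defined and candidate not in result:
            result.append(candidate)
    return result

if __name__ == "__main__":
    print(undefined_abbreviations("The TT and ST can vary in length."))
    print(undefined_abbreviations(
        "The target text (TT) and the source text (ST) can vary in length. "
        "The TT is often longer."))
```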
4.3.4 Orthographic variants
Recommendation: Orthographic variants should be avoided.
The use of orthographic variants, even if permitted by standard reference works, deprives specialized content of a uniform look and feel. Orthographic variants normally impair neither reading comprehension nor the translation of the text, but some variant spellings are actually words with different meanings. Orthographic variants include not only different spellings (e.g. toward and towards), but also hyphenation rules.
However, where computer-assisted methods are used, orthographic variants, like synonymous terms, will have a negative impact on the efficiency and reliability of these methods (see 4.3.2).
EXAMPLE
Source language content | Target language content [de] |
(1a) Any non-conformity found must be classified. (1b) Any nonconformity found must be classified. (2a) Any non-conformance found must be classified. (2b) Any nonconformance found must be classified. | (1a) Jede festgestellte Nichtkonformität muss klassifiziert werden. (1b) Jede festgestellte Nicht-Konformität muss klassifiziert werden. (2a) Jede festgestellte Nichtübereinstimmung muss klassifiziert werden. (2b) Jede festgestellte Nicht-Übereinstimmung muss klassifiziert werden. |
Explanation: The orthographic variants used within the source language content can lead to inconsistencies in the target language content. Some variations, especially with hyphenation, can cause term recognition errors in CAT tools and lead to terminological inconsistencies in target language content and with the termbase. | |
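
Hyphenation variants such as those in the example can be detected by comparing words after removing hyphens and case differences. This is a minimal sketch that covers only hyphenation. Other variant types (e.g. “toward” and “towards”) would need a variant list.

```python
import re
from collections import defaultdict

def find_variants(text: str) -> dict[str, set[str]]:
    """Group words that differ only in hyphenation or letter case."""
    groups = defaultdict(set)
    for word in re.findall(r"[A-Za-z]+(?:-[A-Za-z]+)*", text):
        key = word.lower().replace("-", "")
        groups[key].add(word.lower())
    return {key: forms for key, forms in groups.items() if len(forms) > 1}

if __name__ == "__main__":
    sample = ("Any non-conformity found must be classified. "
              "Each nonconformity is logged.")
    print(find_variants(sample))
```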
4.3.5 Ambiguities
Recommendations:
— Potential ambiguities caused by polysemes or homonyms should be clarified by indicating the relevant domain, describing the underlying concept or providing an appropriate comment.
— Termbases containing polysemes or homonyms should provide relevant contextual information to support translators in distinguishing which equivalent to use.
Ambiguous (homonymous or polysemous) terms exacerbate potential misunderstanding and can lead to severe misinterpretations in specialized content. The ability to understand and correctly translate the source text requires that these ambiguities be resolved. The same is true even if the same ambiguities exist in the target language.
EXAMPLE 1
Source language content | Target language content [de] |
mouse <biology> | Maus <biology> |
mouse <technology> | Maus <technology> |
Explanation: If there is no context information, the word “mouse” could be translated in different ways while in a specific translation project only one is correct. However, in German for example, in both contexts the target language content would be the same. | |
Context and subject-matter expertise resolve most ambiguities (in example 1: biology vs. technology), but individual terms often require translation outside of a specific context – e.g. materials lists or item master data lists. In such cases, ambiguities can cause substantial discrepancies of meaning in the target language, especially in cases involving CAT (see also 4.3.2).
EXAMPLE 2
Source language content | Target language content [de] |
(–) driver | (1) Treiber (2) Fahrerin/Fahrer (3) Driver |
(+) driver <information technology> | (+) Treiber <Informationstechnologie> |
(+) driver <transport> | (+) Fahrerin bzw. Fahrer <Verkehrswesen> |
(+) driver <sports> | (+) Driver <Sport> |
Explanation: In English, the term “driver” alone creates ambiguities: It is used to designate several concepts in different domains. In German, there are different terms used as equivalents that are not interchangeable. Both humans and computers need some context to decide which target language term is correct. The term “driver”, combined with information about the relevant domain, enables a solid decision on the right target language term. | |
Not only are there sometimes several possible variants for a certain part of speech; sometimes the part of speech itself is unclear, which opens up even more variants.
EXAMPLE 3
Source language content | Target language content [de] |
(–) right | (1) richtig (2) rechts (3) Recht |
(+) correct | (+) richtig |
(+) right-hand | (+) rechts |
(+) right | (+) Recht |
Explanation: If there is no contextual information, the word “right” can have different meanings (e.g. correct, right-hand, or moral or legal entitlement) and be an adjective, an adverb or a noun. Thus, the term can be translated in different ways depending on the context, but in any specific translation project, only one interpretation is correct. | |
In some cases, a language splits the conceptual reference associated with one term in another language into two concepts.
EXAMPLE 4
Source language content | Target language content [de] |
runway [at an airport] | (1) Startbahn (2) Landebahn |
Explanation: In English, for example, the area for the aircraft landing and take-off is called “runway”, whereas in German, a differentiation can be made. The runway can be called “Start- und Landebahn” (runway for take-off and landing) as well as “Startbahn” (runway for take-off) or “Landebahn” (runway for landing). The take-off process needs a longer runway, and in rare cases, such as the existence of an obstacle or special requirement for taxiing, some runways may require exclusive use for take-offs or landings. A translator in this case must know how to correctly translate “runway” and needs additional information. Since the customer is usually not aware of target language ambiguities, a query will likely be raised. The same goes, of course, for translations into English: In Vietnamese, “xanh” describes the color of the blue sky and of the green leaves of a tree. An additional description helps to specify the color, for example “xanh lá cây” for green (cây = plant) or “xanh da trời” for blue (trời = sky). | |
Hidden ambiguities can be identified and resolved only by consulting with the (in-house or external) translation service provider or in-country reviewers. In enterprise management environments, style guides and termbases can provide guidance on frequent target-language ambiguities involving polysemes and homonyms in frequently used language pairs.
4.3.6 Compound terms and constructions
Recommendation: The relationships in compound terms and constructions should be transparent and unambiguous.
Compound terms and constructions are often used to denote complex concepts clearly in some languages, whereas some other languages use different syntactic structures to express relations between semantic components. However, particularly in the case of (excessively long) compounds, the relationship between the individual components is frequently unclear. It is necessary to understand the relations between the components to find the correct equivalent, especially in languages where these relations have to be made explicit.
The headword should be identified and placed at the beginning of the noun-string. Verbal nouns should be replaced by infinitive or participial constructions together with the appropriate prepositions.
EXAMPLE 1
Source language content | Target language content [de] |
(–) real-time data transmission process | (1) Prozess Echtzeit-Datenübertragung (2) Prozess Übertragung Echtzeitdaten |
(+) process for real-time transmission of data | (+) Prozess zur Echtzeit-Übertragung von Daten |
Explanation: In the negative example, it is unclear whether “real-time” refers to “data” or “transmission”. In the positive example, the genitive relations are unambiguous because of the function words and because the multi-word compound was converted into a multi-word term. Care should be taken not to revise established compound terms and constructions. | |
EXAMPLE 2
Source language content | Target language content [es] |
(–) Emerald Ash Borer elimination trunk injection method | (1) método de eliminar la inyección del barrenador esmeralda del fresno en el tronco (2) método de inyección en el tronco para la eliminación del barrenador esmeralda del fresno |
(+) method for injecting ash trunks in order to eliminate the Emerald Ash Borer | (+) método de inyección en el tronco para la eliminación del barrenador esmeralda del fresno |
Explanation: In the negative example, it is unclear whether “elimination” refers to “injection” (variant 1) or to “Emerald Ash Borer” (variant 2). This can lead to translation errors, especially with machine translation systems. In the positive example, the ambiguity is avoided, because all relations between the elements are made explicit. | |
4.3.7 Assuring correct use of terminology
Terminology management systems support term choice, indicate preferred terms, can indicate which abbreviations should be replaced by long forms, and show which orthographic variants are the required ones. Such termbases should be available to anyone involved or interested in text production. Components integrated into the editorial tools should provide terminology support during the writing process and should verify the terminology as part of the quality assurance process.
A carefully maintained termbase should also help to identify and to avoid ambiguities because ideally it will be possible to identify if there are multiple concept-oriented entries that list the same term in that termbase.
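As a rough illustration of this check, the following Python sketch reports every term that is listed in more than one concept-oriented entry. The concept identifiers, the sample entries and the function name `find_ambiguous_terms` are hypothetical; real termbases would typically be queried through their own export or interface.

```python
from collections import defaultdict

# Hypothetical concept-oriented entries: concept ID -> terms for that concept.
ENTRIES = {
    "C001": ["driver", "device driver"],
    "C002": ["driver"],
    "C003": ["runway"],
}


def find_ambiguous_terms(entries: dict[str, list[str]]) -> dict[str, list[str]]:
    """Map each term that occurs in more than one entry to those entries."""
    concepts_by_term = defaultdict(list)
    for concept_id, terms in entries.items():
        # A set avoids counting a term twice within the same entry.
        for term in {t.lower() for t in terms}:
            concepts_by_term[term].append(concept_id)
    return {t: ids for t, ids in concepts_by_term.items() if len(ids) > 1}


print(find_ambiguous_terms(ENTRIES))  # {'driver': ['C001', 'C002']}
```

Terms reported by such a check are candidates for additional contextual information, such as a domain label or a definition.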
Because most of the terms in these termbases should be documented using additional data categories such as definitions, context and multimedia elements (e.g. figures/images or videos), the stakeholders involved in the translation process will also benefit from having access to these termbases.
Ideally, the external or in-house translator or translation service provider should be integrated into the enterprise terminology workflow so that they will
a) have access to terminological data;
b) use these data for the translation;
c) where appropriate, provide feedback regarding possible improvements or updates;
d) enter into the termbase the results of their target language terminology work required by the translation.
These efforts will contribute to the creation of a multilingual terminology resource which not only can be used for other translation projects, with other translators, but will also ensure that clear and correct enterprise terminology will be used across all documents in all languages. Furthermore, the terminology resource will also aid the person who produces the source language text in his or her translation-oriented writing effort.
4.4 Grammar, syntax and style
4.4.1 General
At the linguistic level, guidelines can facilitate and simplify the translation process, make it more cost-efficient and prevent translation errors.
4.4.2 Sentence structure
Translation technology is best supported by uniform and clear, complete sentence structures.
If the same statement or topic recurs in a given text genre (machine instructions, package inserts, business contracts), the same text chunk should be identified and stored as an information unit so that it can be reused where appropriate. This practice increases efficiency because the unit in question will come up in the translation memory as needed and not require retranslation.
Translators using translation memory suggestions will find it easier to interpret clear, concise suggestion segments if wording and terminology are consistent. Both human and machine translation produce fewer translation errors if sentences are concise, and text content is coherent and cohesive.
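The effect of consistent wording on reuse can be sketched with a simplified exact-match lookup in Python. This is an illustration under stated assumptions, not a description of how any translation memory system works: the dictionary `translation_memory` and the functions `normalize` and `exact_match` are hypothetical, and real systems also offer fuzzy matching.

```python
from typing import Optional

# Hypothetical translation memory: source segment -> stored translation.
translation_memory = {
    "Check whether the heat exchanger is leaking.":
        "Prüfen Sie, ob der Wärmetauscher undicht ist.",
}


def normalize(segment: str) -> str:
    """Collapse whitespace so that layout differences do not block a match."""
    return " ".join(segment.split())


def exact_match(segment: str, memory: dict[str, str]) -> Optional[str]:
    """Return the stored translation for an identical segment, else None."""
    normalized_memory = {normalize(src): tgt for src, tgt in memory.items()}
    return normalized_memory.get(normalize(segment))


# Identical wording is reused; a reworded sentence has to be retranslated.
print(exact_match("Check whether the heat exchanger is leaking.", translation_memory))
print(exact_match("Check if the heat exchanger leaks.", translation_memory))
```

The second call returns `None` although both sentences carry the same meaning, which is why recurring statements should be stored and reused with identical wording.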
Sentence length
Recommendation: Sentences should be as complete as necessary and as concise as possible.
Short sentences enhance clarity and facilitate easier comprehension for readers and translators, ensuring that complex information is communicated effectively.
EXAMPLE
Source language content | Target language content [de] |
(–) Our products are of high quality and useful for various application scenarios, all of which can be addressed with our products, such as home use, industrial use or use in educational contexts. | (–) Unsere Produkte sind von hoher Qualität und eignen sich für verschiedene Anwendungsgebiete, von denen alle mit unseren Produkten bearbeitet werden können, wie etwa den Gebrauch im Haushalt, die industrielle Nutzung oder die Verwendung im Zusammenhang mit Bildung. |
(+) Our products are of high quality. They are useful for various application scenarios such as home use, industrial use or use in educational contexts. Our products address all of these application scenarios. | (+) Unsere Produkte sind von hoher Qualität. Sie eignen sich für verschiedene Anwendungsgebiete wie etwa den Gebrauch im Haushalt, die industrielle Nutzung oder die Verwendung im Zusammenhang mit Bildung. Mit unseren Produkten können Sie all diese Anwendungsgebiete bearbeiten. |
Explanation: The negative example in the source language and in the target language consists of 32 words and 37 words, respectively. These long sentences can be difficult to understand because they contain multiple clauses, complex structures, and convoluted syntax. They make it hard for readers to process and retain information. In the positive example, all sentences are considerably shorter and thus easier to understand and to translate. | |
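A simple sentence-length check of the kind often built into authoring tools can be sketched as follows. The threshold `MAX_WORDS` is a hypothetical value chosen for illustration, since this document does not prescribe a number, and the sentence splitter is deliberately naive (it would, for example, split after abbreviations such as “e.g.”).

```python
import re

MAX_WORDS = 25  # hypothetical limit; this document does not prescribe a number


def split_sentences(text: str) -> list[str]:
    """Naively split after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def word_count(sentence: str) -> int:
    """Count whitespace-separated tokens."""
    return len(sentence.split())


def long_sentences(text: str, max_words: int = MAX_WORDS) -> list[tuple[int, str]]:
    """Return (word count, sentence) for each sentence above the limit."""
    return [
        (word_count(s), s)
        for s in split_sentences(text)
        if word_count(s) > max_words
    ]


NEGATIVE = (
    "Our products are of high quality and useful for various application "
    "scenarios, all of which can be addressed with our products, such as "
    "home use, industrial use or use in educational contexts."
)
POSITIVE = (
    "Our products are of high quality. They are useful for various "
    "application scenarios such as home use, industrial use or use in "
    "educational contexts. Our products address all of these application "
    "scenarios."
)

print(word_count(NEGATIVE))      # 32, as stated in the explanation above
print(long_sentences(POSITIVE))  # [] because every sentence is within the limit
```

A word limit is only a proxy for complexity; a flagged sentence still needs to be reviewed by the author.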
Subordinate clauses
Recommendation: Main clauses should not be interrupted by subordinate clauses.
Complex sentences often contain nested subclauses, making it challenging for readers to parse and interpret the syntactic structure. In intricate sentences, the relationships between clauses can become ambiguous, leading to confusion about the intended meaning. Ambiguity can arise from unclear references, ambiguous modifiers, or convoluted sentence structures. Furthermore, structures that can make sense in the source language can in some cases be impossible to maintain in the target language.
EXAMPLE
Source language content | Target language content [de] |
(–) Upon receiving user input, the application will initiate a series of validation checks, which assess the data integrity and compliance with predefined standards, before proceeding to execute the specified task. | (–) Nach Eingang der Benutzereingaben leitet die Anwendung eine Reihe von Validierungsprüfungen ein, die die Integrität der Daten und die Übereinstimmung mit den vordefinierten Standards bewerten, bevor sie mit der Ausführung der angegebenen Aufgabe fortfährt. |
(+) Upon receiving user input, the application will initiate a series of validation checks. These checks assess the data integrity and compliance with predefined standards. Only after these checks have been successfully completed will the application proceed to execute the specified task. | (+) Nach Eingang der Benutzereingaben leitet die Anwendung eine Reihe von Validierungsprüfungen ein. Bei diesen Prüfungen werden die Integrität der Daten und die Übereinstimmung mit den vordefinierten Standards bewertet. Erst wenn diese Prüfungen erfolgreich abgeschlossen sind, fährt die Anwendung mit der Ausführung der angegebenen Aufgabe fort. |
Explanation: Lengthy sentences can detract from the main point or message of the sentence if they are interrupted by one or more subordinate clauses. Readers can lose track of the primary idea amidst the complexity, resulting in reduced comprehension and retention of information. If the sentence is split into several sentences without subordinate clauses or if the main sentence does not continue after a subordinate clause, the information is easier to understand. | |
Verb location
Recommendation: The verb should be as close to the subject as possible.
The verb should not be too far away from the subject in a sentence. Keeping the two close together helps maintain clarity and coherence, enabling readers and translators to quickly grasp the relationship between the subject and the action or state described by the verb.
EXAMPLE
Source language content | Target language content [it] |
(–) A summary graph illustrating overall system health and status within the diagnostic report generated by the system's monitoring tool is presented at the end. | (–) Un grafico riassuntivo, che illustra la salute e lo stato generale del sistema all’interno del rapporto diagnostico generato dallo strumento di monitoraggio del sistema, viene presentato alla fine. |
(+) The system’s monitoring tool generates a diagnostic report, at the end of which you can find a summary graph illustrating overall system health and status. | (+) Alla fine del rapporto diagnostico, generato dallo strumento di monitoraggio del sistema, è possibile trovare un grafico riassuntivo che illustra lo stato di salute e lo stato generale del sistema. |
Explanation: In the negative example, the verb “is presented” is too far away from the subject “a summary graph”. This makes the sentence more complex, and it can be harder to determine which noun the verb refers to. | |
Sentence fragments without verbs
Recommendation: Sentence fragments without verbs should be avoided.
Sentence fragments without verbs are grammatically incomplete. Verbs are essential for expressing actions or states of being; without a verb, a sentence can fail to convey a clear message.
EXAMPLE
Source language content | Target language content [de] |
(–) Heat exchanger leaky? | (–) Wärmetauscher undicht? |
(+) Check whether the heat exchanger is leaking. | (+) Prüfen Sie, ob der Wärmetauscher undicht ist. |
Explanation: In the negative example, it is not clear what readers are supposed to do exactly or if readers are at all responsible for any action. The positive example provides unambiguous information to readers and translators. | |
Lists inside sentences
Recommendation: Lists should always be introduced with a clear introductory sentence. They should never be positioned in the middle of a sentence.
Where a sentence is interrupted by too many items within a list, content comprehension will be impaired, because relationships will be unclear.
EXAMPLE
Source language content | Target language content [de] |
(–) Check — exhaust pipe, — metal hose, — clamps and — suspension components for condition and correct installation. | (–) Prüfen Sie — Abgasrohr, — Metallschlauch, — Schellen und — Aufhängungen auf Zustand und vorschriftsmäßige Montage. |
(+) Check the following components for condition and correct installation: — exhaust pipe — metal hose — clamps — suspension components | (+) Prüfen Sie folgende Bauteile auf Zustand und vorschriftsmäßige Montage: — Abgasrohr — Metallschlauch — Schellen — Aufhängungen |
Explanation: In addition to the segmentation issues explained in 4.2.2.4, sentences interrupted by lists can be difficult to understand because readers lose focus on the main point of the sentence as their attention is diverted to the listed items. Readers as well as translators have to navigate back and forth between the main sentence and the listed items. In this case, the verb is also too far away from the subject (see 4.4.2.5). | |
For more information and another example on lists inside sentences, see 4.2.2.4.
4.4.3 Word choice and word formation
Word choice and word formation will affect how the text is received. Incorrect word formation will impede, even prevent, communication and correct translation.
Ambiguous abbreviations
Recommendation: Abbreviations that are ambiguous or that have not been nationally or internationally standardized should be avoided.
Abbreviations are used to make text more compact, or to enhance linguistic efficiency in specialized content. Where abbreviations are unavoidable, a list of abbreviations shall be provided to readers and translators as part of the translation project, especially for organization-specific and subject-matter-specific abbreviations.
EXAMPLE
Source language content | Target language content [it] |
(–) PM excels at managing complex projects and building meaningful relationships. | (–) Il/la PM eccelle nella gestione di progetti complessi e nella costruzione di relazioni significative. |
(+) The Peer Mentor excels at managing complex projects and building meaningful relationships. | (+) Il/la mentore alla pari (peer mentor) eccelle nella gestione di progetti complessi e nella costruzione di relazioni significative. |
Explanation: In English, “PM” is an abbreviation for “Prime Minister”, “Project Manager” and “Private Message”. This abbreviation is widely used and can also be found in dictionaries. However, in the corporate context, for example, “PM” is used for both project managers and peer mentors. This can lead to ambiguity. The Italian translation of “PM” is also “PM”, which stands for “Pubblico Ministero”, “Project Manager”, “Posta militare”, “Polizia Militare” and “Pontefice Massimo”, but not (yet) for “mentore alla pari”: it would therefore be better to use the full forms to avoid mistranslations. | |
For more information and other examples on abbreviations, see 0 and 4.3.3.
For more information on how to assure correct use of terminology (including abbreviations), see 4.3.7.
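A check against the project's list of abbreviations can be approximated with a short Python sketch. The list `APPROVED`, the pattern and the function name `unlisted_abbreviations` are assumptions made for illustration; the pattern only recognizes sequences of two or more capital letters and will also flag words written in capitals for emphasis.

```python
import re

# Hypothetical project abbreviation list: abbreviation -> long form.
APPROVED = {"CAT": "computer-aided translation"}

# Assumption: an abbreviation is a run of two or more capital letters.
ABBREVIATION = re.compile(r"\b[A-Z]{2,}\b")


def unlisted_abbreviations(text: str, approved: dict[str, str]) -> list[str]:
    """Return abbreviations found in the text that the list does not cover."""
    return sorted({m for m in ABBREVIATION.findall(text) if m not in approved})


print(unlisted_abbreviations("PM excels at managing complex projects.", APPROVED))
print(unlisted_abbreviations("The CAT tool is ready.", APPROVED))
```

The first call reports `PM`, which should then either be replaced by its long form or added to the list with an unambiguous definition.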
Non-specific terms
Recommendation: Specific terms should be used.
Even in specialized content, it is quite common to switch from a full form to a shorter form of the term (example 1) or to switch from a specific to a more general term (example 2) over the course of the text. However, since it is often the case that only parts of specialized content are translated, specific terms should be preferred and used consistently throughout the text.
EXAMPLE 1
Source language content | Target language content [it] |
(–) A laser printer uses a laser because lasers are able to form highly focused, precise, and intense beams of light. The stream of data held in the printer's memory rapidly turns the laser on and off as it sweeps. | (–) Una stampante laser utilizza un raggio laser in quanto i laser sono in grado di generare fasci di luce altamente focalizzati, precisi e intensi. Il flusso di dati contenuti nella memoria della stampante accende e spegne rapidamente il laser mentre si muove. |
(+) A laser printer uses a laser because lasers are able to form highly focused, precise, and intense beams of light. The stream of data held in the laser printer's memory rapidly turns the laser on and off as it sweeps. | (+) Una stampante laser utilizza un raggio laser in quanto i laser sono in grado di generare fasci di luce altamente focalizzati, precisi e intensi. Il flusso di dati contenuti nella memoria della stampante laser accende e spegne rapidamente il laser mentre si muove. |
Explanation: The term “laser printer” is shortened to “printer” over the course of the text. Even if the reader of the full text knows which type of printer is meant and that the shortened form refers to the laser printer, this relation is lost if only the second sentence has to be translated. “Printer” designates the superordinate concept of “laser printer” and is therefore a different concept with a different equivalent in the target language. Therefore, the full form of the term should be used consistently to avoid ambiguity and confusion. | |
EXAMPLE 2
Source language content | Target language content [it] |
(–) To turn off the tablet, press and hold the button on the left side for 5 seconds. If the mobile device is turned off and supplied with external power, it will turn on automatically. | (–) Per spegnere il tablet, tenere premuto il pulsante sul lato sinistro per 5 secondi. Se il dispositivo mobile è spento e viene alimentato con corrente esterna, si accende automaticamente. |
(+) To turn off the tablet, press and hold the button on the left side for 5 seconds. If the tablet is turned off and supplied with external power, it will turn on automatically. | (+) Per spegnere il tablet, tenere premuto il pulsante sul lato sinistro per 5 secondi. Se il tablet è spento e viene alimentato con corrente esterna, si accende automaticamente. |
Explanation: “Mobile device” and “dispositivo mobile” are generic terms for all types of handheld computers that are portable and can access networks without a wired connection, such as tablets, smartphones, smartwatches, or e-readers. These terms can be used when the text refers collectively to all types of tablets, smartphones, and other similar mobile devices. If a text is about a specific class of mobile devices, the specific term should be used consistently to avoid ambiguity and confusion. | |
Regionalisms
Recommendation: Regionalisms should only be used when essential for conveying the intended message, and translation project specifications should outline how these references should be handled in the target language content.
In specialized content, regionalisms will often lead to confusion or misunderstanding.
EXAMPLE
Source language content | Target language content [fa] |
(–) The cafeteria sells sandwiches and coke. (–) The cafeteria sells sandwiches and tonic. | (–)کافه ساندویچ و کوکاکولا میفروشد. (–)کافه ساندویچ و آبتونیک میفروشد. |
(+) The cafeteria sells sandwiches and soft drinks. | (+)کافه ساندویچ و نوشیدنی میفروشد. |
Explanation: The negative examples reflect regional usage: in Georgia, “coke” can denote any carbonated beverage, and in South Boston, “tonic” has traditionally been used. Outside these areas, such beverages are referred to as “soda”, “pop”, or even “soda-pop”. “Soft drink” is the common, non-regional superordinate term. Regionalisms make translation more difficult than necessary and can even lead to misunderstandings or suggest region-specific features. Recognizing which terms are regionalisms can be difficult, especially for external individuals. | |
Plural endings or variants in parentheses
Recommendation: Plural endings or variants should not be placed in parentheses.
Occasionally, plural endings or variants are placed in parentheses in specialized content to cover both singular and plural forms. Frequently, these endings and variants cannot be reproduced in foreign languages and pose a problem for automatic language processing.
EXAMPLE
Source language content | Target language content [it] |
(–) Remove the lid(s). | (–) Togliere il coperchio [i coperchi]. |
(+) Remove the lid. | (+) Togliere il coperchio. |
(+) Remove the lids. | (+) Togliere i coperchi. |
Explanation: The plural ending in parentheses cannot be reproduced in Italian. In addition, the translation can be ambiguous, if, for example, there are several lids. It can be unclear to users whether only one or more lids have to be removed, if the source and target language content allows more than one interpretation. | |
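Parenthesized plural endings are easy to detect automatically before handover. The following Python sketch shows one possible check; the pattern covers only a few English endings and, like the function name `find_parenthesized_endings`, is an assumption made for illustration.

```python
import re

# A word immediately followed by a short parenthesized ending, e.g. "lid(s)".
# Assumption: only the English endings "s", "es" and "ies" are covered.
PAREN_ENDING = re.compile(r"\b\w+\((?:s|es|ies)\)")


def find_parenthesized_endings(text: str) -> list[str]:
    """Return every word that carries a parenthesized plural ending."""
    return PAREN_ENDING.findall(text)


print(find_parenthesized_endings("Remove the lid(s)."))  # ['lid(s)']
print(find_parenthesized_endings("Remove the lids."))    # []
```

Each hit should be resolved by the author into an explicit singular or plural form, as in the positive examples above.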
Non-specific verbs
Recommendation: Specific verbs should be used.
Using non-specific verbs (e.g. to provide, to implement, to put) in specialized content can cause inaccuracies and often lead to misinterpretation and confusion affecting comprehension, the translation process, and, consequently, the quality of the translation itself.
EXAMPLE
Source language content | Target language content [it] |
(–) If a statement doesn’t end with a semicolon, a compiler error happens. | (–) Se una riga non termina con un punto e virgola, si verifica un errore del compilatore. |
(+) If a statement doesn’t end with a semicolon, the compiler generates an error message. | (+) Se una riga non termina con un punto e virgola, il compilatore genera un messaggio di errore. |
Explanation: In a specialized context, it is important to be clear and precise. The verb “generate” conveys more information than “happen” because it signals the actor of the action, i.e. the “compiler”. Specific verbs have a precise meaning and leave no room for misinterpretation. In the end, the translation will also benefit from the choice of the right verb. | |
Light verb constructions
Recommendation: Light verb constructions should be avoided.
Light verb constructions, in which the meaning is primarily transferred from the verb to a noun, make statements unnecessarily complicated and hinder text comprehension. Light verb constructions are likely to be translated literally, which can result in translation errors. Instead, a single, strong verb which carries the meaning should be used, unless there is a rhetorical reason (e.g. theme-rheme-structure).
EXAMPLE
Source language content | Target language content [de] |
(–) At the conference, the president is giving a talk about new technologies. | (–) Bei der Konferenz gibt der Präsident einen Vortrag über die neuen Technologien. |
(+) At the conference, the president talks about new technologies. | (+) Bei der Konferenz spricht der Präsident über die neuen Technologien. |
(–) The customer makes a claim for compensation. | (–) Der Kunde macht einen Anspruch auf Schadensersatz. |
(+) The customer is filing a claim for compensation. | (+) Der Kunde beansprucht Schadensersatz. |
Explanation: In the negative examples, the light verb constructions were translated literally, which is not correct in the German target examples. | |
Modal verbs
Recommendation: Modal verbs should only be used when necessary or prescribed by style guides.
In using modal verbs, due consideration should be given to the fact that not all languages have the same inventory of modal verbs. For example, not all nine modal verbs in English (can, could, may, might, shall, should, will, would, and must) have clear equivalents in other languages. The modal verb “should” should be avoided in specialized content, especially for instructions to act, and “can” should be avoided, especially for permissions, so that the statement is unambiguous.
EXAMPLE
Source language content | Target language content [fr] |
(–) You cannot use this saw to cut synthetic materials. | (1) Vous ne devez pas utiliser cette scie pour couper des matériaux synthétiques. (2) Vous ne pouvez pas utiliser cette scie pour couper des matériaux synthétiques. |
(+) You shall not use this saw to cut synthetic materials. | (+) Vous ne devez pas utiliser cette scie pour couper des matériaux synthétiques. |
Explanation: In the above example, the expression “you cannot” is often used when in fact “you shall not” is meant. “Can/cannot” express a capability, so “you cannot” actually means “you are not capable of” or “it is not possible for you to”. In some cases, it may not be clear which of the two options is meant if the modal verb “can” is used incorrectly. | |
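To help authors review their use of modal verbs, the nine English modal verbs listed above can be flagged automatically. The following Python sketch is one possible approach; the function name `find_modals` is hypothetical, and the check is purely lexical, so it will also flag homographs such as the month “May”.

```python
import re

# The nine English modal verbs named in this subclause.
MODALS = ("can", "could", "may", "might", "shall", "should", "will", "would", "must")

# The optional "not" covers the fused negative form "cannot".
MODAL_PATTERN = re.compile(
    r"\b(" + "|".join(MODALS) + r")(?:not)?\b", re.IGNORECASE
)


def find_modals(text: str) -> list[str]:
    """Return the modal verbs found in the text, in order of appearance."""
    return [m.group(1).lower() for m in MODAL_PATTERN.finditer(text)]


print(find_modals("You cannot use this saw to cut synthetic materials."))  # ['can']
print(find_modals("You shall not use this saw to cut synthetic materials."))  # ['shall']
```

The check only locates modal verbs; deciding whether a capability, a permission or a prohibition is meant remains the task of the author.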
4.4.4 Unambiguous references
Omitting pronouns with uncertain reference
Recommendation: Pronouns with uncertain reference should be avoided, especially across sentence boundaries.
In order to shorten texts and to avoid repetition, pronouns (e.g. “these”, “they”, “it”) are often used in specialized content as a substitute for their antecedents. Pronouns with unclear anaphoric references lead to misinterpretations (false antecedents) and cause difficulties in comprehension during the translation process, especially for machine translation.
EXAMPLE
Source language content | Target language content [de] |
(–) The rim is centered on the center hole, which must fit exactly on the hub of the vehicle. It must be cleaned and greased. | (1) Die Felge wird auf dem Mittelloch zentriert, das genau auf die Nabe des Fahrzeugs passen muss. Sie muss gereinigt und gefettet werden. (2) Die Felge wird auf dem Mittelloch zentriert, das genau auf die Nabe des Fahrzeugs passen muss. Es muss gereinigt und gefettet werden. |
(+) The rim is centered on the center hole, which must fit exactly on the hub of the vehicle. The center hole must be cleaned and greased. | (+) Die Felge wird auf dem Mittelloch zentriert, das genau auf die Nabe des Fahrzeugs passen muss. Das Mittelloch muss gereinigt und gefettet werden. |
Explanation: In the negative example, it is unclear whether the pronoun “it” at the beginning of the second sentence refers to the center hole or the hub. This can lead to translation errors. | |
Omitting function words
Recommendation: Wordings should be grammatically complete. Function words should not be omitted.
Function words, such as prepositions, conjunctions and articles, are an integral part of grammatical structures. They specify and create relations between other grammatical elements such as nouns used as subject or object. Omitting function words reduces clarity.
EXAMPLE 1
Source language content | Target language content [de] |
(–) Update hardware device driver. | (1) Aktualisieren Sie den Treiber und das Hardwaregerät. (2) Aktualisieren Sie den Treiber des Hardwaregeräts. |
(+) Update the driver of the hardware device. | (+) Aktualisieren Sie den Treiber des Hardwaregeräts. |
Explanation: The grammatically incomplete sentence creates two problems. First, it fails to state clearly the relationship between the two objects mentioned, i.e. whether two separate objects are referred to (“driver” and “hardware device”) or only one object (“hardware device driver”). Second, it lacks appropriate articles and the preposition “of”. The grammatically complete sentence clarifies the objects and contains two definite articles. | |
EXAMPLE 2
Source language content | Target language content [es] |
(–) Quality control new products assembly line | (1) Control de calidad de la cadena de montaje de productos nuevos (2) Control de calidad de productos nuevos en la cadena de montaje |
(+) Quality control of new products on the assembly line | (+) Control de calidad de productos nuevos en la cadena de montaje |
Explanation: The wording in the negative example is ambiguous due to the lack of clear indication as to whether “quality control” is referring to “new products assembly line” as a whole or just to “new products”. The wording “quality control” could either apply to the entire line dedicated to product assembly, or it could imply that the new products themselves (in the assembly line) are undergoing quality control. This ambiguity arises from the absence of prepositions or other contextual information that would distinguish between these interpretations. | |
Ambiguous genitive constructions
Recommendation: Ambiguous genitive constructions should be avoided.
EXAMPLE
Source language content | Target language content [it] |
(–) The acquisition of the company surprised the market. | (–) L’acquisizione dell’azienda ha sorpreso il mercato. |
(+) The acquisition made by the company surprised the market. | (+) L’acquisizione effettuata dall’azienda ha sorpreso il mercato. |
(+) The fact that the company was acquired surprised the market. | (+) Il fatto che l’azienda sia stata acquisita ha sorpreso il mercato. |
Explanation: In the negative example, it is unclear whether “acquisition of the company” means that the company acquired something (variant 1) or was acquired (variant 2). This can lead to translation errors. | |
Ambiguous syntax
Recommendation: Ambiguous syntactic structures should be avoided.
In some languages, syntactic structures can be ambiguous and thus be prone to misunderstanding.
EXAMPLE
Source language content | Target language content [de] |
(–) To interrupt the circuit, cut the hose at the top. | (1) Um den Kreislauf zu unterbrechen, den Schlauch am oberen Ende abschneiden. (2) Um den Kreislauf zu unterbrechen, den oberen Schlauch abschneiden. |
(+) To interrupt the circuit, cut the top of the hose. | (+) Um den Kreislauf zu unterbrechen, den Schlauch am oberen Ende abschneiden. |
(+) To interrupt the circuit, cut the top hose. | (+) Um den Kreislauf zu unterbrechen, den oberen Schlauch abschneiden. |
Explanation: The prepositional phrase indicating the location can be read as relating to the same hose (upper part) or another hose (top hose). Thus, the sentence is ambiguous and can be understood in two ways. In order to avoid a translation with the wrong meaning, which – in the worst case – can lead to incorrect operations resulting in injuries, the location should be expressed in an unambiguous way, making clear which part of the sentence the location information refers to. | |
4.4.5 Style and reader engagement
Requests for action
Recommendation: Any single sentence should contain only one request for action.
Where several requests for action are not closely connected to each other, they should be expressed in single sentences.
EXAMPLE
Source language content | Target language content [it] |
(–) Always wear appropriate personal protective equipment, ensuring that all electrical equipment is switched off when not in use, and reporting any potential hazards immediately. | (–) Indossare sempre un dispositivo di protezione individuale adeguato, assicurandosi che tutte le apparecchiature elettriche siano spente quando non sono in uso e segnalando immediatamente qualsiasi eventuale condizione di pericolo. |
(+) Always wear appropriate personal protective equipment. Ensure that all electrical equipment is switched off when not in use. Report any potential hazards immediately. | (+) Indossare sempre un dispositivo di protezione individuale adeguato. Assicurarsi che tutte le apparecchiature elettriche siano spente quando non sono in uso. Segnalare immediatamente qualsiasi eventuale condizione di pericolo. |
Explanation: The negative example shows one sentence that contains semantically independent sequences, i.e. they develop a complete discourse in themselves. Each sequence introduces one request for action. It would therefore be appropriate to split this one sentence into three independent and short sentences containing one request for action, as shown in the positive example. Assigning one request for action to each sentence facilitates understanding and translation. | |
Action steps
Recommendation: Individual action steps should be enumerated.
Where the sequence of the action steps is critical to the outcome of an action, individual action steps should be enumerated.
EXAMPLE
Source language content | Target language content [de] |
(–) To install the wooden plate on the metallic plate, first open the cover. Then place the wooden plate on the metallic plate. Fasten the wooden plate with the four screws and close the cover. | (–) Um die Holzplatte auf der Metallplatte zu befestigen, den Deckel öffnen. Dann die Holzplatte auf der Metallplatte platzieren. Die Holzplatte mit den vier Schrauben befestigen und den Deckel schließen. |
(+) To install the wooden plate on the metallic plate, proceed as follows: 1. Open the cover. 2. Place the wooden plate on the metallic plate. 3. Fasten the wooden plate with the four screws. 4. Close the cover. | (+) Um die Holzplatte auf der Metallplatte zu befestigen, wie folgt vorgehen: 1. Den Deckel öffnen. 2. Die Holzplatte auf der Metallplatte platzieren. 3. Die Holzplatte mit den vier Schrauben befestigen. 4. Den Deckel schließen. |
Explanation: In the negative example, the relevant action steps are described one after another without any explicit structure. This gives the impression that the paragraph contains descriptive text rather than action steps. The positive example has a numbered list which provides a clear structure that helps the reader and the translator. It indicates that the product user is supposed to perform four actions in the given order. | |
Direct address in instructions
Recommendation: Actors should be directly addressed in instructions.
Where the actor in an instruction is important, the actor should be directly addressed.
EXAMPLE
Source language content | Target language content [de] |
(–) The user must close the cover. | (–) Die Benutzerin/der Benutzer muss den Deckel schließen. |
(+) Close the cover. | (+) Schließen Sie den Deckel. |
Explanation: In the negative example, “The user” seems to direct readers to a third person. So, readers of the source or target language content will likely not identify themselves as the persons that are addressed. In the positive example, the imperative form clearly informs readers that they are supposed to perform the relevant action. | |
Passive voice in instructions
Recommendation: Passive voice should be avoided in instructions.
In passive constructions, information about who is doing what or who is the addressee of an instruction is omitted. These constructions can make the instruction more difficult to understand and to translate correctly.
EXAMPLE
Source language content | Target language content [it] |
(–) The button must be pressed to turn off the lights. | (–) Per spegnere le luci va premuto il pulsante. |
(+) Press the button to turn off the lights. | (+) Premere il pulsante per spegnere le luci. |
Explanation: In the negative example, it is not clear who performs the action. The focus here is on the relationship between the verb “must be pressed” and the object “the button”. Instead, the positive example is more direct and precise: it addresses readers by explaining to them what to do. In this respect, the active voice identifies the action and the actor who is acting. It also requires fewer words and thus makes the sentence easier to read. | |
4.4.6 Gender-sensitive language
Use of gender-sensitive language in specialized content can be achieved in various ways.
The means chosen for gender-sensitive language can have various, specific meanings and at the same time be used for other linguistic applications or in other contexts. Hence, the chosen means should be used consistently.
Prior to commissioning a translation in which gender-sensitive language will be used (whether or not gender-sensitive language has been used in the source text), the following issues should be considered:
— consider that, in other languages and cultures, there will be differences in the use and feasibility of gender-sensitive language;
— communicate whether gender-sensitive language has been used in the source text and by what linguistic means;
— indicate in the translation project specifications whether gender-sensitive language should also be used in the target language(s). Gender-sensitive guidelines can be agreed on with the translation service provider;
— verify whether terminology recognition requires that terms in the termbase will have to be adapted to the linguistic means used in the gender-sensitive text;
— verify whether existing segments in the TM will have to be revised to achieve compliance with the linguistic means used in the gender-sensitive text;
— where machine translation systems are used, consider that the linguistic means chosen to achieve gender-sensitive language may not be correctly recognised and translated.
4.5 Presentation of content
4.5.1 Culture
Country-specific and culture-specific elements should be avoided to the greatest extent possible in translation-oriented text production. Rather, content should be produced and presented in such a way that it can be understood internationally.
Images, illustrations and symbols
Recommendations:
— Culturally neutral elements should be used.
— Elements with a linguistic reference should be avoided.
Non-linguistic elements such as images, illustrations and symbols can have positive or negative connotations depending on the language environment and culture. Hence, culturally neutral elements should always be used in texts, i.e. elements that will be acceptable in all or most cultures.
EXAMPLE 1
Source language content | Target language content |
(1) approval/selection (2) rejection/deselection | |
(1) approval (2) gesture for the number 1 (3) pejorative meaning | |
Explanation: In some countries, it is common to use an “x” as a symbol for ticking an option, while in other countries, the “x” is interpreted rather as a rejection of an option and the “checkmark” is used instead. The “thumbs up” symbol, which is also used in many online meetings as a sign of approval, enjoys less positive interpretations in some countries. | |
The clothing of characters, the environment, objects, gender relations and implicit messaging in an image element will all play a role in how culturally neutral an image, an illustration or a symbol is. Hence, metaphorical elements (e.g. images of currency symbols) or gestural illustrations and symbols should be avoided.
EXAMPLE 2
Source language content | Target language content [de] | |
Explanation: Example 2 uses images taken from graphical user interfaces (GUIs) to illustrate three phenomena that should be avoided when developing content intended for international use: | |
— First, there is a linguistic element (“Home”) within the button that is very difficult to localize and replace with the German translation. The whole image will have to be replaced or recreated.
— Second, the symbol of a house (for “Home”) was used for illustrative purposes and to render the purpose of the button readily recognizable, but this symbol fails to make any sense in the German version, where the button is termed “Startseite” (starting page).
— Third, the symbol depicts a typical house in European/North American countries; in other countries, houses look quite different.
EXAMPLE 3
Source language content | Target language content [de] |
Explanation: Example 3 shows a similar problem: the symbol corresponds to the English-language designation of the button (clipboard), but not to the German-language designation “Zwischenablage” (literally: “interim storage or archive”). | |
Units of measurement and statements of quantities
Recommendations:
— Units of measurement should always be used consistently.
— Culture-specific statements of quantity should be avoided.
Units of measurement and statements of quantities are not identical in all countries and cultures. Hence, translation-oriented writing requires that due care be applied to these units.
Quantities should be given in international units of measurement. Where possible, the international units of measurement standardized in ISO 80000 (all parts) should be used. In cases where avoiding culture-specific statements of quantity and using international units of measurement is impossible or unwanted, the translation project specifications should contain an agreement on whether the units shall be replaced or supplemented if the source-language units are uncommon in the target language.
EXAMPLE
Source language content | Target language content [zh] |
(–) 1 bottle of milk | (–) 1瓶牛奶 |
(+) 1 litre of milk | (+) 1升牛奶 |
Explanation: In some countries, milk is not sold in bottles of uniform size or not sold in bottles. | |
Dates
Recommendation: Short format dates should be avoided.
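Short date formats differ between languages and locales, so the same sequence of digits can denote different dates. This ambiguity can lead to misunderstandings and to errors when CAT tools and MT are used. Short format dates should be avoided and unambiguous formats should be used.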
EXAMPLE
Source language content | Target language content [de] |
(–) 06/07/2022 | (–) 06. Juli 2022 |
(+) 07 June 2022 | (+) 07. Juni 2022 |
Explanation: A date written in the short format MM/DD/YYYY in English would properly be written as DD.MM.YYYY in German, for instance. English short format dates can therefore be mistranslated, as in the negative example, where the day and the month have been swapped. For details on dates and time, see ISO 8601. | |
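The ambiguity of the short format in the example above can be illustrated with a minimal Python sketch. It only assumes the two format conventions named in the explanation; the variable names are illustrative and not part of this document.

```python
from datetime import datetime

# The short format date from the negative example.
short_date = "06/07/2022"

# The same string parses without error under both conventions,
# so neither a tool nor a translator can tell which one was intended.
as_month_first = datetime.strptime(short_date, "%m/%d/%Y").date()
as_day_first = datetime.strptime(short_date, "%d/%m/%Y").date()

# ISO 8601 calendar dates (YYYY-MM-DD) make the difference visible.
print(as_month_first.isoformat())  # 2022-06-07
print(as_day_first.isoformat())    # 2022-07-06
```

Both readings are valid dates, which is why the error in the negative example would not be caught by a simple plausibility check.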
References to culture-specific entities
Recommendation: References to culture-specific entities that are related to the source language should be avoided.
Source texts can contain culture-specific entities such as examples, circumstances or facts in relation to country-specific foods, sports, holidays, laws, standards, directives, contact information and institutions which are unknown or irrelevant in the target culture.
Culture-specific entities should be referred to only when this is necessary for expressing the intended message. In such cases, the translation project specifications should indicate how the references are to be treated in the target language content.
EXAMPLE 1
Source language content | Target language content [de] |
(–) This proposal is not worth a penny. | (–) Dieser Vorschlag ist keinen Pfennig wert. |
(+) This proposal is not worth anything. | (+) Dieser Vorschlag ist nichts wert. |
Explanation: In the negative example, the culture-specific currency unit “penny” is mentioned. However, the message of the sentence does not need such a reference. Also, such a reference could prompt the translator to use a culture-specific equivalent in the target language. Again, such reference is not needed there, either. In the positive example, the same message is expressed without referring to any culture-specific entity, both in the source language content and in the target language content. | |
EXAMPLE 2
Source language content | Target language content [de] |
(–) The Fed and Te Pūtea Matua are planning to lower interest rates. | (–) Die Fed und Te Pūtea Matua planen Zinssenkungen. |
(+) The central banks of the United States and of New Zealand are planning to lower interest rates. | (+) Die Zentralbanken der Vereinigten Staaten und Neuseelands planen Zinssenkungen. |
Explanation: In the negative example, the culture-specific proper names of two central banks are mentioned. However, the message of the sentence does not need such a reference. Also, such a reference could prompt the translator to use unmodified culture-specific proper names as they are in the source language. Again, such reference is not needed there, either. In the positive example, the same message is expressed without using culture-specific proper names, both in the source language content and in the target language content. | |
Language identifiers
Recommendation: For indicating languages, standardized language identifiers according to ISO 639 should be used.
When content is delivered in more than one language, recognizing the current language is paramount for the target audience. Indicating languages clearly avoids misunderstandings.
EXAMPLE
Source language content | Target language content [de] |
(–) [GB] | (–) [AT] |
(+) [en] | (+) [de] |
Explanation: In the first negative example, country flags are used to represent languages. In the second negative example, country codes in uppercase according to ISO 3166 are used to represent languages. However, country and language are separate categories that clearly differ from one another. In addition, as some countries have more than one official language, it is not possible to equate the country code with the language. In the positive example, language identifiers in lowercase according to ISO 639 are used. They represent languages correctly and neutrally. | |
Country codes
Recommendation: For indicating countries, standardized country codes according to ISO 3166 should be used.
When content is delivered to more than one country, recognizing the current country is paramount for the target audience. Indicating countries clearly avoids misunderstandings.
EXAMPLE
Source language content | Target language content [fr, nl] |
(–) [fr], [nl] | (–) [fr], [nl] |
(+) [LU], [NL] | (+) [LU], [NL] |
Explanation: In the first negative example, country flags are used to represent countries. In the second negative example, language identifiers in lowercase according to ISO 639 are used to represent countries. However, some country flags are very similar to each other. In addition, language and country are separate categories that differ from each other. As some languages are the official language of more than one country, it is not possible to equate the language identifier with the country. In the positive example, country codes in uppercase according to ISO 3166 are used. They represent countries correctly and neutrally. | |
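The following Python sketch illustrates why language identifiers and country codes cannot be equated. The mapping is a small, simplified subset chosen for illustration only (it is not a complete register of ISO 639 or ISO 3166), and the function names are hypothetical.

```python
# Simplified, illustrative subset: country code -> official languages.
OFFICIAL_LANGUAGES = {
    "AT": ["de"],
    "BE": ["nl", "fr", "de"],
    "LU": ["lb", "fr", "de"],
    "NL": ["nl"],
}


def countries_using(language: str) -> list[str]:
    """Return the country codes in the subset that list the given language."""
    return sorted(
        country
        for country, languages in OFFICIAL_LANGUAGES.items()
        if language in languages
    )


def is_language_identifier(code: str) -> bool:
    """Two-letter lowercase form, as used for the language identifiers above."""
    return len(code) == 2 and code.isalpha() and code.islower()


def is_country_code(code: str) -> bool:
    """Two-letter uppercase form, as used for the country codes above."""
    return len(code) == 2 and code.isalpha() and code.isupper()


print(countries_using("fr"))     # ['BE', 'LU']
print(OFFICIAL_LANGUAGES["LU"])  # ['lb', 'fr', 'de']
```

Even in this small subset, one language maps to several countries and one country to several languages, so neither code can stand in for the other. The two format checks only test the letter case convention; they do not validate a code against either standard.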
4.5.2 Logic
Recommendation: Source language content should be written in such a way as to ensure that its logical structure can be recognised and understood both clearly and fully.
The sequence of the content, actions or internal logic is a basic prerequisite for reader-oriented text production, because especially those involved in the translation process rely heavily on comprehensibility and logical consistency. Poorly structured source texts are sometimes not recognised until they pose problems during translation, which results in additional time and effort to correct.
EXAMPLE
Source language content | Target language content [de] |
(–) Before the software can start, the settings must be adjusted. | (–) Bevor das Programm starten kann, müssen die Einstellungen angepasst worden sein. |
(+) The settings must be adjusted before the software can start. | (+) Passen Sie die Einstellungen an, bevor Sie das Programm starten. |
Explanation: In the negative example, the sentence begins with what happens second in the sequence of action and logic. The positive example is much easier to understand, because the content follows the logical sequence. | |
5 Recommendations for the handover from text production to translation
5.1 General
In addition to the recommendations governing formatting, terminology, grammar, syntax, semantics and cultural aspects, other elements will also impact whether specialized content is suitable for translation.
5.2 File formats
Recommendations:
— Source texts should be provided in an editable file format.
— All embedded objects should be provided in their original file format.
— Files should not be password-protected.
— Source texts should not be translated in stand-alone text editors or content management systems. They should be processed using CAT tools.
Source language content submitted exclusively in PDF format or in the form of graphic files (e.g. *.jpeg, *.tiff) does not constitute appropriate translation-oriented content; conversion and associated post-processing will entail additional time and effort.
TM systems require the conversion of non-editable or uncommon native editable formats into editable files. If this step is necessary, translation service providers shall consult with clients during the pre-translation phase in order to agree upon the additional costs involved. However, translation without a TM system should be avoided, because it will be impossible to integrate TMs and termbases into the translation process. The same concerns also apply to embedded objects in an otherwise editable file format.
Furthermore, some translated texts must be edited or laid out after translation in the source format or in the original software. Copying and pasting text from a file that the TM system cannot process into an intermediate file (for instance, a word-processing format) and then back again is very error-prone. Here, the client and the translation service provider should agree during the pre-translation phase on a suitable exchange format or on conversion options for translation when using TM systems.
If the file to be translated is password-protected, it is generally impossible to create a target file and the source text cannot be overwritten with the target text.
EXAMPLE 1
Source language format | Target language format |
(–) PDF file | (–) recreation of content in text editor |
(+) editable native file format or exchange format | (+) target language content automatically replaced in editable native file format or exchange format |
Explanation: If the client provides the translation service provider with a PDF exported from layout software instead of an exchange file format, the translation service provider will have to recreate the file in a text editor in order to translate it using CAT tools. The target language content will then be delivered in a file format that the client cannot continue to work with, and the client may even have to copy and paste the content into the source format. | |
EXAMPLE 2
Source language format | Target language format |
(–) file from text editor with texts intended for translation in an embedded, non-editable image | (–) transcript of the text, text boxes or legends below the images, complete recreation of the images with text including target language content |
(+) legend below the non-editable graphic, editable text boxes, or graphic in its native, editable format | (+) target language content automatically replaced in legend below the non-editable graphic, editable text boxes, or graphic in its native, editable format |
Explanation: If the client provides the translation service provider with an editable file that contains text in non-editable images, the translation service provider will have to transcribe the text, create text boxes or create legends below the images in order to be able to localize the texts within the images as well. | |
EXAMPLE 3
Source language format | Target language format |
(–) Spreadsheet in the presentation software cannot be edited, because embedded or linked spreadsheet file is unavailable. | (–) transcript of the spreadsheet, text boxes or legends below the images, complete recreation of the spreadsheet with text including target language content |
(+) The embedded spreadsheet file is provided and can be edited. | (+) target language content automatically replaced in embedded spreadsheet |
Explanation: If the client provides the translation service provider with the spreadsheet or other embedded files, the translation service provider will not have to transcribe the text, create text boxes or create legends below the embedded files in order to be able to localize the texts within the embedded files as well. | |
EXAMPLE 4
Source language format | Target language format |
(–) link to a website intended for translation or texts copied from the website into a word-processing format | (–) word processing format |
(+) export of website content from the content management system to *.xml, *.html, *.xliff, etc. | (+) target language content automatically replaced in *.xml, *.html, *.xliff etc. |
Explanation: If the client wants the translation to be performed in the content management system, it will be difficult for the translation service provider to provide a quote, TM matches cannot be reused and software-based in-process quality checks including spell checks or terminology checks are often impossible. If the client provides the translation service provider with website content in word processing formats, the target language content cannot be re-imported into the content management system after translation. Most content management systems can export and import content. This function should be used for website translations as well, because it saves the client a lot of time and copying and pasting text is very error-prone. | |
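A simple pre-handover check on file formats can be sketched as follows. This is a minimal illustration in Python, assuming that files can be triaged by extension alone; the extension sets only contain the formats named in this subclause, and the function name and category labels are hypothetical.

```python
from pathlib import Path

# Formats named in this subclause as unsuitable when provided on their own.
NON_EDITABLE = {".pdf", ".jpeg", ".tiff"}
# Exchange formats named in EXAMPLE 4.
EXCHANGE = {".xml", ".html", ".xliff"}


def classify(filename: str) -> str:
    """Classify a file by its extension (case-insensitive)."""
    suffix = Path(filename).suffix.lower()
    if suffix in NON_EDITABLE:
        return "non-editable"
    if suffix in EXCHANGE:
        return "exchange format"
    # Native editable formats depend on the tools agreed with the
    # translation service provider, so they are not guessed here.
    return "to be agreed"


for name in ["manual.PDF", "site_export.xliff", "brochure.indd"]:
    print(name, "->", classify(name))
```

A check of this kind cannot detect password protection or text embedded in images; those still require inspection of the file itself.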
5.3 Layout
Recommendations:
— During the translation process, sufficient space should be reserved for laying out the translated text and images of the source text to allow for text expansion.
— A PDF file displaying the full layout and all graphics should be provided.
Content in English, for example, will generally be shorter than the same content in Spanish or French.
Space reserves will be of particular importance where a true-to-page translation has been agreed.
In cases where the layout of the text and images in the source text cannot be displayed visually in the editable source format, and the translation service provider cannot see the full layout, a file in PDF format should be provided for reference purposes in addition to the file in editable format.
If an exchange format is provided for translation, a PDF file illustrating the source text layout and all images shall also be provided for reference purposes. This is especially important in cases where the translation project specifications include the final, correct layout.
EXAMPLE
Source language format | Target language format |
(–) text fragments from the instruction manual taken from the content management systems in exchange format (e.g. *.xml) | text fragments from the instruction manual taken from the content management systems in exchange format (e.g. *.xml) |
(+) text fragments from the instruction manual taken from the content management systems in exchange format (e.g. *.xml) along with source language PDF of the full instruction manual including images | |
Explanation: If the client provides the translation service provider with a PDF of the source file to be translated, the translator has access to context information, images and layout information (is the text to be translated a header or an instruction?) etc. while translating the text fragments contained in the *.xml file. This will help to avoid translation errors. | |
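Whether the space reserved for text expansion is likely to be sufficient can be estimated before handover. The Python sketch below is a rough illustration only: the expansion factor is an assumption that has to be supplied by the user for the language pair and text type in question, and the value used in the usage lines is a placeholder, not a figure taken from this document.

```python
import math


def required_length(source_text: str, expansion_factor: float) -> int:
    """Estimate the target text length in characters, rounded up."""
    return math.ceil(len(source_text) * expansion_factor)


def fits(source_text: str, max_chars: int, expansion_factor: float) -> bool:
    """Check whether the estimated target text fits into the reserved space."""
    return required_length(source_text, expansion_factor) <= max_chars


# Placeholder factor for illustration; real values vary by language pair.
label = "Close the cover."
print(required_length(label, 1.25))  # 20
print(fits(label, 18, 1.25))         # False
```

Character counts are only a proxy for the space needed on the page; fonts, line breaks and hyphenation in the target language also affect the final layout.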
5.4 Contextual information, reference material and terminology aids
Recommendations:
— Relevant contextual information and referenced content should be provided together with the source text.
— Pre-existing source and target language content should be provided.
— Where possible, reference material should be provided in the source and target languages.
— The amount of reference material provided and utilized should be reasonably proportionate to the scope of the translation project.
— Terminology and contextual information should always be provided in separate files or documents and should not be incorporated into the source language content.
In many cases, the translation service provider is supplied not only with the source text, but also with additional material (e.g. extracts from termbases, translation memories) and information (e.g. translation guidelines). Additional material and information can be especially relevant in cases in which the translation service provider is tasked with fulfilling or adhering to specifications that cannot be derived from the source text.
In situations where the context of the source language content does not emerge sufficiently from the source text itself, appropriate contextual information should be provided by the client, together with the source text. The required contextual information can relate both to the content itself and to the visual presentation of the source language content; this information can also be conveyed through briefings or product training.
In situations where the source language content is part of more extensive content, parts of which have already been translated into the target language, the pre-existing source and target language content should be provided by the client to the translation service provider as reference material. Existing source and target language content can also be provided as a TM.
In situations where the source language content contains references to other content, the client should provide such other content to the translation service provider as reference material, if the translation service provider requests it.
Terminology and contextual information should always be provided in separate files or documents and should not be incorporated into the source language content – for instance, in the form of notes placed in parentheses or brackets.
EXAMPLE
Source language format | Target language format |
(–) Notes for the translators are incorporated in the source language content, target language terminology is marked in the text in brackets or comments for the translators to use. | (–) Target language content still contains all notes and comments (hidden or ignored during translation). |
(+) Notes for the translators are provided in the order e-mail or as separate files or documents. (+) Terminology should be provided as glossaries in separate files. | (+) Target language content contains no notes and comments. |
Explanation: If the client provides the translation service provider with source language content that contains remarks, terminology suggestions or instructions for the translator in the form of highlighted text, notes placed in parentheses or brackets within the text or comments, the translation service provider will have to hide or delete those notes in order to accurately count the number of words, lines, etc. to be translated. Also, some of the notes can be mistaken for text to be translated, and will have to be removed before the final translation can be produced based on the processed file. | |
5.5 Locales, document templates and styles
Recommendation: Specifications for the target language locales should be provided.
To the extent possible, the client and the translation service provider should reach an agreement on which locales shall be used for the target language content and which document templates and styles shall be adapted for the target language.
5.6 Lists, indices and glossaries
Recommendation: Specifications for entries in lists, indices or glossaries should be provided.
In situations where the source text contains lists, such as tables of contents or lists of figures, indices or glossaries, the client and the translation service provider should agree on clear specifications for the creation of corresponding entries in lists, indices or glossaries.
In these specifications, due consideration should be given not only to the needs and expectations of the addressees of the target text, but also to target-language conventions.
EXAMPLE
Source language format | Target language format |
(–) Source language content contains an index, but no instructions regarding the localisation of index entries. | (–) Index entries in target language content are inconsistent, some begin with capital letter, some with a lower-case letter, some begin with the noun, some with the adjectives etc. |
(+) Source language content contains index, target language index entries are to be translated beginning with a capital letter, the noun is to come first, adjectives are to follow after a comma. | (+) Index entries in target language content are consistent, all start with a capital letter, and the noun comes first. |
Explanation: If the client follows certain rules when creating, for example, an index in the target language, the same index will likely not fulfil its intended purpose if the translators do not follow the same or any rules, especially if the target language index consists of translations from several projects over time. | |
5.7 Reviewing source language content prior to providing it to the translation service provider
Recommendation: Source language content should be reviewed.
Prior to providing the source language content to the translation service provider, a careful review should be undertaken regarding factual accuracy and comprehensibility as well as correctness of terminology used, sentence structure, spelling and punctuation.
6 Translation-oriented texts: Evaluation
6.1 General
Most evaluation processes are based on pre-defined criteria, compliance with which is reviewed and used as the basis for subsequent evaluation. These criteria are frequently divided into categories in order, on the one hand, to make the evaluation results as transparent and informative as possible and, on the other, to provide indications for targeted optimisation. Categorising the criteria can help to bring specific problem areas into focus. Weighting the criteria, when combined with categorisation, can provide information about the quality of a text.
For target language content, a number of criteria and methods exist that can be used to assess the quality of translation output, such as ISO 5060, the SAE J2450 Translation Quality Metric, the Dynamic Quality Framework (DQF) by TAUS and Multidimensional Quality Metrics (MQM)[1]. Yet, to date, no evaluation models exist to assess the quality of source texts in view of translatability.
An evaluation broken down into and summarised as categories can also be used to make a substantiated determination about the extent to which a source text constitutes a translation-oriented source text. Through the combination of cluster evaluation and the weighting of individual categories or subcategories, an evaluation can be carried out that does justice to the specific application case. For example, the intended type of translation (human translation without CAT tool, translation in integrated translation environments (CAT tools) or standalone machine translation), the properties of the tools to be used, the text type (long, descriptive texts, parts lists, marketing texts) and other parameters can be taken into account and adequately evaluated.
6.2 Criteria
The evaluation of any source text should check for compliance with recommendations specified in Clause 4. To this end, categories, with subcategories, if necessary, can be defined. For the present document, this specifically means that the following criteria should be incorporated into the evaluation:
— Does the source text contain formatting that would hinder or interfere with the translation?
    — Text flow and segmentation
    — Graphic layout
    — Text content and references
— Has the terminology been used correctly and consistently in the source text?
    — Use of synonyms
    — Use of abbreviations
    — Occurrence of orthographic variants
    — Occurrence of ambiguities
— Are the linguistic elements used in the source text comprehensible and have they been used uniformly?
    — Sentence structure
    — Word choice and word formation
    — Unambiguous references
    — Style and reader engagement
— Is the content in the source text presented in an internationally understandable and logically comprehensible way?
    — Country-specific and culture-specific elements
    — Sequence of the content, actions or internal logic
Superordinate or subordinate criteria can be weighted differently, depending on the application scenario, in order to influence the overall evaluation result.
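The weighting described above can be illustrated with a minimal sketch. It is not part of the recommendations. The `CategoryResult` structure, the compliance formula (share of passed checks) and the example weights are assumptions chosen for illustration only; an actual evaluation model can define them differently.

```python
from dataclasses import dataclass


@dataclass
class CategoryResult:
    """Findings for one evaluation category (hypothetical structure)."""
    name: str
    checks: int       # number of checks performed in this category
    violations: int   # number of checks that failed
    weight: float     # relative importance in the application scenario


def weighted_score(results):
    """Return a score in [0, 1]; 1.0 means no violations in any weighted category."""
    total_weight = sum(r.weight for r in results)
    if total_weight == 0:
        raise ValueError("at least one category needs a non-zero weight")
    score = 0.0
    for r in results:
        # Assumed compliance measure: share of checks that passed.
        compliance = 1.0 if r.checks == 0 else 1.0 - r.violations / r.checks
        score += r.weight * compliance
    return score / total_weight


# Example weights for a scenario in which formatting issues matter most,
# e.g. translation in a CAT tool (weights are illustrative assumptions).
results = [
    CategoryResult("Formatting", checks=10, violations=5, weight=3.0),
    CategoryResult("Terminology", checks=10, violations=0, weight=1.0),
]
print(weighted_score(results))  # 0.625
```

Raising the weight of a category makes its violations count more heavily in the overall result, which is how one set of findings can be evaluated differently for different application scenarios.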
6.1.2 Evaluation methods
There are two ways of evaluating whether texts have been written in a translation-oriented manner:
— manually (e.g. using the lists in Annex A and Annex B);
— tool-based (e.g. using a language checker).
Source language content can be evaluated by language checkers with regard to the criteria listed in Table A.1. Language checkers can be based on linguistic methods, statistical methods or large language models (LLMs)/AI. Language checkers based on linguistic methods, in particular, can consistently check for criteria such as grammatical and stylistic issues, ambiguities at word and sentence level, terminological variants, and unclear relations between words or parts of sentences.
When language checkers are used, a human editor should be in charge of the final decision on how to adapt the source language content.
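As a rough illustration of tool-based evaluation, the following sketch flags a few of the formatting and wording criteria with regular expressions. It is not a description of any existing language checker. The rule labels, the patterns and the function name `check_text` are assumptions, and the patterns are deliberately simple, so false positives (for example, version numbers reported as short format dates) are to be expected. The findings are only reported, leaving the decision on how to adapt the text to a human editor.

```python
import re

# Hypothetical rule set: criterion label -> regular expression.
RULES = {
    "several blank spaces in a row": re.compile(r"\S {2,}\S"),
    "tab within a line": re.compile(r"\S\t+\S"),
    "plural ending in parentheses": re.compile(r"\w\((?:s|es)\)"),
    "short format date": re.compile(r"\b\d{1,2}[./]\d{1,2}[./]\d{2,4}\b"),
}


def check_text(text):
    """Return a list of (line number, criterion label) findings."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for label, pattern in RULES.items():
            if pattern.search(line):
                findings.append((line_no, label))
    return findings


sample = "Remove the screw(s)  before 03/04/2025.\nClose the cover."
for line_no, label in check_text(sample):
    print(f"line {line_no}: {label}")
```

For the sample text, the first line is reported three times (double space, plural ending in parentheses, short format date) and the second line is not reported.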
Annex A
(informative)
List of evaluation criteria
The list in Table A.1 itemises the criteria in 6.2 that are based on the recommendations in Clause 4 (Recommendations for specialized content intended for translation).
The table indicates whether a criterion causes issues for the different translation processes:
— Issue for HT without CAT tool: issue for translation process performed by humans without CAT tools
— Issue for CAT: issue for translation process in integrated translation environments (CAT tools), mainly segmentation and term recognition
— Issue for standalone MT: issue for translation process involving standalone machine translation
The columns indicate the effect of the criterion on the translatability of the source language content, depending on the process and technology employed:
— NO (white): The criterion does not pose any issues for this translation process
— YES (dark grey): The criterion poses an issue for this translation process
— MAYBE (light grey): In some cases, the criterion can pose an issue for this translation process
Table A.1 — List of evaluation criteria
Criterion | Issue for HT without CAT tool | Issue for CAT | Issue for standalone MT |
|---|---|---|---|
FORMATTING (4.2) | |||
FORMATTING IN TERMS OF TEXT FLOW AND SEGMENTATION (4.2.2) | |||
The source language content contains hard breaks within sentences | NO | YES | YES |
The source language content contains soft breaks in the middle of sentences or at the end of a sentence | NO | YES | MAYBE |
The source language content contains manual hyphenation | NO | YES | MAYBE |
The source language content begins before and continues after a list | NO | YES | YES |
The source language content contains tabs within sentences | NO | YES | MAYBE |
The source language content contains several blank spaces in a row within sentences | NO | YES | MAYBE |
The source language content contains abbreviations using full stops | NO | YES | MAYBE |
FORMATTING GRAPHIC LAYOUT OF TEXTS (4.2.3) | |||
Symbols have been inserted over several spaces or tabs | NO | YES | YES |
Content has been indented using several spaces or tabs | NO | MAYBE | MAYBE |
FORMATTING TEXT CONTENT AND REFERENCES (4.2.4) | |||
The source language content contains passages in several languages | NO | YES | YES |
Cross-references have been inserted manually | YES | YES | YES |
TERMINOLOGY (4.3) | |||
The source language content contains synonyms | YES | YES | YES |
The source language content contains uncommon or unknown abbreviations | YES | YES | YES |
The source language content contains orthographic variants | NO | YES | YES |
The source language content contains homonyms or polysemes or other ambiguities | YES | YES | YES |
Relationships in compound terms and constructions are ambiguous and not transparent | YES | YES | YES |
GRAMMAR, SYNTAX AND STYLE (4.4) | |||
SENTENCE STRUCTURE (4.4.2) | |||
Sentence length exceeds appropriate threshold | YES | YES | YES |
Main clauses are interrupted by subordinate clauses | YES | YES | YES |
The verb is located too far away from the subject | MAYBE | YES | YES |
Sentence fragments without verbs have been used | MAYBE | YES | YES |
WORD CHOICE AND WORD FORMATION (4.4.3) | |||
Abbreviations that are ambiguous or that have not been nationally or internationally standardized have been used | YES | YES | YES |
Non-specific terms have been used | YES | YES | YES |
Regionalisms have been used | YES | YES | YES |
Plural endings or variants in parentheses have been used | NO | YES | YES |
Non-specific verbs have been used | YES | YES | YES |
Light verb constructions have been used | NO | YES | YES |
Unnecessary modal verbs have been used | YES | YES | YES |
UNAMBIGUOUS REFERENCES (4.4.4) | |||
Pronouns with uncertain reference have been used | YES | YES | YES |
Function words have been omitted | YES | YES | YES |
Ambiguous genitive constructions have been used | YES | YES | YES |
Ambiguous syntactic structures have been used | YES | YES | YES |
STYLE AND READER ENGAGEMENT (4.4.5) | |||
A single sentence contains multiple requests for action | YES | YES | YES |
Several action steps are not enumerated | YES | YES | YES |
The actor in instructions is not directly addressed | YES | YES | YES |
Passive voice has been used in instructions | NO | YES | YES |
GENDER-SENSITIVE LANGUAGE (4.4.6) | |||
Gender-sensitive language has been used | YES | YES | YES |
PRESENTATION OF CONTENT (4.5) | |||
CULTURE (4.5.1) | |||
The source language content contains culture-specific elements or elements with a linguistic reference | YES | MAYBE | MAYBE |
The source language content contains culture-specific units of measurement or statements of quantities | YES | YES | YES |
The source language content contains short format dates | YES | YES | YES |
The source language content contains references to culture-specific entities | YES | YES | YES |
The source language content contains country codes instead of language identifiers | YES | YES | YES |
The source language content contains language identifiers instead of country codes | YES | YES | YES |
LOGIC (4.5.2) | |||
The source text contains factually illogical statements or illogical statements of action | YES | YES | YES |
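The NO/MAYBE/YES values in Table A.1 can also be held as data, so that an evaluation covers only the criteria that matter for the intended translation process. The following sketch is one possible encoding and is not prescribed by this document. The four rows are copied from Table A.1, while the names `Impact`, `CRITERIA`, `PROCESSES` and `relevant_criteria` and the short process keys are assumptions.

```python
from enum import Enum


class Impact(Enum):
    NO = "NO"
    MAYBE = "MAYBE"
    YES = "YES"


# Column order as in Table A.1: HT without CAT tool, CAT, standalone MT.
PROCESSES = ("HT", "CAT", "MT")

# A few rows from Table A.1 (shortened labels).
CRITERIA = {
    "hard breaks within sentences": (Impact.NO, Impact.YES, Impact.YES),
    "manual hyphenation": (Impact.NO, Impact.YES, Impact.MAYBE),
    "synonyms": (Impact.YES, Impact.YES, Impact.YES),
    "content indented using several spaces or tabs": (Impact.NO, Impact.MAYBE, Impact.MAYBE),
}


def relevant_criteria(process, include_maybe=True):
    """Return the criteria that can pose an issue for the given process."""
    idx = PROCESSES.index(process)
    wanted = {Impact.YES, Impact.MAYBE} if include_maybe else {Impact.YES}
    return [name for name, impacts in CRITERIA.items() if impacts[idx] in wanted]


print(relevant_criteria("HT"))                       # ['synonyms']
print(relevant_criteria("MT", include_maybe=False))  # ['hard breaks within sentences', 'synonyms']
```

Whether MAYBE entries are included is a design choice: including them gives a stricter check, and excluding them restricts the check to criteria that always pose an issue for the process.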
Annex B
(informative)
Checklist of recommendations
Authors and editors can use the checklist of all recommendations given in Table B.1 as a reminder when developing or checking source language content intended for translation and refer to the cross-referenced clauses for more information and examples.
Table B.1 — Checklist of recommendations
Topic | Recommendations | Check | See |
|---|---|---|---|
Formatting in terms of text flow and segmentation | Hard breaks should not be manually inserted within sentences. | ☐ | 4.2.2.1 |
Formatting in terms of text flow and segmentation | Soft breaks should be avoided in the middle of a sentence if they serve no purpose other than for layout. | ☐ | 4.2.2.2 |
Formatting in terms of text flow and segmentation | Soft breaks should be avoided at the end of a sentence. | ☐ | 4.2.2.2 |
Formatting in terms of text flow and segmentation | Manually inserting hyphens in order to influence layout should be avoided. Instead, the automatic hyphenation feature should be used. | ☐ | 4.2.2.3 |
Formatting in terms of text flow and segmentation | Non-breaking hyphens should be used instead of manual breaks in order to prevent an unwanted line break after a hyphen. | ☐ | 4.2.2.3 |
Formatting in terms of text flow and segmentation | Lists in the middle of a sentence should be avoided. | ☐ | 4.2.2.4 |
Formatting in terms of text flow and segmentation | Tabs should be avoided within sentences or lines of text when inserted for layout purposes. Instead, built-in text editor formatting features should be used. | ☐ | 4.2.2.5 |
Formatting in terms of text flow and segmentation | Spaces should not be inserted for layout purposes. Instead, built-in text editor formatting features should be used. | ☐ | 4.2.2.6 |
Formatting in terms of text flow and segmentation | Non-breaking spaces should be used instead of normal spaces in order to prevent unwanted line breaks after a space. | ☐ | 4.2.2.6 |
Formatting in terms of text flow and segmentation | Additional spaces should be avoided following full stops (within acronyms and initialisms or dates). | ☐ | 4.2.2.6 |
Formatting in terms of text flow and segmentation | Abbreviations using full stops should be avoided. | ☐ | 4.2.2 |
Formatting graphic layout of texts | Symbols should be inserted in line with the text; they should not hover over several spaces or tabs. | ☐ | 4.2.3.2 |
Formatting graphic layout of texts | Text should not be indented using tabs or spaces. Instead, built-in text editor features should be used. | ☐ | 4.2.3.3 |
Formatting graphic layout of texts | Fixed row heights should be avoided. | ☐ | 4.2.3.4 |
Formatting graphic layout of texts | Text boxes should be configured to resize automatically to adjust to the length of the text. | ☐ | 4.2.3.5 |
Formatting graphic layout of texts | International character sets should be used that support all target languages. | ☐ | 4.2.3.6 |
Formatting graphic layout of texts | The operating system should be set to accommodate the appropriate font to display the character set. | ☐ | 4.2.3.6 |
Formatting text content and references | Multilingual source texts should be avoided. | ☐ | 4.2.4.1 |
Formatting text content and references | Cross-references should not be inserted manually. Instead, the built-in text editor formatting features should be used. | ☐ | 4.2.4.2 |
Terminology | Synonyms should be avoided. | ☐ | 4.3.2 |
Terminology | Uncommon or unknown abbreviations should be avoided. | ☐ | 4.3.3 |
Terminology | Orthographic variants should be avoided. | ☐ | 4.3.4 |
Terminology | Potential ambiguities caused by polysemes or homonyms should be clarified by indicating the relevant domain, describing the underlying concept or providing an appropriate comment. | ☐ | 4.3.5 |
Terminology | The relationships in compound terms and constructions should be transparent and unambiguous. | ☐ | 4.3.6 |
Terminology | Termbases containing polysemes or homonyms should provide relevant contextual information to support translators in distinguishing which equivalent to use. | ☐ | 4.3.7 |
Grammar, syntax and style | Sentences should be as complete as necessary and as concise as possible. | ☐ | 4.4.2.1 |
Grammar, syntax and style | Main clauses should not be interrupted by subordinate clauses. | ☐ | 4.4.2.2 |
Grammar, syntax and style | The verb should be as close to the subject as possible. | ☐ | 4.4.2.3 |
Grammar, syntax and style | Sentence fragments without verbs should be avoided. | ☐ | 4.4.2.4 |
Grammar, syntax and style | Lists should always be introduced with a clear introductory sentence. They should never be positioned in the middle of a sentence. | ☐ | 4.4.2.5 |
Word choice and word formation | Abbreviations that are ambiguous or that have not been nationally or internationally standardized should be avoided. | ☐ | 4.4.3.1 |
Word choice and word formation | Specific terms should be used. | ☐ | 4.4.3.2 |
Word choice and word formation | Regionalisms should only be used when essential for conveying the intended message, and translation project specifications should outline how these references should be handled in the target language content. | ☐ | 4.4.3.3 |
Word choice and word formation | Plural endings or variants should not be placed in parentheses. | ☐ | 4.4.3.4 |
Word choice and word formation | Specific verbs should be used. | ☐ | 4.4.3.5 |
Word choice and word formation | Light verb constructions should be avoided. | ☐ | 4.4.3.6 |
Word choice and word formation | Modal verbs should only be used when necessary or prescribed by style guides. | ☐ | 4.4.3.7 |
Unambiguous references | Pronouns with uncertain reference should be avoided, especially across sentence boundaries. | ☐ | 4.4.4.1 |
Unambiguous references | Wordings should be grammatically complete. Function words should not be omitted. | ☐ | 4.4.4.2 |
Unambiguous references | Ambiguous genitive constructions should be avoided. | ☐ | 4.4.4.3 |
Unambiguous references | Ambiguous syntactic structures should be avoided. | ☐ | 4.4.4.4 |
Style and reader engagement | Any single sentence should contain only one request for action. | ☐ | 4.4.5.1 |
Style and reader engagement | Individual action steps should be enumerated. | ☐ | 4.4.5.2 |
Style and reader engagement | Actors should be directly addressed in instructions. | ☐ | 4.4.5.3 |
Style and reader engagement | Passive voice should be avoided in instructions. | ☐ | 4.4.5.4 |
Gender-sensitive language | Relevant information on the use of gender-sensitive language in the source language content and the specifications for the use of gender-sensitive language in the target language content should be provided. | ☐ | 4.4.6 |
Culture | Culturally neutral elements should be used. | ☐ | 4.5.1.1 |
Culture | Elements with a linguistic reference should be avoided. | ☐ | 4.5.1.1 |
Culture | Units of measurement should always be used consistently. | ☐ | 4.5.1.2 |
Culture | Culture-specific statements of quantity should be avoided. | ☐ | 4.5.1.2 |
Culture | Short format dates should be avoided. | ☐ | 4.5.1.3 |
Culture | References to culture-specific entities that are related to the source language should be avoided. | ☐ | 4.5.1.4 |
Culture | For indicating languages, standardized language identifiers according to ISO 639 should be used. | ☐ | 4.5.1.5 |
Culture | For indicating countries, standardized country codes according to ISO 3166 should be used. | ☐ | 4.5.1.6 |
Logic | Source language content should be written in such a way as to ensure that its logical structure can be recognised and understood both clearly and fully. | ☐ | 4.5.2 |
Handover from text production to translation | Source texts should be provided in an editable file format. | ☐ | 5.2 |
Handover from text production to translation | All embedded objects should be provided in their original file format. | ☐ | 5.2 |
Handover from text production to translation | Files should not be password-protected. | ☐ | 5.2 |
Handover from text production to translation | Source texts should not be translated in stand-alone text editors or content management systems. They should be processed using CAT tools. | ☐ | 5.2 |
Handover from text production to translation | During the translation process, sufficient space should be reserved for laying out the translated text and images of the source text to allow for text expansion. | ☐ | 5.3 |
Handover from text production to translation | A PDF file displaying the full layout and all graphics should be provided. | ☐ | 5.3 |
Handover from text production to translation | Relevant contextual information and referenced content should be provided together with the source text. | ☐ | 5.4 |
Handover from text production to translation | Pre-existing source and target language content should be provided. | ☐ | 5.4 |
Handover from text production to translation | Where possible, reference material should be provided in the source and target languages. | ☐ | 5.4 |
Handover from text production to translation | The amount of reference material provided and utilized should be reasonably proportionate to the scope of the translation project. | ☐ | 5.4 |
Handover from text production to translation | Terminology and contextual information should always be provided in separate files or documents and should not be incorporated into the source language content. | ☐ | 5.4 |
Handover from text production to translation | Specifications for the target language locale should be provided. | ☐ | 5.5 |
Handover from text production to translation | Specifications for entries in lists, indices or glossaries should be provided. | ☐ | 5.6 |
Handover from text production to translation | Source language content should be reviewed. | ☐ | 5.7 |
Bibliography
[1] ISO 704:2022, Terminology work — Principles and methods
[2] ISO/IEC/IEEE 82079‑1:2019, Preparation of information for use (instructions for use) of products — Part 1: Principles and general requirements (IEC/IEEE 82079‑1)
[3] ISO 17100:2015, Translation services — Requirements for translation services
[4] ISO 80000 (all parts), Quantities and units
[5] ISO 20539:2023, Translation, interpreting and related technology — Vocabulary
[6] ISO 5060:2024, Translation services — Evaluation of translation output — General guidance
[7] ISO 11669:2024, Translation projects — General guidance
[8] ISO 26162‑3:2023, Management of terminology resources — Terminology databases — Part 3: Content
[9] ISO 639:2023, Code for individual languages and language groups
[10] ISO 21720:2024, XLIFF (XML Localization Interchange File Format)
[11] ISO/IEC 10646:2020, Information technology — Universal coded character set (UCS)
[12] SAE J2450, Translation Quality Metric
[13] ISO 8601:2019 (all parts), Date and time
[14] ISO 24183:2024, Technical communication — Vocabulary
The examples given are provided for informational purposes only and do not imply that ISO endorses them.
