ISO/DIS 1951:2025(en)
ISO/TC 37/SC 2
Secretariat: SCC
Date: 2025-03-28
Presentation of lexicographic entries in general language dictionaries – Fundamentals and recommendations
© ISO 2025
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: +41 22 749 01 11
Email: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents
4 An overview of lexicographic components
5 Typographical conventions in printed and digital dictionaries
Annex A (informative) Structure of a lexicographic entry
Annex B (informative) Lexicographic symbols in printed and digital dictionaries
Annex C (informative) Dictionary examples applying LMF modelling mechanisms
Foreword
ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies (ISO member bodies). The work of preparing International Standards is normally carried out through ISO technical committees. Each member body interested in a subject for which a technical committee has been established has the right to be represented on that committee. International organizations, governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types of ISO documents should be noted. This document was drafted in accordance with the editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).
ISO draws attention to the possibility that the implementation of this document may involve the use of (a) patent(s). ISO takes no position concerning the evidence, validity or applicability of any claimed patent rights in respect thereof. As of the date of publication of this document, ISO had not received notice of (a) patent(s) which may be required to implement this document. However, implementers are cautioned that this may not represent the latest information, which may be obtained from the patent database available at www.iso.org/patents. ISO shall not be held responsible for identifying any or all such patent rights.
Any trade name used in this document is information given for the convenience of users and does not constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions related to conformity assessment, as well as information about ISO's adherence to the World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT), see www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 37, Language and terminology, Subcommittee SC 2, Terminology workflow and language coding.
This fourth edition cancels and replaces the third edition (ISO 1951:2007), which has been technically revised.
The main changes are as follows:
— extending the scope;
— reviewing the entire content;
— changing the title: the term ‘presentation’ is retained because it is fundamental to this document, while the term ‘representation’ has been removed, since representation is now addressed by the ISO 24613 series;
— introducing the relationship between the generic structure and the presentation of lexicographic entries, using the LMF (Lexical Markup Framework) TEI serialization and integrating the TEI tagset as the reference for implementing the proposed model;
— reviewing and updating core lexicographic terms to align with the current state of the field, as well as introducing new terms.
Any feedback or questions on this document should be directed to the user’s national standards body. A complete listing of these bodies can be found at www.iso.org/members.html.
Introduction
The lexicographic landscape has undergone a profound transformation over the last few decades, primarily due to the definitive shift to digital platforms. Technological advances have played a pivotal role in shaping new strategies and directions: a significant number of lexicographic resources are currently accessible online, largely due to retro-digitization; the limitations imposed by print editions are no longer a concern; the integration of corpora has evolved into a widely recognized best practice; various dictionary writing systems have been developed to accommodate the changing landscape; and annotation schemes have markedly improved. In this digital age, the ongoing revolution demands the application of adapted standards and tools to ensure the availability of structured data and promote interoperability between systems, especially given the inherent heterogeneity in the dictionary-making process due to variations in nature, form, and content.
This revised document arose from the work within ISO working group ISO/TC 37/SC 2/WG 9, Terminology workflow and language coding. It aligns with ISO international standards ISO 24613-1:2024, ISO 24613-2:2020, ISO 24613-3:2021 and ISO 24613-4:2021 developed by ISO working group ISO/TC 37/SC 4/WG 4, focusing on modelling data representation in a variety of dictionary subtypes.
The intended audience for this document includes lexicographers as well as researchers and practitioners in the field of language resource management who work with lexicographic resources.
This document adopts a lexicographic lemma-oriented approach and focuses on general language dictionaries, whether monolingual, bilingual, or multilingual, which serve as valuable tools and references for broadening knowledge. For the representation of lexicographic data, the relationship between the generic structure and the presentation of lexicographic entries is elucidated using LMF TEI serialization, integrating the TEI tagset as the reference for implementing the proposed model.
To establish a model for the presentation of lexicographic entries in general language dictionaries, this document aims to: 1) provide recommendations for addressing the variety of existing heterogeneous features and practices found in human-readable dictionaries, whether in print or digital format; 2) standardize the core concepts related to the presentation of various components in a lexicographic entry, as uniformity of terminology promotes consistency and data reusability; and 3) reproduce the typographical conventions described in previous editions of ISO 1951.
This document includes examples from printed dictionaries and from retro-digitized dictionaries, i.e. those converted from an analogue (paper) or digital (e.g., PDF) medium into a computer-readable format. Born-digital dictionaries, created directly in machine-readable formats, are excluded.
In the running text of this document, the following notations are employed:
— terms designating concepts defined in this document are in italics;
— TEI P5 terms (element names, attribute names, attribute values, etc.) are presented in a fixed-width (monospace) font, as follows:
— individual element names are enclosed in angle brackets, e.g., <entry>;
— names of nested elements are represented in XPath notation, e.g., cit/quote/bibl;
— attribute names are indicated with an @ sign preceding the name of the attribute, e.g., @type;
— attribute values are enclosed in double quotation marks (" "), e.g., "domain".
Presentation of lexicographic entries in general language dictionaries
1 Scope
This document specifies the presentation of lexicographic entries in general language dictionaries, whether monolingual, bilingual or multilingual, following a lexicographic lemma-oriented approach, and intended for human end-users. Concerning the modelling of the underlying data, this document follows the ISO 24613 series.
The document provides recommendations to deal with the heterogeneous structures of data presentation in lexicographic entries, both in print and digital dictionaries. This document also establishes core concepts related to the broader scope of lexicographic work.
2 Normative references
The following documents are referred to in the text in such a way that some or all of their content constitutes requirements of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.
ISO 639 (all parts), Code for individual languages and language groups
ISO 1087, Terminology work and terminology science — Vocabulary
ISO 24613‑1, Language resource management — Lexical markup framework (LMF) — Part 1: Core model
ISO 24613‑2, Language resource management — Lexical markup framework (LMF) — Part 2: Machine-readable dictionary (MRD) model
ISO 24613‑3, Language resource management — Lexical markup framework (LMF) — Part 3: Etymological extension
ISO 24613‑4, Language resource management — Lexical markup framework (LMF) — Part 4: TEI serialization
ISO 21636‑1:2024, Language coding — A framework for language varieties — Part 1: Vocabulary
IETF BCP 47, Tags for Identifying Languages. (ed. A. Phillips; M. Davis). September 2009. Best Current Practice. URL: https://tools.ietf.org/html/bcp47
TEI P5, Guidelines for Electronic Text Encoding and Interchange. [Version number: 4.6.0]. [Last modified date: 2023-04-04]. TEI Consortium. http://www.tei-c.org/Guidelines/P5/
3 Terms and definitions
For the purposes of this document, the following terms and definitions apply.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
— ISO Online browsing platform: available at https://www.iso.org/obp
— IEC Electropedia: available at https://www.electropedia.org/
NOTE Terms and corresponding definitions related to lexicographic components and sub-components are listed in Table 1.
3.1
delimiter
separator
element used to separate different components of a lexicographic entry (3.11) or distinct entries within a dictionary (3.2)
Note 1 to entry: Delimiters help to organize information, making it easier for end-users to locate and understand the various components of a lexicographic entry.
EXAMPLE: The lemma delimiter used after a lemma (3.8), and the sense delimiter positioned before a new sense.
3.2
dictionary
<language resource management> lexicographic resource (3.13) that contains a structured collection of lexicographic entries (3.11)
Note 1 to entry: Dictionary can have a much broader meaning. The definition presented is restricted to the scope of this document.
3.3
machine-readable dictionary
MRD
electronic dictionary
computer-aided dictionary
computer-assisted dictionary
dictionary (3.2) designed to be processed and interpreted by software
Note 1 to entry: Unlike traditional dictionaries (3.2), which are intended for human use, MRDs are formatted in such a way that their contents can be efficiently accessed, manipulated, and utilized by software.
3.4
dictionary structure
structure containing a macrostructure (3.15), a microstructure (3.19) and a mediostructure (3.17)
3.5
general language
natural language (3.20) characterized by the use of linguistic means of expression independent of any specific domain
[SOURCE: ISO 1087:2019]
3.6
grammatical feature
property associated with a lexical unit (3.9) to describe one of its grammatical attributes
Note 1 to entry: Possible grammatical features include gender, number, and transitivity.
[SOURCE: ISO 24613-1:2024, 3.2, modified – lexical unit replaces word form; Note 1 to entry added, EXAMPLE removed]
3.7
headword
entry word
lexicographic component (3.12) that serves as the main access point to a lexicographic entry (3.11)
Note 1 to entry: This term is included in Table 1.
3.8
lemma
lemmatized form
canonical form
base form
base word (deprecated term)
conventional representation of a lexical unit (3.9) chosen as the headword in a lexicographic resource (3.13) according to lexicographic conventions
Note 1 to entry: Conventions may vary between languages.
3.9
lexical unit
lexical item
meaningful lexical element within natural language (3.20)
Note 1 to entry: Although ‘lexeme’ is the term used in ISO 24613-1:2024, this document adopts the term ‘lexical unit’. This preference is based on its practical orientation, emphasizing a meaningful lexical item that is readily identifiable and applicable. This choice avoids confusion with the more abstract concept of ‘lexeme’, which is distinct from both lemma and lexical unit, as defined in ISO 24613-1:2024.
3.10
lexicographer
expert who compiles or edits a dictionary (3.2)
3.11
entry
main entry
lexicographic article
dictionary article
structured set of lexicographic components (3.12) that treat a headword (3.7) in a lexicographic resource (3.13)
3.12
lexicographic component
structural element of a lexicographic entry (3.11)
Note 1 to entry: Lexicographic components can include but are not limited to headwords, definitions, examples, etymology, and usage notes.
3.13
lexicographic resource
collection of lexicographic entries (3.11)
Note 1 to entry: A lexicographic resource can be a collection of structured datasets that is human-readable as a dictionary and also can be processed as a machine-readable dictionary.
EXAMPLE: Printed dictionaries, CDs, databases.
3.14
lexicon
resource containing a collection of lexical units (3.9)
3.15
macrostructure
dictionary structure (3.4) comprising a data set with a list of lemmas (3.8)
3.16
marker
type of notation used in lexicographic entries (3.11) to provide metadata (3.18) about a lexical unit (3.9)
Note 1 to entry: Markers can indicate various aspects such as grammatical information and usage labels, helping users understand the proper use of a lexical unit. For example, in the lexicographic entry for the lexical unit ‘run’, a marker can indicate that it is a verb (v.), and another marker can label it as informal when used to mean ‘to manage’ (e.g., ‘run a business’).
3.17
mediostructure
cross-reference structure
dictionary structure (3.4) of cross-references between lexicographic entries (3.11) or their lexicographic components (3.12)
3.18
metadata
data that provides information about other data related to any element of a lexicographic resource (3.13)
3.19
microstructure
dictionary structure (3.4) of lexicographic components (3.12) within a lexicographic entry (3.11)
3.20
natural language
language that is or was in active use in a community of people, and the rules of which are mainly deduced from usage
[SOURCE: ISO 1087:2019]
3.21
orthography
systematic way of spelling or writing lexical units (3.9) that conforms to a conventionalized use
[SOURCE: ISO 24613-1:2024, 3.10, modified – The term lexemes has been changed to lexical units.]
3.22
sense component
structural sense element of a lexicographic entry (3.11)
3.23
subentry
nested entry
grouping structure for related lexicographic entries (3.11) that share a common headword
3.24
typographical convention
set of practices governing the visual presentation of lexicographic content as displayed or output
Note 1 to entry: These conventions encompass choices related to typography, such as font usage, font size, line spacing, margins, paragraph styles, text alignment, punctuation, symbols and other text design characteristics.
3.25
usage label
marker (3.16) that indicates a restricted use of a lexical unit (3.9)
Note 1 to entry: Usage labels address different dimensions of linguistic variation, such as space, time, social group, and situation (cf. ISO 21636-1:2024).
Note 2 to entry: General and specialized dictionaries employ a range of symbols and abbreviations as usage labels.
EXAMPLE: Labels indicating currency or period (e.g., arch. for archaic), formality or register (e.g., inf. for informal), regionality or dialect (e.g., Am. for American, York. for Yorkshire), technicality or subject field (e.g., bot. for botanical), and textuality or genre (e.g., poet. for poetic).
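As an informal illustration of how the usage labels in the example above could be handled in software, the following minimal sketch stores each abbreviation with its dimension of variation and its expansion. The table name `USAGE_LABELS` and the functions `expand_label` and `labels_by_dimension` are hypothetical and are not defined by this document; only the abbreviations and dimensions come from the example.

```python
# Illustrative lookup table. The abbreviations, dimensions and expansions
# are those given in the example for "usage label"; the structure itself
# is an assumption made for this sketch.
USAGE_LABELS = {
    "arch.": ("currency or period", "archaic"),
    "inf.": ("formality or register", "informal"),
    "Am.": ("regionality or dialect", "American"),
    "York.": ("regionality or dialect", "Yorkshire"),
    "bot.": ("technicality or subject field", "botanical"),
    "poet.": ("textuality or genre", "poetic"),
}


def expand_label(abbreviation):
    """Return (dimension, expansion) for a known label, or None if unknown."""
    return USAGE_LABELS.get(abbreviation)


def labels_by_dimension(table):
    """Group label abbreviations by the dimension of variation they mark."""
    grouped = {}
    for abbreviation, (dimension, _expansion) in table.items():
        grouped.setdefault(dimension, []).append(abbreviation)
    return grouped
```

A single table of this kind can serve both as the list of expansions shown to end-users and as a consistency check when labels are encoded.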
4 An overview of lexicographic components
Table 1 describes various lexicographic components and sub-components typically found in lexicographic resources or annotation schemes.
Table 1 — Lexicographic components and sub-components
5 Typographical conventions
Printed and digital dictionaries make extensive use of typographical conventions to delineate and clarify the relationships among lexicographic components. These conventions cover a range of typographic choices, including but not limited to font type, font size, line spacing, margin settings, alignment, stylistic formatting (such as normal text, boldface, italics), and the use of punctuation marks. Applying them consistently is pivotal to the uniformity and legibility of dictionaries. It is essential that each lexicographic component is presented in a manner that allows end-users to identify and understand its structural and semantic significance, and thus to distinguish between different types of information with ease.
Typographical conventions significantly influence how lexicographic content is conveyed and interpreted. Table 2 lists a selection of typographical practices prevalent across various language dictionaries, showing how typography supports the end-user in navigating and interpreting entries. For an overview of the symbols and notations commonly adopted in both printed and digital dictionaries, see Annex B.
Table 2 — Typographical conventions generally adopted in dictionaries
Format | Description |
---|---|
Boldface | Boldface is usually used for lemmas and other lexical units within a lexicographic entry, such as compounds and phrasal verbs, to aid end-users in quickly locating them. |
Italics | Italics are usually used for usage examples, Latin units and loanwords, providing a clear distinction from the main text. |
Lightface | Lightface is usually used for pronunciation guides and notes, ensuring they are distinguishable yet not overly prominent. |
Highlighting / Colour coding | Highlighting, through varied background colours or font styles, is usually used to emphasize new or significant lexical units, as well as cautionary notes, thereby enhancing readability and navigation. |
Abbreviations | Abbreviations, a longstanding convention in both printed and digital dictionaries, require consistent use and a comprehensive list of expansions to ensure clarity and user-friendliness. |
Numbering | Numbering aids in the logical organization and easy reference of different senses within a lexicographic entry. |
Superscript number | A superscript number following lexical units indicates that these are homographic/homonymic lexical units. |
Icons | Icons can provide a visually intuitive way to indicate additional features or content types within a lexicographic entry. |
Illustrations | Non-verbal representations, such as images or diagrams, are used to visually depict concepts or provide additional context to the lexicographic content. |
Hyperlinks | A hyperlink is a clickable element in a document or webpage that takes the user to another location, such as a different page or document. In digital dictionaries, hyperlinks are usually indicated by underlined text and distinct colours. They should adhere to accessibility standards to ensure usability for all users. |
Bullet points | Bullet points are used to organize information clearly, especially for sub-definitions, examples, or related terms. Indentation and bullet points can be used together to create a visual enumeration of information elements. |
Marginalia / sidebar boxes | Marginalia can provide supplementary information, notes, icons, or references. Sidebars or info boxes can provide additional content without interrupting the reading of a full lexicographic entry. |
Charts | Extra information like verb conjugations or phonetic charts can be made interactive, allowing users to explore detailed information dynamically. |
Interactive elements | In digital dictionaries, certain text elements or symbols are interactive in that, when clicked or tapped, they trigger actions such as playing audio pronunciations, displaying translations, or revealing additional information. |
Tooltip | In digital dictionaries, hovering over a word or symbol can display a tooltip—a small pop-up box with additional information, definitions, or usage tips. |
Annex A (informative) Structure of a lexicographic entry
Table A.1 outlines the data model of lexicographic entries.
Table A.1 — Descriptors, encoding and respective output
Descriptor (field designation) | LMF component (XPath) | Typical realization/XML representation | Recommended output (harmonized practice) |
---|---|---|---|
antonym | sense/xr | <xr type="antonymy"/> | Antonym with special formatting |
attitude label | sense/label | <usg type="attitude"/> | Attitude label with special formatting |
citation | sense/cit | <cit><quote>$TEXT</quote><bibl>$TEXT</bibl></cit> | Citation in quotes followed by reference in brackets |
cross-reference | sense/xr | <xr type="reference"/> | In the running text |
dating | | <date>$TEXT</date> | Date with special formatting |
lexicographic entry | /LexicalEntry/entry | <entry>$MIXEDCONTENT</entry> | New paragraph per entry |
domain label | sense/label | <usg type="domain"/> | Domain label with special formatting |
etymology | etym | <etym>$TEXT</etym> | |
example | sense/cit[@type="example"] | <cit type="example">$TEXT</cit> | Example preceded by a blank line |
frequency label | sense/label | <usg type="frequency"/> | Frequency label with special formatting |
geographic label | sense/label | <usg type="geographic"/> | Geographic label with special formatting |
gender | gramGrp/gram | <gram type="gen">$TEXT</gram> | Gender abbreviation for printed editions: m (masculine), f (feminine), n (neuter) |
gloss | sense/gloss | <gloss>$TEXT</gloss> | Gloss in italics |
headword | form[@type="lemma"] | <form type="lemma">$TEXT</form> | Headword in bold |
inflected form | form[@type="inflected"] | <form type="inflected">$TEXT</form> | Inflected form in regular font |
lemma /OrthographicRepresentation/ | form[@type="lemma"]/orth | <form type="lemma"><orth>$TEXT</orth></form> | Lemma in bold |
lexicographic definition | sense/def | <def>$TEXT</def> | Definition in regular font |
meaning type label | sense/label | <usg type="meaningType"/> | Meaning type label with special formatting |
[grammatical] number | gramGrp/gram | <gram type="num">$TEXT</gram> | Number abbreviation for printed editions: sg (singular), pl (plural) |
[entry] number | | <entry n="$NUMBER">$TEXT</entry> | Following the lemma |
normativity label | sense/label | <usg type="normativity"/> | Normativity label with special formatting |
note | note | <note>$TEXT</note> | Note in regular font |
orthographic form | form/orth | <orth>$TEXT</orth> | Orthographic form in regular font |
part of speech | gramGrp/gram | <gram type="pos">$TEXT</gram> | Part of speech in italics |
pronunciation /OrthographicRepresentation/ | form/pron | <pron notation="[notation]">$TEXT</pron> | Pronunciation in square brackets |
sense number | sense | <sense n="[number]"/> | Sense number using numbers |
sociocultural label | sense/label | <usg type="sociocultural"/> | Sociocultural label with special formatting |
synonym | sense/xr | <xr type="synonymy"/> | Synonym with special formatting |
temporal label | sense/label | <usg type="temporal"/> | Temporal label with special formatting |
text type label | sense/label | <usg type="textType"/> | Text type label with special formatting |
variant | form[@type="variant"] | <form type="variant"><orth>$TEXT</orth></form> | Variant in regular font |
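The relationship between the encoding and the recommended output in Table A.1 can be sketched as follows. This is a minimal illustration only, under stated assumptions: the sample entry is hypothetical, the TEI namespace is omitted for brevity, the `<quote>` element inside the example follows the citation pattern of Table A.1, Markdown-style markers stand in for bold and italics, and the names `SAMPLE`, `text_of` and `render_entry` are not defined by this document.

```python
import xml.etree.ElementTree as ET

# Hypothetical entry built from element patterns listed in Table A.1.
# Assumption: the TEI namespace is omitted to keep the sketch short.
SAMPLE = """
<entry>
  <form type="lemma"><orth>run</orth><pron notation="IPA">rʌn</pron></form>
  <gramGrp><gram type="pos">v.</gram></gramGrp>
  <sense n="1">
    <def>to manage</def>
    <cit type="example"><quote>run a business</quote></cit>
  </sense>
</entry>
"""


def text_of(element):
    """Collect all text inside an element, ignoring child markup."""
    return "".join(element.itertext()).strip()


def render_entry(xml_string):
    """Render one entry as plain text with Markdown-style emphasis markers."""
    entry = ET.fromstring(xml_string)
    parts = []
    lemma = entry.find("form[@type='lemma']/orth")
    if lemma is not None:
        parts.append(f"**{text_of(lemma)}**")  # lemma in bold
    pron = entry.find("form/pron")
    if pron is not None:
        parts.append(f"[{text_of(pron)}]")  # pronunciation in square brackets
    pos = entry.find("gramGrp/gram[@type='pos']")
    if pos is not None:
        parts.append(f"*{text_of(pos)}*")  # part of speech in italics
    lines = [" ".join(parts)]
    for sense in entry.findall("sense"):
        definition = sense.find("def")
        body = text_of(definition) if definition is not None else ""
        # Sense number using numbers, definition in regular font.
        lines.append(f"{sense.get('n', '')} {body}".strip())
        for example in sense.findall("cit[@type='example']"):
            lines.append("")  # example preceded by a blank line
            lines.append(text_of(example))
    return "\n".join(lines)


print(render_entry(SAMPLE))
```

In practice the same mapping would more likely be expressed in XSLT and CSS; the point here is only that each row of Table A.1 pairs one encoding pattern with one presentation rule.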
Annex B (informative) Lexicographic symbols in printed and digital dictionaries
Table B.1 outlines typographic conventions sourced from dictionaries in the ISO official languages: English, French, and Russian. It aims to present a cohesive style found in lexicographic resources, potentially guiding the harmonization of typographic conventions across various languages and formats.
NOTE 1 The symbols enumerated in this annex were selectively sourced from representative mono- and bilingual dictionaries of the respective countries. It is important to note that the list is not exhaustive and serves as an illustrative guide to commonly used symbols.
NOTE 2 In compliance with ISO standards, language and country codes used herein adhere to ISO 639 and ISO 3166, respectively. The abbreviations GB, FR, and RU correspond to the country codes as per ISO 3166-1:2020, representing United Kingdom, France, and Russia, respectively.
Academia das Ciências de Lisboa. Dicionário da Língua Portuguesa. Retrieved July 17, 2024, from https://dicionario.acad-ciencias.pt
Cambridge Dictionary online. Retrieved July 17, 2024, from https://dictionary.cambridge.org/
Collins English Dictionary. Retrieved July 17, 2024, from https://www.collinsdictionary.com/
Collins English-German Dictionary. Retrieved July 17, 2024, from https://www.collinsdictionary.com/
Diccionario Básico de la Lengua Española, 2014
Diccionario del español actual, Manuel Seco, Olimpia Andrés y Gabino Ramos. Diccionario BBVA. Retrieved July 17, 2024, from https://www.fbbva.es/diccionario/
Dictionnaire de l'Académie française. Retrieved July 17, 2024, from https://www.dictionnaire-academie.fr/
German English Dictionary
Infopédia. Dicionários da Língua Portuguesa. Retrieved July 17, 2024, from https://www.infopedia.pt/dicionarios/lingua-portuguesa
Le Petit Robert de la langue française 2017
Longman Dictionary of Contemporary English online. Retrieved July 17, 2024, from https://www.ldoceonline.com/
Merriam-Webster.com dictionary. Retrieved July 17, 2024, from https://www.merriam-webster.com/
Oxford Advanced Learner's Dictionary (printed edition)
Oxford English Dictionary online. Retrieved July 17, 2024, from https://www.oed.com/
Oxford Learner's Dictionaries. Retrieved July 17, 2024, from https://www.oxfordlearnersdictionaries.com/
Real Academia Española. Diccionario de la lengua española (23rd ed.). Retrieved July 17, 2024, from https://dle.rae.es
TLFi online. Retrieved July 17, 2024, from http://atilf.atilf.fr/antonyme
Table B.1 — Lexicographic symbols
Symbol | Designation | Unicode | Function | Position | Specific usage |
---|---|---|---|---|---|
≈ ≃ ≒ | Almost/approximately equal | U+2248 U+2243 U+2252 | Indicates approximate equivalence or similarity in meaning. | preceding a lexical unit | luzidio DLPC 2001 |
' | Apostrophe | U+0027 | Indicates a gloss or equivalent of a form. | between a lexical unit | senescente DLPC 2001 |
→ | Arrow | | Points to cross-references or, in etymology, presents a related lexical unit. | preceding a lexical unit | MOB Robert 2017 |
* | Asterisk | U+002A | Marks reconstructed, unattested forms, or forms not found in corpus data. | preceding a lexical unit | adaga DLPC 2001 |
■ | Black square | U+25A0 | Separates different senses within a lexicographic entry. | preceding different components | ONIRIQUE Robert 2017 |
♦ | Black diamond | U+2666 | Separates different senses within a lexicographic entry. | preceding different components | ONIRIQUE Robert 2017 |
◊ | Lozenge | U+25CA | Separates the date of appearance of the word in the source language from its origin. | | |
† | Dagger | U+2020 | Indicates obsolete lexical units or historical usage no longer in active use. | preceding a lexical unit | taciturnous OED |
° | Degree sign | U+00B0 | Characterizes the lexical unit as an internationally harmonized scientific-technical term. | preceding a lexical unit | |
= | Equals sign | U+003D | Indicates that the lexical unit is an equivalent or synonym. | preceding a lexical unit | DNA = Deoxyribonucleic acid |
! | Exclamation mark | U+0021 | Indicates that the lexical unit has been coined by means of translation. | preceding a lexical unit | |
> | Greater-than sign | U+003E | Indicates that the form/word following the symbol comes from the form/word preceding the symbol in etymological components. | | |
< | Less-than sign | U+003C | Indicates the historical derivation of a lexical unit. It suggests that the lexical unit or form preceding the symbol is derived from the lexical unit or form following it. | after a lexical unit | vacation < Old French (also modern French) vacation OED |
× | Multiplication sign | U+00D7 | Indicates that an overlap exists. | | |
≠ | Not equal | U+2260 | Indicates antonymy or significant difference in meaning. | preceding a lexical unit | |
+ | Plus sign | U+002B | Can be used to show compound word formation. For example, note + book = notebook. | preceding a lexical unit | notebook OED |
§ | Section sign | U+00A7 | Indicates that the designation is legally protected. | preceding | |
™ | | | Indicates that a lexical unit also represents a trademark. | after a lexical unit | Pladur® DLP 2024 |
~ | Tilde | U+007E | Replaces the lemma or a specific lexical unit throughout a lexicographic entry or part of an entry. | instead of a lexical unit | desacuerdo Presença/Langenscheidt |
‖ | Parallel symbol or double vertical line | | Indicates different examples. | at the end of the first example and before beginning the next example | Persian dictionary and Arabic dictionary |
; | | | Distinguishes different pronunciations, subsenses, synonyms, and other lexicographic components. | at the end of the first pronunciation and before the second one | Oxford dictionary |
$ | | | Shows the American pronunciation. | | |
... | | | Indicates, in citation, that part of the text has been omitted. | | Persian dictionary and Arabic dictionary |
: | | | Indicates the beginning of the definitions. | | Webster dictionary |
| | | | Indicates a division of parts of speech. | | Oxford dictionary |
⇨ | | | Indicates cross reference. | | Oxford Dictionary, Persian Dictionary |
/ | | | Separates different pronunciations. | | |
( ) | Parentheses | | Enclose additional information, clarifications, or contextual details that complement the main text. | | |
[ ] | Square brackets | | Enclose phonetic transcriptions, providing a standardized method for representing pronunciation. Also used for some complementary information, such as material related to etymology. | | Persian Dictionary |
< > | Angle brackets | | Enclose lexical units in discussions of etymology, particularly to indicate a lexical unit or form in an older language from which the current word is derived. Sometimes used to narrow down the domain. | | |
/ / | Slashes | | Enclose pronunciation or phonemic transcriptions to indicate the representation of sounds. | | |
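Code points such as those in the Unicode column of Table B.1 can be checked mechanically. The following minimal sketch uses the Python standard library module `unicodedata`; the helper names `code_point` and `mismatches` and the dictionary `STATED` are illustrative assumptions, and `STATED` copies only a subset of the symbol and code point pairs given in the table.

```python
import unicodedata


def code_point(char):
    """Format a single character as a 'U+XXXX' code point string."""
    return f"U+{ord(char):04X}"


# Subset of the symbol / code point pairs stated in Table B.1.
STATED = {
    "≈": "U+2248",
    "≃": "U+2243",
    "≒": "U+2252",
    "°": "U+00B0",
    "=": "U+003D",
    "!": "U+0021",
    ">": "U+003E",
    "×": "U+00D7",
    "≠": "U+2260",
    "§": "U+00A7",
    "~": "U+007E",
}


def mismatches(table):
    """Return the symbols whose actual code point differs from the stated one."""
    return {
        symbol: code_point(symbol)
        for symbol, stated in table.items()
        if code_point(symbol) != stated
    }


for symbol, stated in STATED.items():
    # unicodedata.name gives the official Unicode character name.
    print(symbol, stated, unicodedata.name(symbol))
```

A check of this kind is useful when a symbol table is maintained by hand, since visually similar characters (for example a lozenge and a diamond) have different code points.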
Annex C (informative) Dictionary examples applying LMF modelling mechanisms
The examples have been selected for illustrative purposes and are specific to the language(s) in question. Real lexicographic examples in this document demonstrate the application of LMF serialization and their associated presentations. They do not imply any responsibility on the part of the publishers.
NOTE 1 The standard sets guidelines that can affect the online display of dictionary content, potentially using technologies like XSLT, CSS, etc.
NOTE 2 Examples of XSLT and CSS for rendering, in a browser, lexicographic entries encoded according to the TEI/TEI Lex-0 guidelines are maintained on GitHub. Hosting them there offers practical benefits, such as avoiding ISO intellectual property restrictions and facilitating easier maintenance of the resource. A practical implementation is available at:
https://github.com/anacastrosalgado/lexicalresources/tree/master/Schemas/ISO1951
Academia das Ciências de Lisboa. Dicionário da Língua Portuguesa. Retrieved July 17, 2024, from https://dicionario.acad-ciencias.pt
Cambridge Dictionary online. Retrieved July 17, 2024, from https://dictionary.cambridge.org/
Collins English Dictionary. Retrieved July 17, 2024, from https://www.collinsdictionary.com/
Collins English-German Dictionary. Retrieved July 17, 2024, from https://www.collinsdictionary.com/
Diccionario del español actual, Manuel Seco, Olimpia Andrés y Gabino Ramos. Diccionario BBVA. Retrieved July 17, 2024, from https://www.fbbva.es/diccionario/
Dicionário Espanhol-Português/Português-Espanhol, Presença/Langenscheidt, 2000 (printed edition)
Dictionnaire de l'Académie française. Retrieved July 17, 2024, from https://www.dictionnaire-academie.fr/
German English Dictionary
Infopédia. Dicionários da Língua Portuguesa. Retrieved July 17, 2024, from https://www.infopedia.pt/dicionarios/lingua-portuguesa
Le Petit Robert de la langue française 2017
Longman Dictionary of Contemporary English online. Retrieved July 17, 2024, from https://www.ldoceonline.com/
Merriam-Webster.com dictionary. Retrieved July 17, 2024, from https://www.merriam-webster.com/
Oxford Advanced Learner's Dictionary (printed edition)
Oxford English Dictionary online. Retrieved July 17, 2024, from https://www.oed.com/
Oxford Learner's Dictionaries. Retrieved July 17, 2024, from https://www.oxfordlearnersdictionaries.com/
Real Academia Española. Diccionario de la lengua española (23rd ed.). Retrieved July 17, 2024, from https://dle.rae.es
TLFi online. Retrieved July 17, 2024, from http://atilf.atilf.fr/
Two Russian dictionaries are also used: Ozhegov’s, a general-purpose explanatory dictionary of Russian, and Denisov’s, a ‘dictionary of collocability’:
S.I. Ozhegov, N. Yu. Shvedova “Tol’kovyi slovar’ russkogo yazyka”, Moskva, 1998
P.N. Denisov, V.V. Morkovkin (eds.), “Slovar’ sochetaemosti slov russkogo yazyka”, Moskva, Astrel’, AST, 2005
Bibliography
[1] Consortium T.E.I., ed. TEI P5: Guidelines for Electronic Text Encoding and Interchange. TEI Consortium. http://www.tei-c.org/Guidelines/P5/ ([Version number and dates to be completed when finalising the document]).
[2] Tasovac T., Romary L., Banski P., Bowers J., de Does J., Depuydt K. et al. 2018. TEI Lex-0: A baseline encoding for lexicographic data. DARIAH Working Group on Lexical Resources. https://dariah-eric.github.io/lexicalresources/pages/TEILex0/TEILex0.html ([Version number and dates to be completed when finalizing the document]).
[3] IETF BCP 47, Tags for Identifying Languages (ed. A. Phillips, M. Davis). September 2009. Best Current Practice. https://tools.ietf.org/html/bcp47
[4] Romary, L., & Wegstein, W. (2012). Consistent modelling of heterogeneous lexical structures. Journal of the Text Encoding Initiative, 3. doi:10.4000/jtei.540.
[5] Costa, C., Roche, C., and Salgado, A. (2022). Standards for Representing Lexicographic Data: An Overview. Version 1.0.0. DARIAH-Campus. [Training module]. https://elexis.humanistika.org/id/REhOykBU7pPs5zOAENdah
[6] Salgado, A., Costa, R., & Tasovac, T. (2019). Improving the consistency of usage labelling in dictionaries with TEI Lex-0. Lexicography: Journal of ASIALEX, 6(2), 133–156. doi:10.1007/s40607-019-00061-x.