ISO/DIS 9515
ISO/TC 37/SC 4
Secretariat: KATS
Date: 2025-12-18
Language resource management — Vocabulary
Voting begins on: 2026-02-13 Voting terminates on: 2026-05-08
Gestion des ressources linguistiques — Vocabulaire
DIS stage
© ISO 2025
All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below or ISO’s member body in the country of the requester.
ISO copyright office
CP 401 • Ch. de Blandonnet 8
CH-1214 Vernier, Geneva
Phone: + 41 22 749 01 11
E-mail: copyright@iso.org
Website: www.iso.org
Published in Switzerland
Contents
3.1 General Linguistic Terms
3.2 Language Resource Management
3.4 Morphosyntactic Annotation Framework (MAF)
3.5 Linguistic Annotation Framework (LAF)
3.6 Syntactic Annotation Framework (SynAF)
3.7 Semantic Annotation Framework (SemAF)
3.8 Comprehensive Annotation Framework (ComAF)
3.9 Lexical Markup Framework (LMF)
3.10 Multilingual Information Framework
3.11 Persistent Identification and Sustainable Access (PISA)
3.12 Infrastructure for Component Metadata
3.13 Corpus Query Lingua Franca (CQLF)
3.14 Word Segmentation of Written Texts
3.15 Transcription of Spoken Language
3.16 Controlled Natural Language (CNL) / Controlled Human Communication (CHC)
3.18 Corpus Annotation Project Management
ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies (ISO member bodies). The work of preparing International Standards is normally carried out through ISO technical committees. Each member body interested in a subject for which a technical committee has been established has the right to be represented on that committee. International organizations, governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.
The procedures used to develop this document and those intended for its further maintenance are described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types of ISO documents should be noted. This document was drafted in accordance with the editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).
Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. ISO shall not be held responsible for identifying any or all such patent rights. Details of any patent rights identified during the development of the document will be in the Introduction and/or on the ISO list of patent declarations received (see www.iso.org/patents).
Any trade name used in this document is information given for the convenience of users and does not constitute an endorsement.
For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions related to conformity assessment, as well as information about ISO's adherence to the World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT), see www.iso.org/iso/foreword.html.
This document was prepared by Technical Committee ISO/TC 37, Language and terminology, Subcommittee SC 4, Language resource management.
Any feedback or questions on this document should be directed to the user’s national standards body. A complete listing of these bodies can be found at www.iso.org/members.html.
The main purpose of this document is to provide a systematic description of the concepts related to language resources and language resource management and to clarify the use of the terms in this field.
This document is addressed to anyone concerned with language resource management and in particular users of the standards published by ISO/TC 37/SC 4.
The layout follows the directions given in ISO 10241-1. Thus, the elements of an entry appear in the following order:
Language resource management — Vocabulary
1.0 Scope
This document provides the terms and definitions for the standards of ISO/TC 37/SC 4 Language resource management.
2.0 Normative references
There are no normative references in this document.
3.0 Terms and definitions
The terminological entries in this document are presented in a mixed order under headings reflecting the subjects covered by the work of ISO/TC 37/SC 4. Systematic order is applied where possible in such a way that concepts related hierarchically are listed coherently with their entry numbers reflecting the positions in the concept system followed by concepts related associatively. Where there are concept relations across subjects, allocation of a concept to the respective subject is preferred to the coherent display of concept relations/concept systems. Concept relations are indicated by cross-references throughout this document.
ISO and IEC maintain terminology databases for use in standardization at the following addresses:
- ISO Online browsing platform: available at https://www.iso.org/obp
- IEC Electropedia: available at https://www.electropedia.org/
3.1 General Linguistic Terms
linguistic structure
composition of a language (3.16.1) at the level of sound, word (3.1.9.1), phrase (3.1.25), sentence (3.1.27), meaning, and discourse (3.7.4.1)
Note 1 to entry: The science of language is understood to consist of phonology (sound), morphology (3.1.6) (word units), syntax (3.1.24) (sentential structure), semantics (meaning, information), and pragmatics (discourse, context).
phoneme
smallest sound unit that can be segmented from the acoustic flow of speech and which can function as a semantically distinctive unit
EXAMPLE For English, examples for phonemes are /g/ and /k/ as in gap: cap or /m/ and /t/ in map: tap; these word pairs constitute minimal pairs in English.
[SOURCE: [2]]
homophone
one of two or more words (3.1.9.1) that are pronounced the same but differ in meaning and sometimes in spelling
[SOURCE: ISO 19104, 4.15, modified — Note 1 to entry has been removed.]
quasi-homophone
word (3.1.9.1) which differs from another by one or two phonemes (3.1.2)
Note 1 to entry: There can be one phoneme more or less in one of the two quasi-homophones (e.g.: aft-after), one different phoneme (e.g.: check-deck, feed-feet), or 2 different phonemes (e.g.: flap-slat).
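The three cases listed in Note 1 can be checked mechanically once both words are available as phoneme sequences. The following Python sketch is illustrative only: the function name is_quasi_homophone and the phonemic transcriptions used in the usage lines are assumptions made for this example and are not part of this document.

```python
def is_quasi_homophone(a, b):
    """Return True if phoneme sequences a and b differ as described in Note 1.

    Assumption: each word is supplied as a sequence of phoneme symbols;
    producing that transcription is outside the scope of this sketch.
    """
    a, b = list(a), list(b)
    if len(a) == len(b):
        # Same length: exactly one or two phonemes differ, position by position.
        differing = sum(1 for x, y in zip(a, b) if x != y)
        return differing in (1, 2)
    if abs(len(a) - len(b)) == 1:
        # One phoneme more or less: removing one phoneme from the longer
        # sequence must yield the shorter one.
        longer, shorter = (a, b) if len(a) > len(b) else (b, a)
        return any(longer[:i] + longer[i + 1:] == shorter
                   for i in range(len(longer)))
    return False


# Illustrative transcriptions (assumed, broad British English).
print(is_quasi_homophone(["tʃ", "e", "k"], ["d", "e", "k"]))           # check / deck
print(is_quasi_homophone(["f", "l", "æ", "p"], ["s", "l", "æ", "t"]))  # flap / slat
print(is_quasi_homophone(["ɑː", "f", "t"], ["ɑː", "f", "t", "ə"]))     # aft / after
```

Identical sequences are rejected by this sketch, since a true homophone (3.1.3) is not a quasi-homophone.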
phoneme confusion
confusion due to a phoneme (3.1.2) approximately or incorrectly pronounced, and interpreted as another phoneme according to the mother tongue of the receptor
Note 1 to entry: Phonemes can exist in one language (3.16.1) and not in other languages.
Note 2 to entry: Phonemes can be pronounced (spoken, emitted) with multiple accentuations, and be perceived differently by recipients (listeners) not necessarily receptive to the same phonetic and phonological systems.
morphology
description of the structure and formation of words (3.1.9.1)
Note 1 to entry: Morphology is traditionally divided into:
- word-formation (3.1.6.2) dealing with the formation of complex lexemes (3.1.9) out of simpler lexemes: by means of derivation (3.1.19) (often signalled by affixation (3.1.8.2.1), i.e. addition of a morpheme (3.1.8)) or by means of compounding (3.1.20) (combining two or more lexemes);
- inflection (3.1.6.1) that creates inflected forms (3.1.14).
inflection
branch of morphology (3.1.6), dealing with contextual realizations of lexemes (3.1.9) as inflected forms (3.1.14)
Note 1 to entry: Inflection is a grammatical rather than lexical process.
word-formation
branch of morphology (3.1.6), dealing with the creation of new lexemes (3.1.9) by the processes of derivation (3.1.19) or compounding (3.1.20)
morph
surface form represented by a unique morpheme (3.1.8)
EXAMPLE In English, the morphs of the plural morpheme “-s” include “-s”, “-en”, and “-NULL” (as in “boys”, “oxen”, and “sheep”), where “-NULL” has no unique surface form. Thus, the word “boys” consists of the two morphs, “boy” and “-s”, whereas the morphemes corresponding to the morphs “ox” and “-en” are “ox” and “-s”, respectively.
morpheme
exponent that signals a modification of a lexeme (3.1.9)
Note 1 to entry: There are two sub-types of morphemes: free morphemes (3.1.8.1) and bound morphemes (3.1.8.2).
Note 2 to entry: This definition adheres to a lexeme-based approach to morphology where it is the lexeme, not the morpheme, that encodes the linguistic sign. On this approach, the morpheme is a unit of form (an exponent) that marks various kinds of modifications (e.g. derivation or inflection) of a lexeme.
free morpheme
morpheme (3.1.8) that can be used as a word (3.1.9.1) by itself
EXAMPLE Given the word “goodness,” “good” is a free morpheme, whereas “-ness” is not. The latter is a bound morpheme (3.1.8.2).
bound morpheme
morpheme (3.1.8) that appears only together with one or several other morphemes
EXAMPLE 1 Chinese: 伟 means “great,” but cannot stand by itself as a word in text (3.16.9). Instead, it is used as a constituent element of many words, such as 伟大 (“great”), 伟人 (“giant”), and 雄伟 (“majesty”).
EXAMPLE 2 Korean: the suffix “-e”, which is equivalent to the English preposition “to” — as in “hakkyo-e” (to school) — is a bound morpheme.
affix
bound morpheme (3.1.8.2) which may be added to a stem (3.1.17) or a lexeme (3.1.9)
Note 1 to entry: Affixes can be classified into several sub-types such as prefix, suffix, infix and circumfix. Affixes can be derivational or they can be inflectional or agglutinative.
ending
〈Japanese text〉 agglutinative affix (3.1.8.2.1) of a verb or adjective
Note 1 to entry: Verbs and adjectives end with agglutinative forms, called “endings”. These endings may be a negative form, an adverbial form, a base form, an adnominal form, an assumption form or an imperative form.
particle
〈Japanese text〉 grammatical affix (3.1.8.2.1) agglutinated mostly to nominal forms but sometimes to other free-standing lexical items (3.2.3)
Note 1 to entry: The grammatical category particle can be treated as a part of speech (3.1.23).
EXAMPLE The noun phrase 学校へ(gakkoue) is analysed into a noun 学校 (gakkou) and a particle へ(e). The verb phrase 寒いね (samuine, ‘It is very cold, isn't it?’) is analysed into a verb 寒い (samui) and a particle ね(ne) which corresponds to the tag ‘isn't it?’.
lexeme
abstract, fundamental unit in the lexicon (3.2.1.1) of a language, comprising semantic, formal (phonetic and/or graphemic) and grammatical information
Note 1 to entry: A complex lexeme is the result of word-formation (derivation or compounding) processes; a simple lexeme can be thought of as the base for such processes. In a lexical entry, a lexeme is identified by a lemma (3.1.10). Word forms (3.1.13) are results of the interaction of lexemes with the grammatical system of the given language.
word
lexeme (3.1.9) that has, as a minimal property, a part of speech (3.1.23)
compound
word (3.1.9.1) built from two or more lexemes (3.1.9)
Note 1 to entry: A compound may be endocentric if it has a head (i.e. the fundamental part that contains the basic meaning of the whole compound) and modifiers (which restrict this meaning), or exocentric if it does not have a head. A compound can be long. There are two main sub-types of compound according to their degree of lexicalization (3.14.3): word compound (3.1.9.1.1.1) and phrasal compound (3.1.9.1.2).
word compound
compound (3.1.9.1.1) whose overall meaning is not totally predictable from its constituent parts
EXAMPLE “Hotdog,” “ice-cream,” “blackboard”.
phrasal compound
word (3.1.9.1) consisting of two or more lexemes (3.1.9), the meaning of which is predictable from its constituent elements
EXAMPLE “Apple pie” in English is a phrasal compound composed of two lexemes, “apple” and “pie”, whose meanings are preserved in the meaning of the compound.
Note 1 to entry: Idioms use two or more lexical items (3.2.3), but do not compose a phrasal compound.
Note 2 to entry: A phrasal compound might be thought of as a phrase (3.1.25) by some linguists. In practice, however, there is not always a clear distinction between a word compound (3.1.9.1.1.1) and a phrasal compound, or between a phrasal compound and a phrase, due to the fuzziness of semantic predictability and the degree of lexicalization (3.14.3). Lexico-statistics, word frequency in particular, will play an important role in this respect.
adnoun
ADN
non-conjugating word (3.1.9.1) that modifies a noun
Note 1 to entry: Adnouns modify nouns, as adverbs modify verbs.
EXAMPLE 1 <Japanese>
a. あらゆる 国
arayuru kuni
ADN N
‘every country’
b. 好きな 花
suki+na hana
ADNst+SX N
‘favourite flower’
EXAMPLE 2 <Korean>
a. 새 옷
sae ot
ADN noun
‘new clothes’
b. 빨간 옷
bbalga+n ot
ADJst+GX N
‘red clothes’
eojeol
malmaldi
〈Korean text〉 word (3.1.9.1) or its variant word form (3.1.13) agglutinated with grammatical affixes (3.1.8.2.1)
Note 1 to entry: White space (space between characters) helps to segment text (3.16.9) into eojeols.
EXAMPLE 내가 사과를 먹었다
nae+ga sagwa+reul meok+eot+da
pronoun+GX noun+GX Vst+GX+GX
‘I’+SBJ ‘apple’+OBJ ‘eat’+PST+DCL = ‘I ate (an) apple’
Note 2 to entry: This sentence consists of three eojeols: 내가, 사과를 and 먹었다, each of which is separated by white space. The acronyms GX, SBJ, OBJ, PST and DCL in the example above stand for grammatical affix, subject, object, past tense and declarative sentential type, respectively. The pronoun 내 is a variant form of the pronoun 나 referring to the speaker. 먹었다 is an eojeol and at the same time is a word form agglutinated with two grammatical affixes 었 and 다 to a verb stem (3.1.17) 먹.
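As Note 1 indicates, white space is a useful first cue for finding eojeols. A minimal Python sketch of that idea follows; the function name segment_eojeols is hypothetical, and the sketch assumes that white space is the only delimiter, so punctuation attached to an eojeol is not treated.

```python
def segment_eojeols(text):
    """Split Korean text into eojeols at white space.

    Assumption: white space is the only delimiter; punctuation attached
    to an eojeol is not handled in this sketch.
    """
    return text.split()


eojeols = segment_eojeols("내가 사과를 먹었다")
print(len(eojeols), eojeols)
```

Applied to the example sentence, the sketch yields the three eojeols named in Note 2. Analysing each eojeol into a stem and its grammatical affixes is a separate step that this sketch does not attempt.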
multiword expression
MWE
lexeme (3.1.9) made up of a sequence of two or more lexemes that has properties that are not necessarily predictable from the properties of the individual lexemes or their normal mode of combination
EXAMPLE “To kick the bucket”, an idiomatic expression which means to die rather than to hit a bucket with one's foot. An idiomatic expression is a subtype of MWE whose properties are not predictable from the properties of the individual lexemes.
Note 1 to entry: An MWE can be a compound (3.1.9.1.1), a fragment of a sentence (3.1.27), or a sentence. The group of lexemes making up an MWE can be continuous or discontinuous. It is not always possible to mark an MWE with a part of speech (3.1.23).
lemma
lemmatized form
canonical form
base form
conventional form chosen to represent a lexeme (3.1.9)
Note 1 to entry: In European languages, the lemma is usually the singular if there is a variation in number, the masculine form if there is a variation in gender, and the infinitive for all verbs. In some languages, certain nouns are defective in the singular form; in these cases, the plural is chosen. For verbs in Arabic, the lemma is usually deemed to be the third person singular with the accomplished aspect.
Note 2 to entry: The term “lemma” is most often used in the context of corpora, as a device to capture the identity of tokens (3.4.5) and establish basic correspondence between a token and a lexical entry (3.2.2). The term that corresponds to lemma in the context of lexicons (3.2.1.1) is “headword”. Mismatches between the two are possible due to the varying macro- and microstructure of lexical entries. In order to handle such mismatches, apart from lemmas, direct references to dictionary entries are sometimes added to tokens or word forms (3.1.13) in corpora.
lemmatization
process of determining the lemma (3.1.10) for a given word form (3.1.13) in a context
EXAMPLE Given the word “found” in English, lemmatization results in “find” as its lemma.
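The phrase “in a context” matters because one word form can belong to more than one lexeme. The Python sketch below illustrates this with a small lookup table; the table contents, the function name lemmatize and the tag labels are all invented for this example and do not come from any tagset defined in this document.

```python
# Hypothetical lookup table: (word form, part-of-speech tag) -> lemma.
# The tags "V.PAST", "V.INF" and "V.PRES" are invented for this sketch.
LEMMA_TABLE = {
    ("found", "V.PAST"): "find",   # "She found the key."
    ("found", "V.INF"): "found",   # "They plan to found a company."
    ("finds", "V.PRES"): "find",
}


def lemmatize(word_form, pos_tag):
    """Return the lemma for a word form in its context, or None if unknown."""
    return LEMMA_TABLE.get((word_form.lower(), pos_tag))


print(lemmatize("found", "V.PAST"))
print(lemmatize("found", "V.INF"))
```

Here the context is reduced to a single tag; a practical lemmatizer would normally derive that information from the surrounding text.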
word sense
meaning associated with a lexeme (3.1.9) in a context
Note 1 to entry: The ‘river bank’ sense of “bank” and the ‘financial institution’ sense of “bank” are considered to be two different word senses (3.1.12), or lexical units, with the same word form (3.1.13), or lexeme (3.1.9). “I called him on the radio” and “Call me a taxi” are associated with different word senses of the lexeme “call”. Unrelated senses, as in “bank”, are called homonyms. Senses of the same word form or lexeme which are clearly related (and can be difficult to distinguish) are called polysemes, e.g. “image” in “coins with an image of the king”, “preoccupied with body image” and “evokes a strong mental image”.
word form
morpho-syntactic unit
abstract instantiation of a lexeme (3.1.9) with the values of morphosyntactic features (3.1.15) fixed in a syntactic context
EXAMPLE In English, the strings “find”, “finds”, “found” and “finding” are word forms of the word “find”.
Note 1 to entry: Word forms can have no acoustic or graphic realization, or can correspond to one or more tokens (3.4.5), not necessarily forming a contiguous sequence.
inflected form
concrete form that a lexeme (3.1.9) can take when used in a sentence (3.1.27) or a phrase (3.1.25)
morphosyntactic feature
feature induced from either the inflected form (3.1.14) of a lexeme (3.1.9) or from its syntactic context, or both
EXAMPLE “grammaticalGender”.
Note 1 to entry: Universal Dependencies (see [4]) offer a set of general and language-specific features and values, designed for pragmatically uniform cross-linguistic grammatical description.
word structure
internal structure of a word (3.1.9.1) resulting from the morphological analysis
Note 1 to entry: In agglutinative languages, such as Korean, Japanese and Turkish, a word may consist of a sequence of morphemes (3.1.8), with a comparatively high morpheme-per-word ratio, where each affix (3.1.8.2.1) involved (both derivational and inflectional) typically expresses a particular grammatical meaning in a clear, one-to-one way. The structure of a word in these languages can be very sophisticated, with free morphemes (3.1.8.1) and separate affixes as its constituent elements.
stem
linguistic unit whose form is smaller than or equal to the form of a single lexeme (3.1.9) and that may be affected by an inflectional, agglutinative, compositional or derivational process
agglutination
process of concatenating one or more affixes (3.1.8.2.1) to a stem (3.1.17)
derivation
change in the form of a word (3.1.9.1) to create a new word
Note 1 to entry: The change is usually done by modifying the stem (3.1.17) or by affixation.
compounding
word formation in which a new word is formed by adjoining at least two lexemes (3.1.9), in their original forms or with slight transformations
abbreviation
abbreviated form
designation that is formed by omitting parts from its full form and that represents the same concept (3.12.1.3)
[SOURCE: ISO 1087, modified — Note 1 to entry has been deleted.]
borrowing
process of word formation in which a linguistic expression is adopted from another language (3.16.1), usually when no term (3.12.1.4) exists for the new object (3.7.4.5) or concept (3.12.1.3)
part of speech
POS
grammatical category
word class
category assigned to a word (3.1.9.1) based on its grammatical and semantic properties
EXAMPLE Noun, verb.
measure word
〈Chinese text〉 part of speech (3.1.23) defining, along with numbers, the quantity (3.7.9.1) of a given object, or identifying specific objects with demonstrative pronouns such as “this” and “that”
Note 1 to entry: Whereas English speakers say “one person” or “this person”, Chinese speakers say respectively 一个人 (yi ge ren; numeral + measure word + noun; one person) or 这个人 (zhe ge ren; demonstrative pronoun + measure word + person; this person), where 个 (ge) is a measure word.
Note 2 to entry: A set of “verbal measure words” is used to count the number of times an action occurs, rather than the number of items. For example, in the sentence 我去过三次北京 (wo qu guo san ci Beijing; pronoun + verb + auxiliary word + numeral + measure word + proper noun; I have been to Beijing three times), 次 (ci) functions as a measure word that combines with the numeral 三 to derive the adverb 三次 (sanci), which modifies the verb 去 (qu).
syntax
way in which word forms (3.1.13) are interrelated and/or grouped together into phrases (3.1.25), thus capturing the relations that exist between those units
phrase
group of words (3.1.9.1) that perform a grammatical function (3.6.3) and that form a conceptual unit within a sentence (3.1.27)
Note 1 to entry: Empty phrases are permitted (being non-realised pronouns, sometimes marked as “pro”, and having the role of subjects in clauses (3.1.26)). A phrase is typically named after its syntactic head (3.6.2.3), for example noun phrases, verb phrases, adjective phrases, adverbial phrases and prepositional phrases. Phrases have been informally described as “bloated words”, in that the parts of the phrase added to the head elaborate and specify the reference of the head. In our model, a phrase is a special case of a constituent (3.6.2).
bunsetsu
〈Japanese text〉 phrase (3.1.25) without internal modifying relations
EXAMPLE The sentence 私は学校へ早く行きました (I went to school early) consists of four bunsetsus: 私は (watashiwa), 学校へ (gakkoue), 早く (hayaku) and 行きました (ikimashita), in which
- 私 (watashi) is a pronoun,
- は (wa) is a particle (3.1.8.2.1.2),
- 学校 (gakkou) is a noun,
- へ (e) is a particle,
- 早く (hayaku) is an adjective in adverbial usage,
- 行き (iki) is a verbal stem (3.1.17), followed by
- まし (mashi), an auxiliary verb denoting politeness, and
- た (ta), an auxiliary verb indicating the past tense.
Note 1 to entry: A bunsetsu normally consists of a noun plus its particle(s) or a verb plus its ending(s) (3.1.8.2.1.1), auxiliary verb(s) or particle(s) as shown in the example above.
noun phrase
NP
group of words that function together syntactically as a noun
Note 1 to entry: An NP typically consists of a noun, one or more determiners, and head modifiers. Other cases include NPs consisting of a personal pronoun, a proper name or a conjunction of nouns instead of a single noun.
noun phrase head
DEPRECATED: head
noun or a conjunction of nouns that forms the central element of an NP (3.1.25.2)
restrictor
part of an NP (3.1.25.2) consisting of the noun phrase head (3.1.25.2.1) and modifiers (3.6.2.2) (if present)
clause
group of phrases (3.1.25)
Note 1 to entry: A clause usually contains a predicate.
Note 2 to entry: A clause can be either a main clause (3.1.26.1) or a subordinate clause (3.1.26.2). In languages (3.16.1) distinguishing finiteness, clauses whose predicate is a verb can be either finite or non-finite, depending on the form of the verb. A main clause alone can build a complete sentence (3.1.27). In the SynAF model, a clause is a special case of a constituent (3.6.2).
main clause
clause (3.1.26), which can act on its own as a complete sentence (3.1.27)
Note 1 to entry: In languages (3.16.1) distinguishing finiteness, the main clause is usually finite.
EXAMPLE The train is late.
subordinate clause
clause (3.1.26) which fulfils a grammatical function (3.6.3) in a phrase (3.1.25) or in another clause
EXAMPLE A relative clause modifies the head noun of a nominal phrase.
Note 1 to entry: A subordinate clause usually does not act on its own as a sentence (3.1.27), but is part of a larger sentence.
sentence
related group of word forms (3.1.13) containing a predication
Note 1 to entry: A sentence consists of one or more clauses (3.1.26), usually expressing a complete thought and forming the basic unit of discourse structure (3.7.4.2). When describing speech, it is common to talk about utterances (3.7.2.2) rather than sentences.
script
set of graphic characters (3.1.33) used for the written form of one or more languages (3.16.1)
EXAMPLE Hiragana, Katakana, Latin and Cyrillic.
Note 1 to entry: The description of scripts ranges from a high-level classification such as hieroglyphic or syllabic writing systems versus alphabets to a more precise classification like Roman versus Cyrillic. Scripts are defined by a list of values taken from ISO 15924.
Note 2 to entry: A script, as opposed to an arbitrary subset of characters, is defined in distinction to other scripts; it is possible that readers of one script are unable to read another script easily, even where there is a historic relation between them.
[SOURCE: ISO/IEC 10646, 3.48, modified — Example and Note 1 to entry have been added.]
transcription
〈process〉 modelling of spoken language (3.15.1) by means of written symbols
transcription
〈process result〉 result of the process of transcription (3.1.29)
grapheme
minimal unit in a written language
EXAMPLE Letter, pictogram, ideogram, numeral, punctuation.
homograph
each of two or more word forms (3.1.13) or words (3.1.9.1) with identical spelling but representing different concepts (3.12.1.3) (semantic homography) or syntactic functions (syntactic homography)
graphic character
character
element of a writing system, whether or not alphabetical, that represents a grapheme (3.1.31), a syllable, a word (3.1.9.1) or even prosodic characteristics of the language, by using graphical symbols (letters, diacritical marks, syllabic signs, punctuation marks, prosodic accents, etc.) or a combination of these signs (a letter having an accent or a diacritical mark)
EXAMPLE a, B, ω or Γ are, therefore, characters as well as basic letters.
3.2 Language Resource Management
lexical resource
lexical database
database consisting of one or several lexicons (3.2.1.1)
lexicon
lexical resource (3.2.1) comprising a collection of lexical entries (3.2.2) for a language (3.16.1)
lexical entry
container for managing a set of word forms (3.1.13) and possibly one or more meanings to describe a lexeme (3.1.9)
lexical item
entry in a lexicon (3.2.1.1) that is a lexeme (3.1.9) or one of its variant forms
Note 1 to entry: Headed by a lemma (3.1.10), each lexical item may be either a free-standing word (or one of its variant word form (3.1.13)) or a bound (non-free-standing) form such as stems (3.1.17) and affixes (3.1.8.2.1).
primary data
electronic representation of language data
EXAMPLE Digital representations of text (3.16.9), transcription (3.1.29) of speech, gestures or multimodal dialogue (3.7.2.1).
Note 1 to entry: Typically, primary data objects are addressed by “locations” in an electronic file, for example, the span of characters comprising a sentence (3.1.27) or word, or a point at which a given temporal event begins or ends (as in speech annotation). More complex data objects may consist of a list or set of contiguous or non-contiguous locations in primary data.
Note 2 to entry: Semantic annotation (3.2.7.3) may relate to non-verbal or multimodal data, such as stretches of spoken dialogue with accompanying gestures and facial expressions, and even gestures and/or facial expressions without any accompanying speech.
annotate
add information to primary data (3.2.4)
annotation
〈process〉 adding information to primary data (3.2.4), independent of its representation (3.2.8)
annotation
〈markup〉 information added to primary data (3.2.4), independent of its representation (3.2.8)
segmentation annotation
annotation (3.2.7) that delimits linguistic elements that appear in the primary data (3.2.4)
Note 1 to entry: These elements include (1) continuous segments (appearing contiguously in the primary data), (2) super- and sub-segments, where groups of segments will comprise the parts of a larger segment (e.g. contiguous word segments typically comprise a sentence segment), (3) discontinuous segments (linking continuous segments), and (4) landmarks (e.g. timestamps) that note a point in the primary data. In current practice, segmental information may or may not appear in the document containing the primary data itself.
linguistic annotation
annotation (3.2.7) that provides linguistic information about the segments in the primary data (3.2.4)
EXAMPLE Morphosyntactic annotation in which a part of speech (3.1.23) and lemma (3.1.10) are associated with each segment in the data.
Note 1 to entry: The identification of a segment as a word, sentence (3.1.27), NP (3.1.25.2), etc. also constitutes linguistic annotation. In current practice, when it is possible to do so, segmentation and identification of the linguistic role or properties of that segment are often combined (e.g. syntactic bracketing, or delimiting each word in the document with an XML element (3.12.4.3) that identifies the segment as a word or sentence).
semantic annotation
annotation (3.2.7) which contains information about the meaning of a segment or region (3.7.5.1) of primary data (3.2.4)
dependency annotation
annotation (3.2.7) that encodes the dependency relations (3.6.4) between character spans (3.13.7)
Note 1 to entry: An example of a dependency relation (see ISO 24615-1:2014, 3.5) is one between a verb and its subject or direct object, between an attributive adjective and its head noun, or between a preposition and the head of its dependent NP (3.1.25.2). Dependency relations may be defined at the word-level alone, or may involve higher-level syntactic constructs, in which case it is possible to speak of mixed hierarchical-dependency annotations.
hierarchical annotation
annotation (3.2.7) that encodes the relationship of dominance (often also precedence) necessary to define syntactic trees (3.6.1.1) over character spans (3.13.7)
Note 1 to entry: Annotating hierarchical relationships requires only the relation of dominance to be indicated. Precedence is typically implicit in the ordering of character spans.
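The point of Note 1, that dominance is sufficient and precedence can be recovered from the character spans, can be illustrated with a short Python sketch. The node labels, the span offsets and the function names extent and ordered_children are assumptions made for this example only.

```python
# Character spans of four word segments in "The train is late."
spans = {"w1": (0, 3), "w2": (4, 9), "w3": (10, 12), "w4": (13, 17)}

# Dominance only; children are deliberately listed out of order.
dominance = {"S": ["VP", "NP"], "NP": ["w2", "w1"], "VP": ["w4", "w3"]}


def extent(node):
    """Return the (start, end) character extent covered by a node."""
    if node in spans:
        return spans[node]
    child_extents = [extent(child) for child in dominance[node]]
    return (min(s for s, _ in child_extents),
            max(e for _, e in child_extents))


def ordered_children(node):
    """Recover precedence among the children from their start offsets."""
    return sorted(dominance[node], key=lambda child: extent(child)[0])


print(ordered_children("S"))
print(ordered_children("NP"))
```

The sketch assumes that the constituents are continuous and do not overlap; discontinuous constituents would need a richer notion of extent than a single pair of offsets.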
simple annotation
annotation (3.2.7) that constitutes a single information package whose interpretation is not dependent on other annotations
Note 1 to entry: This definition is intended to distinguish the simplest (“tabular”) kind of annotation from more complex relational structures (providing hierarchical, dependency, or alignment information); simple annotations are the only kind of annotations present at the linear level of complexity.
stand-off annotation
annotation (3.2.7) layered over primary data (3.2.4) and serialized in a document separate from that containing the primary data
Note 1 to entry: Stand-off annotations refer to specific locations in the primary data, by addressing character offsets, elements, etc. to which the annotation applies. Multiple stand-off annotation documents for a given type of annotation can refer to the same primary document (e.g. two different part of speech annotations for a given text (3.16.9)).
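A small Python sketch may help to picture the arrangement described in Note 1. It keeps the primary data untouched and stores two alternative part-of-speech layers as separate structures that address the text by character offsets. The variable names, the function name resolve and the tag labels are illustrative assumptions; in practice each layer would be serialized in its own document.

```python
# Primary data: never modified by annotation.
PRIMARY = "The train is late."

# Two independent stand-off layers over the same primary data.
# Offsets are character positions (start inclusive, end exclusive).
LAYER_A = [
    {"start": 0, "end": 3, "pos": "DET"},
    {"start": 4, "end": 9, "pos": "NOUN"},
    {"start": 10, "end": 12, "pos": "AUX"},
    {"start": 13, "end": 17, "pos": "ADJ"},
]
LAYER_B = [
    {"start": 0, "end": 3, "pos": "DT"},
    {"start": 4, "end": 9, "pos": "NN"},
    {"start": 10, "end": 12, "pos": "VBZ"},
    {"start": 13, "end": 17, "pos": "JJ"},
]


def resolve(primary, layer):
    """Pair each annotation with the stretch of primary data it addresses."""
    return [(primary[a["start"]:a["end"]], a["pos"]) for a in layer]


print(resolve(PRIMARY, LAYER_A))
print(resolve(PRIMARY, LAYER_B))
```

Because the layers refer to locations rather than embedding markup in the text, either layer can be added, replaced or removed without changing the primary data.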
representation
format in which the annotation (3.2.7) is rendered, independent of its content
EXAMPLE XML (3.12.4.1), list or bracketed format, tab-delimited text (3.16.9).
3.3 Feature Structures
feature value
value
entity or aggregation of entities that characterize some property or aspect of another entity
Note 1 to entry: There are two kinds of feature values: atomic value (3.3.1.1) and complex value (3.3.1.2).
atomic value
feature value (3.3.1) without internal structure
Note 1 to entry: Feature structure (3.3.7) and collection (3.3.2) are not atomic values.
complex value
feature value (3.3.1) represented either as a feature structure (3.3.7) or as a collection (3.3.2)
admissible feature value
admissible value
value restriction
range restriction
feature value (3.3.1) by which the value of an admissible feature (3.3.15) must be subsumed in a feature structure (3.3.7) of a given type (3.3.22)
default value
feature value (3.3.1) assigned to a feature (3.3.5) when no value is specified
EXAMPLE Masculine is the default value of the grammatical gender in Dutch.
Note 1 to entry: A feature structure (3.3.7) may not bear a feature without a corresponding value.
collection
〈Feature structures〉 feature value (3.3.1) consisting of potentially many values, organized as a list, set or bag (3.3.3)
Note 1 to entry: A list is an ordered collection of entities (3.7.3.4) some of which may be identical. A set is an unordered collection of unique entities. A bag (3.3.3) is an unordered collection of entities that may or may not be unique; it is sometimes referred to as a multiset.
bag
multiset
triple of an integer n, a set S and a function that maps the integers in the range, 1 to n, to elements of S
Note 1 to entry: A bag is halfway between a set (in that its elements are unordered) and a list (in that particular elements can occur more than once).
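The contrast between list, set and bag can be sketched in Python, taking `collections.Counter` as one possible stand-in for a bag. The sample values and the variables `n`, `S` and `f`, which mirror the triple in the definition, are illustrative assumptions only.

```python
from collections import Counter

values = ["sg", "pl", "sg"]

as_list = list(values)      # ordered, duplicates kept
as_set = set(values)        # unordered, duplicates collapsed
as_bag = Counter(values)    # unordered, duplicates counted

# Order matters for lists but not for sets or bags.
assert as_list != ["sg", "sg", "pl"]
assert as_set == {"pl", "sg"}
assert as_bag == Counter(["sg", "sg", "pl"])

# The bag as a triple: an integer n, a set S, and a function f
# mapping the integers 1..n to elements of S.
n, S, f = 3, {"sg", "pl"}, {1: "sg", 2: "pl", 3: "sg"}
assert Counter(f[i] for i in range(1, n + 1)) == as_bag

# A bag remembers multiplicity; a set does not.
print(as_bag["sg"], len(as_set))
```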
underspecification
provision of partial information about a feature value (3.3.1)
Note 1 to entry: An underspecification generally subsumes one of a range of candidate values that could be resolved to a single value through subsequent constraint resolution. See subsumption (3.3.12).
feature
property or aspect of an entity that is formally represented as a function mapping the entity to a corresponding feature value (3.3.1)
Note 1 to entry: The combination of a feature and a feature value constitutes a feature specification (3.3.6). For example, number is a feature, singular is a value, and the pair <number, singular> is a feature specification.
feature specification
pairing of a feature (3.3.5) with a feature value (3.3.1) in a feature structure (3.3.7) description
feature structure
record structure that associates one feature value (3.3.1) to each of a collection of features (3.3.5)
Note 1 to entry: Each feature value is either a feature structure or a simpler built-in (3.3.17) such as a string.
Note 2 to entry: Feature structures are partially ordered. The minimal feature structures in this ordering are the empty feature structures (3.3.7.1).
empty feature structure
feature structure (3.3.7) that contains no information
Note 1 to entry: An empty feature structure subsumes all other feature structures.
typed feature structure
feature structure (3.3.7) labelled by a type (3.3.22)
Note 1 to entry: In the graph notation (3.3.9), each node (3.5.5.1) is labelled with a type. In the matrix notation (3.3.8), a type is ordinarily placed at the upper left corner of the inside of the pair of square brackets that represents a typed feature structure. In XML (3.12.4.1) notation, the type is supplied as the feature value (3.3.1) of a type attribute on the <fs> element.
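A minimal sketch of the XML notation mentioned in the note above, built with Python's standard `xml.etree.ElementTree` module. Only the `<fs>` element and its `type` attribute are taken from the note; the `<f>` and `<symbol>` child elements are assumed here to follow the feature structure representation of ISO 24610-1, and the type name `noun` and the feature `number` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Typed feature structure: the type is the value of the `type`
# attribute on the <fs> element.
fs = ET.Element("fs", {"type": "noun"})

# One feature specification: the feature "number" with the value
# "singular" (child element names are an assumption of this sketch).
f = ET.SubElement(fs, "f", {"name": "number"})
ET.SubElement(f, "symbol", {"value": "singular"})

print(ET.tostring(fs, encoding="unicode"))
```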
matrix notation
attribute-value matrix
AVM
notation that uses square brackets to represent feature structures (3.3.7)
Note 1 to entry: In a matrix notation, each row represents a feature specification (3.3.6), with the feature name and the feature value (3.3.1) separated by a colon (:), space ( ) or the equals sign (=).
graph notation
notation that represents a feature structure (3.3.7) as a single-rooted graph (3.5.5)
path
〈feature structures〉 sequence of labelled arcs connecting nodes (3.5.5.1) in a graph (3.5.5)
incompatibility
relation between two feature structures (3.3.7) which have conflicting types (3.3.22) or at least one common feature (3.3.5) with incompatible feature values (3.3.1)
Note 1 to entry: Two feature structures that are incompatible cannot be unified. The empty feature structure (3.3.7.1) is compatible with any other feature structure.
subsumption
relationship between two feature structures (3.3.7) in which one is more specific than the other
Note 1 to entry: A feature structure A is said to subsume a feature structure B if B is at least as informative as A, i.e. if B carries all of the information contained in A. Subsumption is a reflexive, antisymmetric and transitive relation on feature structures.
extension
relationship between two feature structures (3.3.7) in which one is more general than the other
Note 1 to entry: A feature structure (3.3.7) F extends G if and only if G subsumes F.
Note 2 to entry: Converse of subsumption (3.3.12).
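The relation between subsumption and extension can be sketched as follows. The sketch takes the subsuming structure to be the more general one, in line with the note that an empty feature structure subsumes all other feature structures. Modelling feature structures as nested Python dictionaries with string atoms is an assumption of this example; typing and structure sharing are ignored, and the feature names are hypothetical.

```python
def subsumes(general, specific):
    """Return True if `general` subsumes `specific`.

    Feature structures are modelled as nested dicts and atomic
    values as strings (an assumption of this sketch).
    """
    if isinstance(general, dict) and isinstance(specific, dict):
        # Every feature of the general structure must be present in
        # the specific one, with a value that is itself subsumed.
        return all(
            feature in specific and subsumes(value, specific[feature])
            for feature, value in general.items()
        )
    return general == specific

def extends(f, g):
    """F extends G if and only if G subsumes F."""
    return subsumes(g, f)

empty = {}
noun = {"pos": "noun"}
plural_noun = {"pos": "noun", "agr": {"number": "plural"}}

print(subsumes(empty, noun), subsumes(noun, plural_noun), subsumes(plural_noun, noun))
```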
interpretation
〈feature structures〉 minimally informative (or equivalently, most general) extension (3.3.13) of a feature structure (3.3.7) that is consistent with a set of constraints (3.3.18) declared by a feature system declaration (3.3.25)
structure sharing
re-entrancy
relation between two or more features (3.3.5) within a feature structure (3.3.7) that share a feature value (3.3.1)
admissible feature
appropriate feature
feature (3.3.5) for which any feature structure (3.3.7) of a given type (3.3.22) may bear a feature value (3.3.1)
Note 1 to entry: This term is often interpreted elsewhere to mean obligatory, i.e. feature structures of the given type must bear a value for every admissible feature. In this document, the term does not imply that the feature is obligatory.
admissibility constraint
feature admissibility constraint
specification of a set of admissible features (3.3.15) and admissible feature values (3.3.1.3) associated with a specific type (3.3.22)
built-in
non-user-defined element that may appear in place of a feature structure (3.3.7)
Note 1 to entry: A built-in can appear, for example, as a feature value (3.3.1).
Note 2 to entry: Built-ins can be atomic or complex. The atomic built-ins are numeric, string, symbol and binary. The complex built-ins are collections (3.3.2) and applications of the operators, i.e. alternation (3.3.31), negation (3.3.28) and merge (3.3.27).
constraint
unit of specification that identifies some collection of feature structures (3.3.7) as invalid
Note 1 to entry: All constraints are implicational in their syntactic form, although some are distinguished as admissibility constraints (3.3.16). See validity (3.3.21) and ISO 24610-2, 5.4.
Note 2 to entry: A feature structure that has not been so identified by any of the constraints in a feature system (3.3.24.1.1) is considered to be valid.
implicational constraint
constraint (3.3.18) of the form, “if G, then H,” where G and H are feature structures (3.3.7)
Note 1 to entry: This identifies as invalid any feature structure F for which G subsumes F, and yet F and H have no valid extension (3.3.13) in common. See subsumption (3.3.12) and ISO 24610-2, 8.5. The term is often used to refer to implicational constraints that are not also admissibility constraints (3.3.16).
boxed label
label (3.6.9) in a box used in a matrix notation (3.3.8) to denote a feature value (3.3.1) shared by several features (3.3.5)
Note 1 to entry: The label may be any alphanumeric symbol.
well-formedness
syntactic conformity of a feature structure (3.3.7) representation to ISO 24610-1
validity
conformity of a typed feature structure (3.3.7.2) to the constraints (3.3.18) of a particular feature system (3.3.24.1.1)
type
name of a class of entities
Note 1 to entry: Feature structures (3.3.7) may be characterized by grouping them into certain classes. Types are used to name such classes.
subtype
type (3.3.22) to which another type confers its constraints (3.3.18) and admissible features (3.3.15)
supertype
base type
type (3.3.22) from which another type inherits constraints (3.3.18) and admissible features (3.3.15)
Note 1 to entry: s is a subtype of t if and only if t is a supertype of s. Every type is a subtype and supertype of itself.
atomic type
user-defined type (3.3.22) with no admissible features (3.3.15) declared or inherited
type declaration
structure that declares the supertypes (3.3.22.2), admissible features (3.3.15), admissible feature values (3.3.1.3), admissibility constraints (3.3.16) and implicational constraints (3.3.18.1) for a given type (3.3.22)
Note 1 to entry: The constraints on a type in the resulting feature system (3.3.24.1.1) are those that have been declared in its declaration, in addition to those that it has inherited from its supertypes.
partial order
partially ordered set
set S equipped with a relation ≤ over S × S that is (1) reflexive (for all s ∈ S, s ≤ s), (2) anti-symmetric (for all p, q ∈ S, if p ≤ q and q ≤ p, then p = q), and (3) transitive (for all p, q, r ∈ S, if p ≤ q and q ≤ r, then p ≤ r)
Note 1 to entry: The set of integers Z is partially ordered, but it has an additional property: for every p, q ∈ Z, either p ≤ q or q ≤ p. Not all partial orders have this property. The taxonomical classification of organisms into phyla, genera and species, for example, is a partial order that does not. Type hierarchies do not necessarily have it either. The typed feature structures (3.3.7.2) of a feature system (3.3.24.1.1) do not have it, unless (a) their type hierarchy (3.3.24.1) does, and (b) either the type hierarchy has exactly one type (3.3.22), or every type is constrained to have exactly one admissible feature (3.3.15).
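The three conditions in the definition, and the additional property discussed in the note, can be checked mechanically on a finite set. The following sketch is illustrative only; the helper names and the three-type hierarchy (`sign`, `word`, `phrase`) are hypothetical.

```python
from itertools import product

def is_partial_order(elements, leq):
    """Check reflexivity, anti-symmetry and transitivity of `leq`."""
    elements = list(elements)
    reflexive = all(leq(s, s) for s in elements)
    antisymmetric = all(
        p == q or not (leq(p, q) and leq(q, p))
        for p, q in product(elements, repeat=2)
    )
    transitive = all(
        leq(p, r) or not (leq(p, q) and leq(q, r))
        for p, q, r in product(elements, repeat=3)
    )
    return reflexive and antisymmetric and transitive

def is_total(elements, leq):
    """Additional property: every pair of elements is comparable."""
    elements = list(elements)
    return all(leq(p, q) or leq(q, p) for p, q in product(elements, repeat=2))

# A finite range of integers under <=.
ints = range(-2, 3)
int_leq = lambda p, q: p <= q

# Hypothetical type hierarchy: "sign" is the supertype of "word"
# and "phrase"; every type is also a supertype of itself.
supertypes = {
    "sign": {"sign"},
    "word": {"word", "sign"},
    "phrase": {"phrase", "sign"},
}
type_leq = lambda s, t: t in supertypes[s]  # s is a subtype of t

print(is_partial_order(ints, int_leq), is_total(ints, int_leq))
print(is_partial_order(supertypes, type_leq), is_total(supertypes, type_leq))
```

In this sketch `word` and `phrase` are not comparable, so the type hierarchy is a partial order without the additional property that the integers have.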
type hierarchy
partial order (3.3.24) over a set of types (3.3.22)
Note 1 to entry: See ISO 24610-1, Annex C, Type inheritance hierarchies.
feature system
type hierarchy (3.3.24.1) in which each type (3.3.22) has been associated with a collection of admissibility constraints (3.3.16) and implicational constraints (3.3.18.1)
Note 1 to entry: Cf. type declaration (3.3.23).
feature system declaration
FSD
specification of a particular feature system (3.3.24.1.1)
semantic type
DEPRECATED: type
referring expression that distinguishes a collection of feature structures (3.3.7) as an identifiable and conceptually significant class
merge
generic operation that includes union (3.3.30) of sets or bags (3.3.3) and concatenation (3.3.32) of lists
negation
(unary) operation on a feature value (3.3.1) denoting any other value incompatible with it
unification
operation that combines two compatible feature structures (3.3.7) into the least informative feature structure that contains the information from the two
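A minimal sketch of unification, again modelling feature structures as nested Python dictionaries with string atoms. This representation is an assumption of the example: types, collections and structure sharing are left out, and the `unify` helper and the feature names are hypothetical.

```python
def unify(f, g):
    """Unify two feature structures modelled as nested dicts.

    Returns the combined structure, or None if the two structures
    are incompatible. Atomic values are plain strings.
    """
    if isinstance(f, dict) and isinstance(g, dict):
        result = dict(f)
        for feature, value in g.items():
            if feature in result:
                merged = unify(result[feature], value)
                if merged is None:
                    return None          # conflicting values
                result[feature] = merged
            else:
                result[feature] = value  # information only in g
        return result
    # Atomic values unify only with themselves.
    return f if f == g else None

noun = {"pos": "noun"}
plural = {"agr": {"number": "plural"}}
singular = {"agr": {"number": "singular"}}

print(unify(noun, plural))
print(unify(plural, singular))
```

The result contains the information of both inputs and nothing more, which is what makes it the least informative structure that combines the two.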
union
operation that combines two sets, or bags (3.3.3), into one
Note 1 to entry: The corresponding operation for lists is concatenation (3.3.32).
alternation
operation on feature values (3.3.1) that returns one and only one of the values supplied as its arguments
Note 1 to entry: Given a feature specification F: a|b, where a|b denotes the alternation of a and b, F has either the value a or the value b, but not both.
concatenation
operation of combining two lists of feature values (3.3.1) into a single list
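The way merge covers both union and concatenation can be sketched with Python's built-in collection types. The `merge` helper is hypothetical. Treating a bag as a `collections.Counter` and adding multiplicities on bag union are assumptions of this sketch, not requirements stated in this document.

```python
from collections import Counter

def merge(a, b):
    """Generic merge: concatenation for lists, union for sets and bags."""
    if isinstance(a, list) and isinstance(b, list):
        return a + b   # concatenation keeps order and duplicates
    if isinstance(a, Counter) and isinstance(b, Counter):
        return a + b   # bag union; multiplicities are added (assumption)
    if isinstance(a, (set, frozenset)) and isinstance(b, (set, frozenset)):
        return a | b   # set union collapses duplicates
    raise TypeError("merge expects two lists, two sets or two bags")

print(merge(["a"], ["b", "a"]))
print(merge({"a"}, {"a", "b"}))
print(merge(Counter("ab"), Counter("a")))
```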
typing
assignment of a semantic type (3.3.26) to a built-in (3.3.17) or feature structure (3.3.7), either atomic or complex
Note 1 to entry: Semantic types in feature systems (3.3.24.1.1) are partially ordered, with multiple inheritance.
3.4 Morphosyntactic Annotation Framework (MAF)
FSA
finite state automata
graphs (3.5.5) made up of states with an initial state and a final state, and a finite set of transitions from state to state
Note 1 to entry: See also directed acyclic graph (3.4.2).
directed acyclic graph
DAG
digraph
graph (3.5.5) with directed edges (3.5.5.2) and no cycles
Note 1 to entry: DAGs are a subset of FSA (3.4.1).
morphosyntactic tag
label identifying a feature structure (3.3.7) used to qualify a word form (3.1.13) within an established taxonomy
Note 1 to entry: Morphosyntactic tags can be atomic labels (“N” for “noun”), but very often they are mnemonic representations for the feature structures that they identify (“NNL2” for “plural locative noun” in the CLAWS-7 tagset, see [10]). The relevant feature structures can also be encoded by character vectors, as in “N12201” for “common noun, feminine, plural, countable” in the EAGLES intermediate tagset (see [11]) or by agglutinated shorthand feature identifiers, as in “subst:pl:gen:m3” for “noun, plural, genitive, masculine, inanimate” in the NKJP tagset (see [12]).
morphosyntactic tagset
comprehensive set of morphosyntactic tags (3.4.3) used for the morpho-syntactic description of a language (3.16.1)
token
non-empty contiguous sequence of graphic characters (3.1.33) in a document
Note 1 to entry: For editorial reasons, some annotation schemes extend the notion of token to an empty sequence.
tokenization
process that segments a language data stream into individual tokens (3.4.5)
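As a rough illustration, the following sketch segments a character stream into tokens and records the character offsets of each one. The segmentation rule used here (runs of word characters, or single punctuation marks) is a hypothetical choice for the example; actual tokenization conventions depend on the language and the annotation scheme.

```python
import re

def tokenize(text):
    """Segment text into (token, start, end) triples.

    Each token is a non-empty contiguous sequence of characters;
    offsets are 0-based and end-exclusive (an assumption here).
    """
    pattern = r"\w+|[^\w\s]"
    return [(m.group(), m.start(), m.end()) for m in re.finditer(pattern, text)]

tokens = tokenize("Arthur smiled.")
print(tokens)
```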
script conversion
representation of graphic characters (3.1.33) from a script (3.1.28) by the graphic characters of a target script, most commonly by romanization (3.4.8.1)
Note 1 to entry: The two basic methods of conversion of a system of writing are transliteration and transcription. The use of the terms “source script” and “target script” in transliteration is analogous to the terms “source language” and “target language” in translation.
[SOURCE: ISO 15919, 4.1, modified — “script” used as attribute of the main term.]
transliteration
representation of the graphic characters (3.1.33) of a source script (3.1.28) by the graphic characters of a target script
Note 1 to entry: In transcription, pronunciation conventions are of primary importance, while in transliteration, writing conventions are of primary importance.
[SOURCE: ISO 15919, 4.7]
romanization
conversion of non-Latin graphic characters (3.1.33) into Latin graphic characters, using either transliteration (3.4.8) or transcription (3.1.29)
word lattice
set of possible alternative decompositions of a text or speech segment into word forms (3.1.13)
Note 1 to entry: A word lattice has the algebraic properties of a directed acyclic graph (3.4.2) with an initial node (3.5.5.1) and a final node.
Note 2 to entry: See also DAG (3.4.2) and FSA (3.4.1).
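A word lattice can be sketched as a small directed acyclic graph whose paths from the initial node to the final node are the alternative decompositions. The example segment, the node numbering and the `decompositions` helper below are hypothetical.

```python
# Nodes are positions between word forms; each edge carries a word form.
# Node 0 is the initial node and node 3 the final node.
lattice = {
    0: [(1, "New"), (2, "New York")],
    1: [(2, "York")],
    2: [(3, "based")],
    3: [],
}

def decompositions(lattice, node, final):
    """List every path of word forms from `node` to `final`."""
    if node == final:
        return [[]]
    paths = []
    for target, word_form in lattice[node]:
        for rest in decompositions(lattice, target, final):
            paths.append([word_form] + rest)
    return paths

for path in decompositions(lattice, 0, 3):
    print(path)
```

Since the graph has no cycles, the recursion always terminates, and each path corresponds to one way of decomposing the segment into word forms.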
3.5 Linguistic Annotation Framework (LAF)
original artefact
artefact or annotation (3.2.7) from which the primary data (3.2.4) is derived
annotation document
XML document (3.12.4.2) containing annotations (3.2.7)
region
〈linguistic annotation framework〉 area in the primary data (3.2.4) defined by a non-empty, ordered list of anchors (3.5.4)
anchor
fixed, immutable position in the primary data (3.2.4) being annotated (3.2.5)
Note 1 to entry: The medium determines how an anchor is described. For example, text (3.16.9) anchors may be character offsets, audio anchors may be time offsets, video anchors may be time offsets or frame indices, image anchors may be coordinates.
graph
set of nodes (3.5.5.1) (vertices) V(G) and a set of edges (3.5.5.2) E(G)
node
vertex
terminal point in a graph (3.5.5) G, or the intersection of edges (3.5.5.2) in G
edge
ordered pair of nodes (3.5.5.1) [u,v] from V(G)
Note 1 to entry: The order of the nodes determines the direction of the edge.
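The definitions of graph, node and edge translate directly into a set of nodes and a set of ordered pairs. The sketch below also tests for the absence of cycles, the property that characterizes a directed acyclic graph. The node names and the `is_acyclic` helper are hypothetical, and the check uses a standard topological-sorting approach chosen by the editor rather than anything prescribed here.

```python
def is_acyclic(nodes, edges):
    """Return True if the directed graph (nodes, edges) has no cycle."""
    indegree = {v: 0 for v in nodes}
    for _, target in edges:
        indegree[target] += 1
    # Start from the nodes that have no incoming edge.
    ready = [v for v in nodes if indegree[v] == 0]
    visited = 0
    while ready:
        u = ready.pop()
        visited += 1
        for source, target in edges:
            if source == u:
                indegree[target] -= 1
                if indegree[target] == 0:
                    ready.append(target)
    # Nodes on a cycle never reach indegree 0 and are never visited.
    return visited == len(nodes)

V = {"a", "b", "c"}                  # V(G)
E = {("a", "b"), ("b", "c")}         # E(G): ordered pairs (u, v), directed from u to v

print(is_acyclic(V, E))
print(is_acyclic(V, E | {("c", "a")}))
```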
3.6 Syntactic Annotation Framework (SynAF)
syntactic graph
DEPRECATED: graph
connected set of syntactic nodes (3.6.1.2) and syntactic edges (3.6.1.3)
syntactic tree
syntactic graph (3.6.1) in which each syntactic node (3.6.1.2) has a single parent
syntactic node
DEPRECATED: node
word form (3.1.13) or constituent (3.6.2) seen as an elementary syntactic component of a syntactic analysis
terminal node
syntactic node (3.6.1.2) which is a single word form (3.1.13) or an empty element involved in a syntactic relation
non-terminal node
syntactic node (3.6.1.2) which is not a word form (3.1.13)
Note 1 to entry: A non-terminal node has an outgoing constituency syntactic edge (3.6.1.3).
syntactic edge
DEPRECATED: edge
triplet with a syntactic source node (3.6.1.2), a target node, and optional annotations (3.2.7)
Note 1 to entry: Non-terminal nodes (3.6.1.2.2) have an outgoing constituency syntactic edge.
constituent
syntactic grouping of words (3.1.9.1), phrases (3.1.25), or clauses (3.1.26) on the basis of structural (or hierarchical) properties
Note 1 to entry: Words can be grouped into phrases, phrases into clauses or other phrases and clauses into sentences (3.1.27).
chunk
non-recursive constituent (3.6.2)
modifier
part of a constituent (3.6.2) which ascribes a property to the syntactic head (3.6.2.3) of the constituent
Note 1 to entry: A modifier can be placed before or after the head of the phrase (3.1.25) (pre-modifier or post-modifier). Modifiers are optional in a constituent.
syntactic head
DEPRECATED: head
part of a constituent (3.6.2) which determines its distribution and its grammatical properties
Note 1 to entry: The head of a constituent usually cannot be left out.
Note 2 to entry: Distribution here refers to the syntactic environments in which the constituent may appear.
Note 3 to entry: The syntactic head determines the grammatical properties of a constituent in such a way that if the grammatical gender of the head is feminine, then the gender of the entire constituent will be feminine.
grammatical function
grammatical role of a word form (3.1.13) or constituent (3.6.2) within its embedding syntactic environment
Note 1 to entry: For example, a noun phrase (NP) (3.1.25.2) can act as a subject within a sentence (3.1.27), or a noun may act as a subject dependent of a verb in a dependency graph. There is a grammatical relation between the subject – NP and the main verb in a sentence. All grammatical relations (subject – predicate, syntactic head (3.6.2.3) – modifier (3.6.2.2), etc.) are subsumed under the concept of dependency relations (3.6.4), whether between terminal nodes (3.6.1.2.1) or non-terminal nodes (3.6.1.2.2).
dependency relation
dependency
syntactic relation between word forms (3.1.13) or constituents (3.6.2) on the basis of the grammatical functions (3.6.3) that constituents play in relation to each other
syntactic argument
one of the essential and functional constituents (3.6.2) in a clause (3.1.26) that identifies the participants in the process referred to by a lexeme (3.1.9)
EXAMPLE Alfred (syntactic argument) reads a book (syntactic argument) today (adjunct (3.6.6)).
adjunct
non-essential element associated with a verb as opposed to syntactic arguments (3.6.5)
Note 1 to entry: Adverbs are possible adjuncts for a sentence (3.1.27).
subcategorization frame
valency
valence
set of restrictions on a lexeme (3.1.9) indicating the properties of the syntactic arguments (3.6.5) that can or must occur with this given lexeme
domain
class of elements to which a certain set of labels (3.6.9) can be assigned
Note 1 to entry: Domains can refer generally to the set of all syntactic edges (3.6.1.3), terminal nodes (3.6.1.2.1) or non-terminal nodes (3.6.1.2.2).
label
unit of annotation (3.2.7) consisting of the name of a feature (3.3.5) and a feature value (3.3.1), which together can be applied to appropriate model elements and add arbitrary feature-value annotations to such elements
sequential representation
representation (3.2.8) of annotation content where the XML element (3.12.4.3) structure mirrors the sequence of linguistic objects in the primary source
3.7 Semantic Annotation Framework (SemAF)
3.7.1 Time and Events (SemAF-Time, ISO-TimeML)
event
eventuality
something that can be said to obtain or hold true, to happen or to occur
Note 1 to entry: This is a very broad notion of event that includes all kinds of actions, states, processes, etc. It is not to be confused with the narrower notion of event (as opposed to the notion of “state”) as something that happens at a certain point in time (e.g. the clock striking two or waking up) or during a short period of time (e.g. laughing). In TimeML, the term “event” is used in a broader sense and is equivalent to the term “eventuality”.
tense
way that languages (3.16.1) express the time at which an event (3.7.1.1) described by a sentence (3.1.27) occurs
Note 1 to entry: This is characterized as a property of a verb form. Noun forms will not be said to exhibit tense but rather temporal markers.
temporal interval
period
uninterrupted stretch of time, with internal point structure
Note 1 to entry: Time is often viewed as a straight line from minus infinity to plus infinity. A temporal interval is a part of that line without any holes, containing all the points between its beginning (3.7.1.7.1) and its end (3.7.1.7.2).
Note 2 to entry: In mathematics, an important issue is whether an interval includes its beginning and its end (is “closed”) or not (is “open” or “half-open”). In natural language descriptions of intervals this may also be relevant, as when describing an interval in terms of a number of days, but not with the same granularity as in mathematics. Cf. [14].
[SOURCE: Adapted from [15].]
temporal ordering relation
relation that determines how objects are ordered in time
EXAMPLE Precedence, simultaneity.
Note 1 to entry: There is a limited number of ways to order objects in time; these are collectively called ordering relations.
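The two ordering relations named in the example, precedence and simultaneity, can be sketched over temporal intervals represented by their beginning and end. The `Interval` class, the numeric timeline and the sample intervals are hypothetical, and the question of open versus closed intervals is deliberately set aside by using a strict comparison.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Interval:
    beginning: float  # instant at which the interval begins
    end: float        # instant at which the interval ends

def precedes(a, b):
    """Precedence: a is over before b begins."""
    return a.end < b.beginning

def simultaneous(a, b):
    """Simultaneity: a and b cover the same stretch of time."""
    return a.beginning == b.beginning and a.end == b.end

breakfast = Interval(8.0, 8.5)
meeting = Interval(9.0, 10.0)
call = Interval(9.0, 10.0)

print(precedes(breakfast, meeting), simultaneous(meeting, call))
```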
time amount
quantity (3.7.9.1) of time, measured by temporal units (3.7.1.6) over temporal intervals (3.7.1.3)
Note 1 to entry: A time amount is a measure of time that can be expressed in terms of a number of temporal units, such as “half an hour” or “30 minutes”.
[SOURCE: Adapted from [16].]
temporal unit
element in a time amount (3.7.1.5) that quantifies the length of a temporal interval (3.7.1.3) or a set of temporal intervals
Note 1 to entry: In measurement systems, various units are defined for different purposes. Small units such as seconds and minutes are defined to measure small temporal intervals; as one may want to avoid working with big numbers, for larger temporal intervals, units such as week, year, decade, and century are defined.
Note 2 to entry: The amount of a temporal unit is called a measure (3.7.5.10).
[SOURCE: Adapted from [16].]
point of speech
temporal unit (3.7.1.6) at which a given utterance (3.7.2.2) occurs
Note 1 to entry: The notion of point of speech is needed in order to interpret tense (3.7.1.2). This requires the use of anchor points in time, of which the point of speech is one (the point of text (3.7.1.7.5) is another). For example, in “Arthur smiled”, the point of speech is the time at which the utterance is made.
Note 2 to entry: For a document as a whole, this may be considered to be the same as the document creation time.
instant
point in time with no interior points
Note 1 to entry: Time is often viewed as a straight line from minus infinity to plus infinity. In this view, time is formed by an infinite sequence of points. An instant can also be seen as an infinitesimally small interval. Cf. [14] for "instant."
beginning
instant (3.7.1.7) at which a temporal interval (3.7.1.3) begins
[SOURCE: Adapted from [17].]
end
instant (3.7.1.7) at which a temporal interval (3.7.1.3) ends
[SOURCE: Adapted from [17].]
point of event
instant (3.7.1.7) at which the event (3.7.1.1) mentioned in a given utterance (3.7.2.2) occurs
Note 1 to entry: In addition to a point of speech (3.7.1.6.1), a point of event also needs to be defined in order to interpret tense (3.7.1.2). For example, in “Arthur smiled”, the temporal location of the point of event can be defined as being prior to the point of speech.
point of reference
instant (3.7.1.7) of temporal perspective on the event (3.7.1.1) in a given utterance (3.7.2.2)
Note 1 to entry: For example, in “Arthur will have gone by tomorrow”, the point of speech (3.7.1.6.1) is now and the point of event (3.7.1.7.3) is some time in the future, but before the point of reference referred to by “tomorrow”.
Note 2 to entry: To locate certain labels (3.6.9) in time, a third anchor point is also required, defined as the point of reference (3.7.1.7.4).
point of text
instant (3.7.1.7) at which reported speech is anchored
Note 1 to entry: It is the point of time considered in the text (3.16.9) of the speech. For example, when a person is telling a story, it is not enough to know the point of the speech itself (the document creation time); the point at which the speech in the story takes place is also needed.
markable
entity in general, or segment of a text (3.16.9) in particular, that is subject to an annotation (3.2.7)
ALINK
linking tag (3.7.5.17) that represents a phase relation between an aspectual verb (or morpheme (3.1.8)) and a predicate (3.7.3.2) denoting an event (3.7.1.1)
MLINK
linking tag (3.7.5.17) that represents the measurement of the duration of an event (3.7.1.1) or the measurement of the length of a (possibly discontinuous) time span
SLINK
linking tag (3.7.5.17) that represents a subordinating relation between two events (3.7.1.1)
TLINK
linking tag (3.7.5.17) that represents a temporal relation between two temporal entities: namely, between two events (3.7.1.1), two temporal expressions, or between a temporal expression and an event
Note 1 to entry: Some ordering relations cannot be expressed by an ordering relation between two events because a signal, like a temporal preposition, complicates the ordering or there is an ordering relation between a temporal signal and an event.
[SOURCE: Adapted from [18].]
3.7.2 Dialogue Acts
dialogue
exchange of utterances (3.7.2.2) between two or more persons or artificial agents
utterance
anything said, written, keyed, signed, or otherwise expressed, possibly in multimodal form
Note 1 to entry: An utterance is part of a turn unit (3.7.2.6). In the literature, the term is commonly used in the sense of ‘everything contributed by a sender within a turn unit’.
Note 2 to entry: The term ‘utterance’ is useful in the description of dialogue (3.7.2.1) behaviour, but is not of central importance in ISO 24617-2, since dialogue acts (3.7.2.7) are not assumed to correspond to utterances, but rather to the communicative behaviour in functional segments (3.7.2.7.3).
participant
person or artificial agent involved in a dialogue (3.7.2.1)
Note 1 to entry: Both entity (3.7.3.4) and event (3.7.1.1) can function as participants (3.7.2.3).
sender
participant (3.7.2.3) who performs a dialogue act (3.7.2.7)
speaker
sender (3.7.2.3.1) of a dialogue act (3.7.2.7) in spoken form
Note 1 to entry: A participant (3.7.2.3) can contribute to a dialogue without having the speaker role (3.7.2.4), for example by nodding in agreement to what the other participant says. Therefore, the term ‘speaker’ is not synonymous with ‘participant who occupies the speaker role’.
Note 2 to entry: A speaker possibly combines speech with nonverbal communicative behaviour.
addressee
participant (3.7.2.3) oriented to by the sender (3.7.2.3.1) in a manner to suggest that his/her utterances (3.7.2.2) are particularly intended for this participant, and that some response is therefore anticipated from this participant, more so than from the other participants
Note 1 to entry: This definition is a de facto standard in the linguistics literature.
[SOURCE: [20], modified — ‘speaker' replaced by ‘sender', and use of ambiguous pronouns avoided.]
speaker role
role occupied by a participant (3.7.2.3) who has temporary control of a dialogue (3.7.2.1) and speaks for some period of time
[SOURCE: DAMSL annotation scheme (see [21]).]
speech act
act that a speaker (3.7.2.3.1.1) performs when producing an utterance (3.7.2.2)
Note 1 to entry: The notion ‘utterance’ in this definition is commonly interpreted as mentioned in Note 1 to entry of 3.7.2.2.
[SOURCE: [22], modified — Note 1 to entry added.]
turn unit
stretch of communicative activity produced by one participant (3.7.2.3) who occupies the speaker role (3.7.2.4), bounded by periods of inactivity of that sender (3.7.2.3.1) or by periods where another participant occupies the speaker role
Note 1 to entry: The term ‘turn unit’ is also closely related to the term ‘turn construction unit’ (TCU), introduced by [23]. The TCU seems a rather intuitive and holistic notion, of which the usefulness has been the subject of debate (see e.g. [24]).
dialogue act
communicative activity of a participant (3.7.2.3), interpreted as having a certain communicative function (3.7.2.7.11) and semantic content (3.7.2.7.8)
Note 1 to entry: A dialogue act can additionally have certain functional dependence relations (3.7.2.7.4), rhetorical relations (3.7.2.7.7) and feedback dependence relations (3.7.2.7.6) with other units in a dialogue.
feedback act
dialogue act (3.7.2.7) that provides or elicits information about the sender's (3.7.2.3.1) or the addressee's (3.7.2.3.2) processing of something that was uttered in the dialogue (3.7.2.1)
Note 1 to entry: Two classes of feedback are distinguished: allo-feedback acts (3.7.2.7.1.1) and auto-feedback acts (3.7.2.7.1.2).
allo-feedback act
feedback act (3.7.2.7.1) where the sender (3.7.2.3.1) elicits information about the addressee's (3.7.2.3.2) processing of an utterance (3.7.2.2) that the sender contributed to the dialogue (3.7.2.1), or where the sender provides information about his/her perceived processing by the addressee of an utterance that the sender contributed to the dialogue
EXAMPLE
1. A: Now move up.
2. B: Slightly northeast you mean?
3. A: Slightly yeah
With utterance 3, A performs an allo-feedback act signalling that he/she thinks B understood utterance 1 correctly.
auto-feedback act
feedback act (3.7.2.7.1) where the sender (3.7.2.3.1) provides information about his/her own processing of an utterance (3.7.2.2) contributed to the dialogue (3.7.2.1) by another participant (3.7.2.3)
EXAMPLE B's utterance in the example dialogue fragment in 3.7.2.7.1.1 signals that he/she is uncertain whether he/she understood the previous utterance correctly.
dimension
class of dialogue acts (3.7.2.7) that are concerned with a particular aspect of communication, corresponding to a particular category of semantic content (3.7.2.7.8)
EXAMPLE 1 Dialogue acts advancing the task or activity that motivates the dialogue (3.7.2.1) (the ‘Task’ dimension).
EXAMPLE 2 Dialogue acts providing and eliciting feedback (the auto- and allo-feedback dimensions).
EXAMPLE 3 Dialogue acts for allocating the speaker role (3.7.2.4) (the turn management dimension).
functional segment
minimal stretch of communicative behaviour that has one or more communicative functions (3.7.2.7.11)
Note 1 to entry: The condition of being ‘minimal’ ensures that functional segments do not include material that does not contribute to the expression of a communicative function that identifies the segment.
EXAMPLE The functional segment corresponding to the answer given by S in the following dialogue (3.7.2.1) fragment does not include the parts “Just a moment please” and “.... let me see...” but only the parts “the first train to the airport on Sunday morning is” and “at 5:45”.
- U: What time is the first train to the airport on Sunday morning please?
- S: Just a moment please... the first train to the airport on Sunday morning is .... let me see... at 5:45.
Note 2 to entry: A consequence of this definition is that functional segments can be discontinuous, can overlap or be embedded, and can contain parts from more than one turn.
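The discontinuity described in Note 2 can be made concrete by recording a functional segment as a list of character spans over the turn. In the sketch below, the turn text and the two parts are taken from the example in this entry, while the `locate` helper and the span representation are hypothetical.

```python
turn = ("Just a moment please... the first train to the airport "
        "on Sunday morning is .... let me see... at 5:45.")

# The parts of S's turn that express the answer.
parts = [
    "the first train to the airport on Sunday morning is",
    "at 5:45",
]

def locate(text, parts):
    """Find each part in order and return its (start, end) span."""
    spans = []
    cursor = 0
    for part in parts:
        start = text.index(part, cursor)
        end = start + len(part)
        spans.append((start, end))
        cursor = end
    return spans

segment = locate(turn, parts)
print(segment)
print([turn[start:end] for start, end in segment])
```

The gap between the two spans holds the material (“.... let me see...”) that does not contribute to the answer, so the functional segment is discontinuous.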
functional dependence relation
relation between a dialogue act (3.7.2.7) with a responsive communicative function (3.7.2.7.11.1) and one or more previous dialogue acts that it responds to
EXAMPLE The relation between an answer and the corresponding question, such as between utterance 3 and utterance 2 in the example in 3.7.2.7.1.1; or the relation between the acceptance of an offer and the corresponding offer.
reference segment
stretch of communicative behaviour that a feedback dependence relation (3.7.2.7.6) refers to and that is not a functional segment (3.7.2.7.3)
feedback dependence relation
relation between a feedback act (3.7.2.7.1) and the stretch of communicative behaviour the processing of which the act provides or elicits information about
EXAMPLE In the example in 3.7.2.7.1.1, both the allo-feedback act (3.7.2.7.1.1) expressed by utterance 3 and the auto-feedback act (3.7.2.7.1.2) expressed by utterance 2 have a feedback dependence relation to utterance 1.
Note 1 to entry: Feedback dependence relations are also used to relate self-corrections, partner corrections, and other speech editing acts, which strictly speaking are not feedback acts, to the segments that they apply to.
rhetorical relation
DEPRECATED: discourse relation
semantic or pragmatic relation between two dialogue acts (3.7.2.7) or their semantic content (3.7.2.7.8)
EXAMPLE 1 In the following example, the statement in the second utterance provides a motivation for the question in the first utterance:
A: Can you tell me what flights there are to Sydney on Saturday? I’d like to attend my mother's 80th birthday.
EXAMPLE 2 A rhetorical relation between the semantic contents of two dialogue acts occurs in the following, where the content of B's statement mentions a cause for the content of A's statement:
A: I can never find these stupid remote controls.
B: That's because they don’t have a fixed location.
Note 1 to entry: Relations such as elaboration, explanation, justification, cause, and concession have been studied extensively in the analysis of (monologue) text (3.16.9), where they are often called ‘rhetorical relations’ or ‘discourse relations’, and are mostly viewed either as relations between text segments or as relations between events (3.7.1.1) or propositions described in text segments. Many of these relations also occur in dialogue (3.7.2.1).
semantic content
information, situation (3.7.6.1), action, event (3.7.1.1), or objects (3.7.4.5) that a stretch of communicative behaviour refers to
semantic content category
semantic content type
type of the semantic content (3.7.2.7.8) of a dialogue act (3.7.2.7)
EXAMPLE The various dimensions (3.7.2.7.2) defined in this document correspond to categories of semantic content (3.7.2.7.8). In particular, the task dimension corresponds to the category of task-specific actions and information; the allo- and auto-feedback dimensions correspond to the categories of information about the processing by the current speaker (3.7.2.3.1.1) or by the addressee (3.7.2.3.2), respectively, of something that was said before; the turn management dimension corresponds to the category of information about the allocation of the speaker role (3.7.2.4), and so forth.
information state
context
totality of a participant's (3.7.2.3) attitudes that may influence the participant's interpretation and generation of communicative behaviour
Note 1 to entry: Attitudes include, among others, beliefs, assumptions, expectations, goals, preferences and hopes.
communicative function
property of certain stretches of communicative behaviour, describing how the behaviour changes the information state (3.7.2.7.10) of an understander of the behaviour
responsive communicative function
communicative function (3.7.2.7.11) of a dialogue act (3.7.2.7) that depends for its semantic content (3.7.2.7.8) on one or more dialogue acts that it responds to
qualifier
predicate (3.7.3.2) that can be associated with a communicative function (3.7.2.7.11)
EXAMPLE A: Would you like to have some coffee?
B: Only if you have it ready.
B's utterance accepts A's offer under a certain condition; this can be described by qualifying the communicative function Accept Offer with the predicate ‘conditional’.
3.7.3 Semantic Roles (SemAF-SR)
argument
formal semantic unit that is an essential element of a predicate argument structure (3.7.3.3) and can have variable instantiations depending on the utterance (3.7.2.2)
Note 1 to entry: An argument corresponds to a participant (3.7.2.3) of an event (3.7.1.1) described by the predicate argument structure.
Note 2 to entry: Arguments typically satisfy certain argument positions and can be described as being syntactico-semantic notions, whereas participants are semantico-conceptual. The standard view is that subsets of the participants associated with an event (3.7.1.1) are selected as arguments by the verb (or nominal or adjective) expressing the event. Other participants are either incorporated or realized as eventuality modifiers (3.7.3.6).
Note 3 to entry: Natural language predicates (3.7.3.2) typically have one, two, or three arguments, although they can have more.
predicate
formal semantic unit that represents a semantic relation between one or more arguments (3.7.3.1) in a predicate argument structure (3.7.3.3)
Note 1 to entry: Predicates are indicated by predicative linguistic elements such as verbs, nouns, and adjectives.
predicate argument structure
formal representation of the core semantic content (3.7.2.7.8) of an utterance (3.7.2.2), consisting of a predicate (3.7.3.2) constant, and its arguments (3.7.3.1)
Note 1 to entry: In classical logic-based semantics, this corresponds to predicate argument structures in first-order predicate logic.
Note 2 to entry: One of the arguments can be a variable uniquely identifying the instance of the predicate argument structure to allow references to it in other predicate argument structures.
Note 3 to entry: The representation of event semantics is subject to many variations; some of them, such as in [25], can have separate predicates for each semantic role (3.7.3.7) relation. In this case, the predicate argument structure of an utterance is the sum of the individual predicate semantic role assertions representing the semantic content of the utterance.
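Notes 2 and 3 can be illustrated with a minimal Python sketch. It assumes a representation in which one argument is a variable identifying the instance (here `e1`) and each semantic role is asserted by its own predicate. The class names `RoleAssertion` and `PredicateArgumentStructure`, the role labels, and the output notation are hypothetical and are not defined by this document.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RoleAssertion:
    """One predicate semantic role assertion: role(event_var, filler)."""
    role: str
    event_var: str
    filler: str


@dataclass
class PredicateArgumentStructure:
    """A predicate constant, an identifying variable, and role assertions."""
    predicate: str
    event_var: str
    roles: list = field(default_factory=list)

    def add_role(self, role, filler):
        # Each role relation is stored as a separate assertion (Note 3).
        self.roles.append(RoleAssertion(role, self.event_var, filler))
        return self

    def to_formula(self):
        # The structure is the conjunction of the individual assertions.
        parts = [f"{self.predicate}({self.event_var})"]
        parts += [f"{r.role}({r.event_var}, {r.filler})" for r in self.roles]
        return " & ".join(parts)


# Illustrative instance: a breaking event with an agent and an instrument.
pas = (PredicateArgumentStructure("break", "e1")
       .add_role("agent", "john")
       .add_role("instrument", "hammer"))
print(pas.to_formula())
```

Because the variable `e1` identifies the instance, another structure could refer to this one simply by using `e1` as one of its own arguments.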
entity
conceptual semantic unit that typically functions as a participant (3.7.2.3)
Note 1 to entry: An entity is an individual such as a person, organization, physical object, or logical entity, as well as, on occasion, a number, quantity (3.7.9.1), dimension, or a reification of an event, a property, or a quality, e.g. emotion (anger, love), the value of a colour, etc.
Note 2 to entry: An entity is represented by a node (3.5.5.1) in a content structure.
eventuality frame
generalized abstract specification of the word sense (3.1.12) associated with an event (3.7.1.1) in an utterance (3.7.2.2)
Note 1 to entry: The frame consists of the specification of (a) a predicate (3.7.3.2) that can participate in a class hierarchy if such a hierarchy is specified, and (b) the arguments (3.7.3.1) that this predicate expects along with their semantic roles (3.7.3.7).
eventuality modifier
particular type of participant (3.7.2.3) that completes the description of an event (3.7.1.1) but is optional and not essential
Note 1 to entry: Eventuality modifiers are distinct from other types of participants in that they are used in supplying information that is typically more peripheral and more general, for example, situating the eventuality in time or space (3.7.5.3.1).
Note 2 to entry: In FrameNet, these would be peripheral frame elements and in PropBank, ArgMs.
Note 3 to entry: Eventuality modifiers typically correspond to syntactic adjuncts.
semantic role
mode of involvement of a participant (3.7.2.3) in an event (3.7.1.1)
Note 1 to entry: Semantic roles for specific events are often associated with prototypical semantic relations, e.g. if John causes a breaking event, he is the agent; if he uses a hammer, it is the instrument; and someone who receives something is a recipient.
3.7.4 Discourse Structure (SemAF-DS)
discourse
process of communication, consisting of one or more sentences (3.1.27) or sentence fragments
Note 1 to entry: From an abstract viewpoint, data (e.g. words (3.1.9.1), phrases (3.1.25), sentences, and paragraphs) representing a communication process is regarded as a discourse. A discourse can be encoded in various media such as text (3.16.9), hypertext, audio, video, and their possible combinations.
discourse structure
structure of discourse (3.7.4.1), comprising segment structure, content structure, and possibly other types of structure
segment
〈semantic annotation framework〉 partial realization of discourse (3.7.4.1)
EXAMPLE Word (3.1.9.1), phrase (3.1.25), subordinate clause (3.1.26.2), sentence (3.1.27), paragraph, section, chapter.
Note 1 to entry: A synonym (3.16.23) is a ‘discourse segment’. A segment references a semantic and/or pragmatic entity, which can be a semantic/pragmatic relation. Intrasentential segments are syntactic constituents such as words, phrases, and clauses. Segments might or might not be continuous: this is discussed in the definition of connectives.
circumstance
entity (3.7.3.4) which is an event (3.7.1.1) (including dialogue act (3.7.2.7)), state, process (3.18.2.11), relation, proposition, or set of these
object
semantic entity (3.7.3.4) other than circumstance (3.7.4.4)
Note 1 to entry: Objects include people, buildings, machines, ideas, and rules.
class
unary predicate, which is a set of entities (3.7.3.4)
relational class
class (3.7.4.6) whose instances are circumstances (3.7.4.4) equivalent to relations
3.7.5 Spatial Information
region
〈semantic annotation framework〉 connected, non-empty point-set defined by a domain and its boundary points
Note 1 to entry: The term "region" as defined does not refer to a political or administrative region such as "the Canary Islands" or "Hong Kong, SAR", where SAR is the acronym of “Special Administrative Region”.
place
geographic or administrative entity that is situated at a location (3.7.5.3)
location
point or finite area that is positioned within a space (3.7.5.3.1) or a series of such points or areas
Note 1 to entry: Places (3.7.5.2), paths (3.7.5.3.2), and event-paths (3.7.5.3.2.1) are subtypes of locations.
space
dimensional extent (3.7.5.16) in which objects (3.7.4.5) and events (3.7.1.1) have a relative position and direction
path
static path
route
〈semantic annotation framework〉 location (3.7.5.3) that consists of a series of locations
Note 1 to entry: A spatial object path is a location where the focus is on the potential for traversal or which functions as a boundary. This includes common nouns like "road", "coastline", and "river" and proper names like "Route 66" and "Kangamangus Highway". Some nouns, such as "valley", can be ambiguous: "valley" can be understood as a path in "we walked down the valley" or as a place (3.7.5.2) in "we live in the valley".
Note 2 to entry: A path might be represented as an undirected graph whose nodes (3.5.5.1) are locations and whose edges (3.5.5.2) signify continuity; i.e., unlike an event-path (3.7.5.3.2.1), a path has no inherent directionality.
event-path
dynamic path
trajectory
dynamic route
directed path (3.7.5.3.2) followed by a mover (3.7.5.13) and coincident with a motion (3.7.5.12)
Note 1 to entry: Unlike (static) paths such as roads or circular tracks, event-paths are each triggered by a specific motion, characterized as being finite directed paths each with a start and an end.
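The contrast between a (static) path and an event-path can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a normative data model: the class names `Path` and `EventPath`, the method names, and the location labels are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    """Static path: an undirected graph whose edges signify continuity."""
    edges: frozenset  # each edge is a frozenset of two location names

    @classmethod
    def from_pairs(cls, pairs):
        return cls(frozenset(frozenset(p) for p in pairs))

    def connects(self, a, b):
        # Undirected: the order of a and b does not matter.
        return frozenset((a, b)) in self.edges


@dataclass(frozen=True)
class EventPath:
    """Event-path: a finite directed path followed by a mover."""
    mover: str
    locations: tuple  # ordered from start to end

    @property
    def start(self):
        return self.locations[0]

    @property
    def end(self):
        return self.locations[-1]

    def follows(self, path):
        # True if every consecutive step runs along an edge of the path.
        steps = zip(self.locations, self.locations[1:])
        return all(path.connects(a, b) for a, b in steps)


road = Path.from_pairs([("A", "B"), ("B", "C")])
trip = EventPath(mover="John", locations=("C", "B", "A"))
print(road.connects("A", "B"), road.connects("B", "A"))
print(trip.start, trip.end, trip.follows(road))
```

The same road supports a trip from A to C and a trip from C to A; only the event-path, triggered by a specific motion, carries a start, an end, and a direction.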
document creation location
dcl
unique place (3.7.5.2) or set of places associated with a document that represents the location (3.7.5.3) in which the document was created
Note 1 to entry: Some collaboratively written documents, such as GoogleDoc1) documents and chat logs, might refer not only to a single location but also to a set of locations spread out across the world. Besides, for example, the creation place of the Hebrew bible or the creation place of each of the books in it is uncertain. The attribute @dcl will, therefore, have the value "false", understood to mean "unspecified", while the value "true" is understood to mean "specified".
qualitative spatial relation
topological link
abstract static relation between regions (3.7.5.1) or spaces (3.7.5.3.1), expressing their connectedness or continuity
non-locational spatial entity
DEPRECATED: spatial entity, non-locational
object (3.7.4.5) that is situated at a unique location (3.7.5.3) for some period of time, and typically has the potential to undergo translocation
Note 1 to entry: A non-locational spatial entity, tagged <entity>, as defined, is distinct from genuine spatial entities that consist of three types of locational entities, places (3.7.5.2), paths (3.7.5.3.2), and event-paths (3.7.5.3.2.1). It is an object that participates in a spatial or motional relation. In "John is sitting in a car", both John and car could be understood as spatial entities or as being the figure (3.7.5.7) and the ground (3.7.5.8), respectively, of the sitting-in situation.
figure
entity (3.7.3.4) that is considered the focal object (3.7.4.5), which is related to some reference object
ground
landmark
entity (3.7.3.4) that acts as reference for a figure (3.7.5.7)
Note 1 to entry: “landmark” is often used by cognitive semanticists.
orientational relation
orientation relation
directional relation
link that relates one location (3.7.5.3) as a figure (3.7.5.7) to another location as a ground (3.7.5.8) that expresses the spatial disposition or direction of a spatial object within a frame of reference
measure
magnitude of a spatial dimension or relation
measure relation
link that relates a measure (3.7.5.10) to an object (3.7.4.5) that is being measured
Note 1 to entry: The bounds of a measured object are sometimes specified for a measure relation. They can be points or areas like a city, or lines like a river or mountain range.
motion
motion-event
action or process involving the translocation of a spatial object, transformation of some spatial property of an object (3.7.4.5), or change in the conformation of an object
Note 1 to entry: A motion is a particular kind of event (3.7.1.1).
mover
moving object
entity (3.7.3.4) that undergoes a change of its location (3.7.5.3)
Note 1 to entry: A mover can be either the agent of a motion (3.7.5.12), such as someone who walks to the station, or an entity that is simply caused to move, such as a stone thrown into a well; in the latter case, the thrower is not considered to be the mover in the sense of the term defined.
movement relation
link that relates a mover (3.7.5.13) to an event-path (3.7.5.3.2.1) which the mover traverses
Note 1 to entry: A movement relation is triggered by a motion (3.7.5.12).
spatial relation
segment or series of segments of a text (3.16.9) that refers to qualitative spatial relations (3.7.5.5) or orientational relations (3.7.5.9), or to movement relations (3.7.5.14) indirectly through the specification of the bounds of paths (3.7.5.3.2) or event-paths (3.7.5.3.2.1)
extent
textual segment that is a string of character segments in text (3.16.9) that is being annotated (3.2.5)
EXAMPLE Tokens (3.4.5), words (3.1.9.1), and non-contiguous phrases (3.1.25) (e.g. a complex verb like "look ... up").
tag
element name
name associated with textual segments for annotation (3.2.7) or for a relation between these segments
Note 1 to entry: The following are three kinds of tag for annotation:
- extent tag, which is associated with textual segments referring to basic entities (3.7.3.4) or signals;
- link tag, for representing spatial relations (3.7.5.15); and
- root tag, for the closure of annotations (3.2.7).
non-consuming tag
tag (3.7.5.17) that has no associated extent (3.7.5.16)
EXAMPLE In the sentence "John ate an apple, but Mary a pear", there are at least two ways of marking up the <event> tag: one with its extent or target filled in with a non-null string of characters, or audio or visual elements, and the other with an empty string:
- John ate_e1 an apple, but Mary ∅_e2 a pear;
- <event xml:id="e1" target="ate"/>
- <event xml:id="e2" target=""/> (non-consuming <event> tag)
Note 1 to entry: The extent of a non-consuming tag is a null string.
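As an illustration only, the following Python sketch shows how non-consuming tags of the kind shown in the example could be detected in a serialized annotation. It assumes that a tag counts as non-consuming when its `target` attribute is missing, empty, or whitespace-only. The `<annotation>` wrapper element and the function name `non_consuming_ids` are hypothetical and are not part of this document.

```python
import xml.etree.ElementTree as ET

# ElementTree expands the predefined xml: prefix to this namespace URI.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical wrapper element around the two <event> tags of the example.
SNIPPET = """
<annotation>
  <event xml:id="e1" target="ate"/>
  <event xml:id="e2" target=""/>
</annotation>
"""


def non_consuming_ids(xml_text):
    """Return the xml:id of every <event> whose extent is a null string."""
    root = ET.fromstring(xml_text)
    return [
        event.get(XML_ID)
        for event in root.iter("event")
        if not event.get("target", "").strip()
    ]


print(non_consuming_ids(SNIPPET))
```

Here `e1` consumes the string "ate", whereas `e2` marks the elided event in the second clause and is reported as non-consuming.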
3.7.6 Semantic Relations in Discourse, Core Annotation Schema (DR-core)
situation
event (3.7.1.1), fact, proposition, condition, belief or dialogue act (3.7.2.7), that can be realized by a linguistically simple or complex expression
Note 1 to entry: An expression can be, among others, a clause (3.1.26), a nominalization, a sentence (3.1.27)/ utterance (3.7.2.2), or a discourse segment consisting of multiple sentences or utterances.
discourse connective
word (3.1.9.1) or multiword expression (3.1.9.2) expressing a discourse relation (3.7.6.3)
EXAMPLE Single-word discourse connectives include “but”, “since”, “and”, “however”, “because”. Multi-word discourse connectives include “as well as”, “such as”.
Note 1 to entry: Many of the words that can be used as discourse connectives can also be used as intra-clausal conjunctions, as with the use of “and” in “John and Mary are a lovely couple”.
discourse relation
relation between two situations (3.7.6.1) mentioned in a discourse (3.7.4.1)
EXAMPLE 1 “Peter came late to the meeting. He had been in a traffic jam.” The events mentioned in the two sentences are implicitly related through the discourse relation Cause.
EXAMPLE 2 “Peter was in a traffic jam, but he arrived on time for the meeting.” The events mentioned in the two clauses are related by the discourse relation Concession, expressed by the connective “but”.
EXAMPLE 3 “Peter did not manage to come to the meeting; he was held up in a terrible traffic jam.” The causal relation in this example is the same as in Example 1, but the argument expressed by the first clause is not an eventuality, but a proposition, formed by an event description with negative polarity.
Note 1 to entry: Quasi-synonyms for “discourse relation”, with small variations in meaning, are “coherence relation” and “rhetorical relation (3.7.2.7.7)”.
low-level discourse structure
representation of discourse structure (3.7.4.2) that only specifies local dependencies between a discourse relation (3.7.6.3) and its arguments, without further specifying any links or dependencies across these local structures
3.7.7 Reference Annotation Framework (RAF)
communicative segment
elementary portion of a multimodal interaction
referring expression
communicative segment (3.7.7.1) that specifically designates an entity (3.7.3.4) or an event (3.7.1.1), whether concrete or abstract, discourse (3.7.4.1) new or old, real or fictional
referent
discourse entity
extra-linguistic entity (3.7.3.4) which is denoted, or pointed out, by a communicative segment (3.7.7.1)
Note 1 to entry: “discourse entity” is used preferably in the context of the description of the concrete syntax whereas “referent” is used in the abstract syntax, but also when the underlying process is implied by the expression.
reference
〈semantic annotation〉 relation between a referring expression (3.7.7.1.1) and a referent (3.7.7.2) denoted by it
Note 1 to entry: The verb “to refer to” expresses such a relation: if there is a reference relation between an expression x and a discourse entity e, then x is said to refer to e.
anaphor
linguistic mechanism by which the interpretation of a referring expression (3.7.7.1.1) depends on another expression mentioned in the same text (3.16.9) or discourse (3.7.4.1)
Note 1 to entry: The notion of anaphora is more general than that of coreference (3.7.7.5): the interpretation of anaphora is context-dependent, whereas coreference is determined rather rigidly independently of its possible use of context (see [26]).
coreference
identity of referents (3.7.7.2) of two referring expressions (3.7.7.1.1)
objectal relation
relation between two referents (3.7.7.2) reflecting their intended association from a referential point of view
Note 1 to entry: The referential association can identify that they are identical, disjoint, or overlapping, or that one includes the other (see [27] and [26]).
3.7.8 Visual Information
voxicon
lexicon (3.2.1.1) or list of basic visual object concepts of VoxML (visual object concept structure modelling language)
voxeme
basic entries in voxicon (3.7.8.1)
minimal embedding space
MES
three-dimensional (3D) region (3.7.5.1) within which the state is configured, or the event (3.7.1.1) unfolds
habitat
representation of an object situated within a partial minimal model
qualia
QS
qualia structure
relational forces or aspects of a lexical item (3.2.3) or concept (3.12.1.3)
telic
purpose or function qualia (3.7.8.5) of an object (3.7.4.5)
affordance
affordance structure
set of specific actions, described along with the requisite conditions, that the object (3.7.4.5) can take part in
Gibsonian affordance
GA
set of specific actions that an agent can perform with an object (3.7.4.5) that is presented to the agent
EXAMPLE Hold, grasp, move.
telic affordance
set of goal-oriented or intentionally situated actions of an agent on an object (3.7.4.5) presented to the agent
EXAMPLE An agent eating an apple when it is presented to the agent.
3.7.9 Measurable Quantitative Information (MQI)
quantity
property of a measurable object (3.7.4.5) referring to its magnitude or multitude
base quantity
quantity (3.7.9.1) in a conventionally chosen subset of a given system of quantities, where no quantity in the subset can be expressed in terms of the other quantities within that subset
Note 1 to entry: Kinds of quantities include seven base quantities defined by the International System of Quantities (ISQ).
derived quantity
quantity (3.7.9.1) in a system of quantities, defined in terms of the base quantity (3.7.9.1.1) of that system
EXAMPLE Speed is a derived quantity defined by length (distance) over time (LT⁻¹), where length (L) and time (T) are base quantities.
[SOURCE: ISO/IEC Guide 99, 1.5, modified — Example replaced.]
quantitative information
QI
measurement associated with the quantity (3.7.9.1) of a measurable object
measurable quantitative information
MQI
quantitative information (3.7.9.2) that can be expressed in unitized numeric terms
quantitative markup language
QML
measurable quantitative information markup language
markup language of measurable quantitative information
specification language for the annotation (3.2.7) of measurable quantitative information (3.7.9.2.1) extractable from text (3.16.9) or other medium types of language (3.16.1)
measurement unit
unit of measurement
unit
scalar basis, defined and adopted by convention, of measuring objects by multiplying their quantitative values expressed in real numbers
Note 1 to entry: The expressions that are used in measurement such as “metre”, “litre”, and “µmol/kg” are units by the definition given above. The multitude expressions such as “bottles”, “boxes”, or “two” as in “two bottles of milk”, “a box of apples”, and “two coffees” sometimes fail to be regarded as units, but they can also be if they are accepted as units by convention or agreement in some communities.
base unit
measurement unit (3.7.9.4) that is adopted by convention for a base quantity (3.7.9.1.1)
Note 1 to entry: There are seven base units chosen by the International System of Units (SI) associated with seven ISQ base quantities to measure quantities, as shown in Table 1.
| SI base unit (unit symbol) | Associated ISQ base quantity (base quantity symbol) |
|---|---|
| metre (m) | length (L) |
| kilogram (kg) | mass (M) |
| second (s) | time (T) |
| ampere (A) | electric current (I) |
| kelvin (K) | thermodynamic temperature (Θ) |
| mole (mol) | amount of substance (N) |
| candela (cd) | luminous intensity (J) |
derived unit
measurement unit (3.7.9.4) for a derived quantity (3.7.9.1.2)
EXAMPLE The unit “newton” (N) is a derived unit for a derived quantity “force” (F), which is defined to be “mass times acceleration” (MLT⁻²), where the quantity (3.7.9.1) “acceleration” is a derived quantity defined by “velocity divided by time” (VT⁻¹) and “velocity” defined by “length (distance) divided by time” (LT⁻¹).
Note 1 to entry: Table 2 illustrates some of the derived units.
| Derived unit (unit symbol) | Associated derived quantity |
|---|---|
| kilometre per minute (km/min) | speed = length (L)/time (T) |
| gram per cubic metre (g/m³) | density = mass (M)/volume (L³) |
| kilogram metre per square second (kg·m/s²) | force = mass (M) × length (L)/time squared (T²) |
| lumen per square metre (lm/m²) | illuminance = luminous intensity (J)/area (L²) |
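The way derived quantities are built from base quantities can be checked mechanically. The following Python sketch is an illustration under stated assumptions: a dimension is represented as a mapping from base quantity symbols to integer exponents, and the helper names `dim`, `mul`, and `div` are hypothetical.

```python
def dim(**exponents):
    """Build a dimension, e.g. dim(L=1, T=-1) for length over time."""
    return {symbol: power for symbol, power in exponents.items() if power != 0}


def mul(a, b):
    """Multiply two dimensions by adding the exponents of each symbol."""
    result = dict(a)
    for symbol, power in b.items():
        result[symbol] = result.get(symbol, 0) + power
    # Drop symbols whose exponents cancel out.
    return {symbol: power for symbol, power in result.items() if power != 0}


def div(a, b):
    """Divide two dimensions by subtracting the exponents of the divisor."""
    return mul(a, {symbol: -power for symbol, power in b.items()})


# Base quantities: length (L), mass (M), and time (T).
L, M, T = dim(L=1), dim(M=1), dim(T=1)

velocity = div(L, T)              # length divided by time
acceleration = div(velocity, T)   # velocity divided by time
force = mul(M, acceleration)      # mass times acceleration
density = div(M, mul(mul(L, L), L))  # mass divided by volume

print(velocity, acceleration, force, density)
```

Under these assumptions, force comes out with exponent 1 for M, 1 for L, and -2 for T, which matches the derivation given in the example for the newton.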
3.7.10 Quantification
event set
aspect of a quantification (3.7.10.8), specifying a set of events (3.7.1.1) in which the members of a certain participant set (3.7.10.1.1) are involved
participant set
set of entities (3.7.3.4) involved in the event set (3.7.10.1) of a quantification (3.7.10.8)
EXAMPLE The parents gave all the teachers a present.
definiteness
language-dependent morphosyntactic feature (3.1.15) of a noun phrase (NP) (3.1.25.2), marked in English and other European languages (3.16.1) by a definite or indefinite article or a nominal suffix, by a demonstrative, or by a possessive expression
Note 1 to entry: The definiteness feature has two possible values: “definite” and “indefinite”. Being definite is often regarded as an indication of determinacy (3.7.10.4), indefinite as an indication of indeterminacy.
Note 2 to entry: In some languages it is only possible to express that an NP is definite (NPs are by default indefinite) or to express that an NP is indefinite (NPs are by default definite).
EXAMPLE al (definite article in Arabic languages), -e (suffix as definite article in Farsi), el/la (definite article in Spanish), a/az (definite article in Hungarian, there is no indefinite article), yī (occasionally indefinite article in Chinese; there is no definite article and the definiteness is definite unless an indefinite article or the context indicates otherwise).
Note 3 to entry: For overviews of definite expressions, see [29] and [30].
definite description
singular noun phrase (3.1.25.2) with definiteness (3.7.10.2) ‘definite’, interpreted as referring to a (contextually) uniquely determined entity (3.7.3.4)
EXAMPLE Jimmy, the chairperson, my house, this idea.
determinacy
semantic property of referring to some particular and determinate entity or collection of entities (3.7.3.4)
Note 1 to entry: Determinacy can be interpreted as specifying the relation between the reference domain (3.7.10.10) and the source domain (3.7.10.11) of a quantification (3.7.10.8). The reference domain of a determinate quantification is a proper subset of the source domain; for an indeterminate quantification the reference domain coincides with the source domain.
Note 2 to entry: Determinacy and definiteness (3.7.10.2) are not always clearly distinguished in the linguistic literature. For a discussion of this issue, see [31].
distributivity
distribution
specification of whether the entities (3.7.3.4) of the reference domain (3.7.10.10) of a quantification (3.7.10.8) are individually involved, or as a group (collectively), or as a mixture of the two
Note 1 to entry: Distributivity can be expressed by adverbs, such as “together”, “ensemble” (French) and “samen” (Dutch), or by certain determiners, such as “each” in English, “chaque” in French and “jeder” in German. Some determiners, such as the English “each”, “all” and “both” can also be used as adverbs.
exhaustivity
semantic property of a quantification (3.7.10.8), indicating that no other individuals than the elements of the participant set (3.7.10.1.1) are involved in elements of the event set (3.7.10.1)
genericity
specification of whether the sentence in which a quantification (3.7.10.8) occurs refers to a certain specific event set (3.7.10.1) and participant set (3.7.10.1.1) or expresses a general statement or question
quantification
application of a predicate to a set of entities (3.7.3.4)
Note 1 to entry: A particularly important type of predicate in the context of this document is that of being involved in certain events in a certain semantic role.
individuation
semantic property of the way a nominal expression is used to refer to its denotation as a collection (3.3.2) of individual entities (3.7.3.4), as parts of a homogeneous mass, or as a collection of individual entities and their parts
Note 1 to entry: The distinction between referring to a collection of entities and referring to a part-whole structured domain is expressed in many languages by the distinction between count terms and mass terms (3.7.10.12).
reference domain
contextually determined set of entities (3.7.3.4) to which a quantifying predicate (3.7.3.2) is applied
source domain
explicitly mentioned maximal set of entities (3.7.3.4) to which a quantifying predicate (3.7.3.2) is applicable
Note 1 to entry: For a quantifier expressed by a noun phrase (3.1.25.2), the source domain is the extension of the restrictor (3.1.25.2.2). Adverbial temporal and spatial quantifiers have their source domains (temporal and spatial entities), specified as part of their lexical semantics.
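The relation between reference domain and source domain that Note 1 to 3.7.10.4 (determinacy) describes can be sketched with plain sets. This is a minimal Python illustration: the function name `determinacy` and the entity labels are hypothetical, and raising an error when the reference domain is not contained in the source domain is an assumption of the sketch.

```python
def determinacy(reference_domain, source_domain):
    """Classify a quantification by comparing its two domains."""
    reference, source = set(reference_domain), set(source_domain)
    if not reference <= source:
        # Assumption of this sketch: the reference domain never
        # contains entities outside the source domain.
        raise ValueError("reference domain must lie within the source domain")
    # Proper subset: determinate; identical sets: indeterminate.
    return "determinate" if reference < source else "indeterminate"


# Hypothetical entities: all teachers versus the contextually relevant ones.
all_teachers = {"t1", "t2", "t3", "t4"}
teachers_in_context = {"t1", "t2"}

print(determinacy(teachers_in_context, all_teachers))
print(determinacy(all_teachers, all_teachers))
```

In the first call, the context restricts the quantification to some of the teachers, so it is classified as determinate; in the second, the two domains coincide.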
mass term
noun or nominal compound used in such a way that it does not individuate its reference (3.7.7.3)
Note 1 to entry: Typical examples in English are “footwear”, “water”, “cattle”, “music”, “luggage” and “furniture”. By contrast, expressions such as “shoe”, “drop of water”, “cow”, “sonata”, “suitcase” and “chair” are typically used as count terms, i.e. in such a way that it is understood what counts as (for example) one shoe, as two shoes, etc. Some words are commonly used either way, such as “rope” and “stone”. The two possible uses of nouns are also illustrated by: “There’s no chicken in the pen”/“There’s no chicken in the stew.” See also [16].
inverse linking
modification of a noun phrase head (3.1.25.2.1) that contains a quantifier with wider scope than the quantification (3.7.10.8) of the noun phrase head
EXAMPLE Two students from every university participated in the meeting.
3.7.11 Spatial Semantics
annotation structure
information structure created by marking up some linguistic expressions with relevant (semantic) information
Note 1 to entry: ISO 24617-7, for instance, creates such annotation structures by marking up place names or motions and their spatial relations with relevant spatial information.
semantic form
logical form
representation of the semantic content (3.7.2.7.8) of an annotation structure (3.7.11.1) of expressions in natural language
Note 1 to entry: The semantic form of an annotation structure a is represented by σ(a), where σ is a function that maps an annotation structure a to a semantic form that carries the semantic content of a.
Note 2 to entry: Semantic forms are often called “logical forms” because semantic forms are represented by a logical language such as first-order logic (3.7.11.4).
interpretation
〈spatial semantics〉 function that maps a semantic form (3.7.11.2) to its denotation
Note 1 to entry: The interpretation function is represented by ⟦ ⟧ and, for each semantic form σ(a), its denotation, or the value of the interpretation, is represented by ⟦σ(a)⟧.
Note 2 to entry: In a model-theoretic semantics, the interpretation function ⟦ ⟧ is constrained by a model M (3.7.11.5) and, for each semantic form σ(a) and model M, such an interpretation is represented by ⟦σ(a)⟧ᴹ.
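The two mappings involved, from annotation structure to semantic form and from semantic form to denotation in a model, can be sketched in Python. This is a simplified illustration under stated assumptions: the function names `sigma` and `interpret`, the dictionary layout of the annotation structure, and the representation of a model as a mapping from predicate names to sets of tuples are all hypothetical.

```python
def sigma(annotation):
    """Map an annotation structure to a semantic form.

    The semantic form is modelled here as a tuple consisting of a
    predicate name followed by its arguments.
    """
    return (annotation["relation"], annotation["figure"], annotation["ground"])


def interpret(form, model):
    """Return the denotation of a semantic form relative to a model.

    The denotation is a truth value: True if the argument tuple belongs
    to the extension that the model assigns to the predicate.
    """
    predicate, *arguments = form
    return tuple(arguments) in model.get(predicate, set())


# Hypothetical model in which John is in a car.
M = {"in": {("john", "car")}}

a = {"relation": "in", "figure": "john", "ground": "car"}
b = {"relation": "in", "figure": "mary", "ground": "car"}

print(interpret(sigma(a), M))
print(interpret(sigma(b), M))
```

The same semantic form can receive a different denotation in a different model, which is why the interpretation is indexed by M.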
first-order logic
formal language (3.16.1), artificially built for reasoning, with the values of its terms, particularly variables, ranging over individual objects (3.7.4.5) only
Note 1 to entry: Second-order variables such as P, which ranges over properties of an individual, are temporarily introduced to allow the λ-operation in the process of deriving semantic forms (3.7.11.2).
model M
set-theoretical construct that represents part of the real or possible world denoted by semantic form (3.7.11.2)
eigenplace
eigenspace
region (3.7.5.1) or path (3.7.5.3.2) occupied by an object (3.7.4.5)
Note 1 to entry: A region may be considered as a particular finite path matching to an interval [x,x] such that its start and endpoint match or are identical. In that case, a region is considered as a point.
3.7.12 Measurable Quantitative Information Extraction (MQIE)
information extraction
IE
process of identifying specific structured information from natural language (3.16.1.1), semi-structured text (3.16.9) and/or other electronic text sources
measurable quantitative information extraction
MQIE
process of identifying measurable quantitative information (3.7.9.2.1) from natural language, semi-structured text (3.16.9) and/or other electronic text sources
normalization
process that represents objective information with a formal and/or regular format or converts the information into a consistent value range
Note 1 to entry: The normalization objectives may contain information of entities (3.7.3.4), measure units and quantities (3.7.9.1).
3.7.13 Metamodel
metamodel
schematic representation of the concepts (3.12.1.3) that are used in the analysis and description of the phenomena covered in annotation (3.2.7) and of the relationships between them
3.8 Comprehensive Annotation Framework (ComAF)
segment
〈comprehensive annotation framework〉 referenceable part of a Diagrammatic Semantic Authoring (DSA)-based document, which is either a graph segment or a data segment (text (3.16.9), image, audio, video, etc.)
hypernode
node (3.5.5.1) which is a graph segment
semantic authoring
composition of documents while making their logical structures explicit
3.9 Lexical Markup Framework (LMF)
natural language processing
NLP
computer science field covering knowledge and techniques involved in the processing and analysis of linguistic data by a computer
data category
DC
class of data items that are closely related from a formal or semantic point of view
EXAMPLE /part of speech/, /subject field/, /definition/.
Note 1 to entry: A data category can be viewed as a generalization of the notion of a field in a database.
Note 2 to entry: In running text (3.16.9), such as this document, data category names are enclosed in forward slashes (e.g. /part of speech/).
[SOURCE: ISO 30042, 3.8, modified — admitted term “DC” added.]
grammatical feature
property associated with a word form (3.1.13) to describe one of its grammatical attributes
EXAMPLE grammaticalGender.
orthography
systematic way of spelling or writing lexemes (3.1.9) that conforms to a conventionalized use
Note 1 to entry: Usually, the notion of orthography covers standardized spellings of alphabetic languages, such as standard UK or US English, or reformed German spelling, as well as hieroglyphic or syllabic writing systems.
onomasiology
approach to the investigation of word meaning which takes a given concept (3.12.1.3) as a starting point and studies the different lexical items (3.2.3) in a language (3.16.1) or languages that are used to refer to it
etymology
origin and historical development of any aspect of a given lexical item (3.2.3)
etymologizable
meeting the conditions for having an etymology (3.9.6)
Note 1 to entry: "Etymologizable" is a category of lexical elements and usages (encompassing for instance lexical entries (3.2.2), word senses (3.1.12), word forms (3.1.13)).
etymon
lexical entry (3.2.2) from which another lexical entry is derived
Note 1 to entry: An etymon can also be simply an earlier stage of a lexical item (3.2.3).
cognate
form in a related language (3.16.1) which shares a common etymological origin with a form in the language of the lexicon (3.2.1.1)
syntactic behaviour
one of the possible alternations that a lexeme (3.1.9) can show, at the syntactic level
EXAMPLE A verb can have different types of syntactic behaviours for subcategorization frame (3.6.7) alternations, such as the active voice, the passive voice, reflexive, etc.
Note 1 to entry: A syntactic behaviour is described in terms of subcategorization frames ([34], [35]).
semantic argument
formal semantic unit that is an essential constituent of a predicate-argument structure and can have variable instantiations depending on the utterance (3.7.2.2)
semantic predicate
formal semantic unit that represents a semantic relation between one or more semantic arguments (3.9.11) in a predicate-argument structure
3.10 Multilingual Information Framework
adornment
data category (3.9.2) attached to a component of a metamodel (3.7.13.1)
inline code
inline instructions inserted in a source document
Note 1 to entry: Inline code can, for instance, provide presentational instructions (e.g. HTML codes).
subtitle
textual version of the dialog in films, television programs, video games, etc.
Note 1 to entry: Subtitles are usually displayed at the bottom of the screen.
working language
language (3.16.1) in which linguistic sequences are expressed
3.11 Persistent Identification and Sustainable Access (PISA)
3.11.1 Resources
resource
〈persistent identification〉 digital object on the web with a specific identity that can be addressed with a Uniform Resource Identifier (3.11.2.1.2)
Note 1 to entry: A resource can have several representations. Depending on the PID framework (3.11.2.2), identification of a specific representation can be encoded in the identifier (3.11.2.1) or be left to the content negotiation process ([36]) between the web client (3.11.3.5) that uses the resolved PID (3.11.2.1.1) to fetch the resource and the resource server (3.11.3.2).
[SOURCE: Adapted from IETF RFC 3986.]
language resource
resource (3.11.1.1) that provides information about one or more languages (3.16.1)
Note 1 to entry: Language resources cover lexicographical, terminological, morpho-syntactical, corpus-related, or semantic resources or digital resources used to study linguistic phenomena like texts (3.16.9) and multimedia/multimodal recordings. They are created and used by linguists, information specialists, lexicographers and terminologists, among others. They frequently comprise many small records (3.12.2.2) compiled within a larger work, and are often authoritative in nature, such as standardized terminologies and glossaries issued by standards bodies such as ISO, IETF, W3C, etc.
complex resource
resource (3.11.1.1) consisting of multiple constituent parts, each of which can be accessed individually
Note 1 to entry: A complex resource can be a federated resource if its constituent parts are distributed over different digital repositories (3.11.1.3).
abstract resource
non-network-retrievable resource (3.11.1.1) identified by a Uniform Resource Identifier (3.11.2.1.2)
Note 1 to entry: It is common practice, for example in RDFS (RDF Schema) or OWL (web ontology language) ontologies, to identify abstract resources using Uniform Resource Identifiers (URIs) (3.11.2.1.2). Web architecture does not require any information resource to be retrievable with this kind of URI. If an identifier (3.11.2.1) for an abstract resource is not meant to be dereferenced (3.11.4.3), such as can be the case with an XML namespace (3.12.4.5) URI, it is not meaningful to issue a PID (3.11.2.1.1) for this resource.
Note 2 to entry: Abstract resources are usually concepts such as a class or property.
version
particular form or variation of a resource (3.11.1.1) that differs from other instantiations of the resource in at least one aspect or item of information
Note 1 to entry: Versions are often identified in sequential order (e.g. Version 1, 2, etc.), but version identification of dynamic resources subject to frequent change is often achieved by assigning a date-time stamp.
digital repository
repository
facility that provides reliable access to managed digital resources (3.11.1.1)
digital archive
archive
digital repository (3.11.1.3) dedicated to the long-term preservation of its associated data
Note 1 to entry: Often the data in digital archives are also available online, which highlights the need for reliable persistent identifiers (PIDs) (3.11.2.1.1).
collection
〈resource identification and access〉 grouping of any number of resources (3.11.1.1) that need to be referenced as a whole
published collection
purposefully built collection (3.11.1.4) of resources (3.11.1.1) that is maintained as an independent entity by a digital archive (3.11.1.3.1) or digital repository (3.11.1.3) and for which adequate citation (3.11.1.9) information is available
resource collection incarnation
incarnation
virtual embodiment of a disparate, otherwise non-aggregated collection (3.11.1.4) assembled for a specific purpose that is referenced by a single PID (3.11.2.1.1) concatenated with a resource part identifier (3.11.2.1.4) in order to access the components of the collection
Note 1 to entry: A bibliography or index can use a single PID together with extensions to provide access to components in a set of resources (3.11.1.1) used in the production of a monograph or project without actually collecting the physical files in one location, which is to say that the individual items remain in their original locations, but are referenced as parts of a virtual whole.
resource part
part
identifiable, accessible entity embedded in an independent resource (3.11.1.1) or in a larger part thereof
Note 1 to entry: Parts can be embedded in other parts. In dynamic web environments, subsetting into parts is subject to change and interpretation, which requires a certain level of user decision-making to designate and identify such sub-entities.
terminal part
resource part (3.11.1.6) that is not subdivided into smaller parts
internal part
resource part (3.11.1.6) that is both embedded in the resource (3.11.1.1) and subdivided into smaller parts
fragment
some portion or subset of a primary resource (3.11.1.1), some view on representations of the primary resource, or some other resource defined or described as a component of the resource defined or described by those representations
[SOURCE: Adapted from IETF RFC 3986.]
snapshot
instantaneous copy of a resource (3.11.1.1) representing the status of the resource or collection (3.11.1.4) at a single point in time
citation
information object containing information that directs a reader's or user's attention from one resource (3.11.1.1) to another
reference
〈resource identification and access〉 digital object (3.7.4.5) that links to data stored elsewhere
Note 1 to entry: Although citation (3.11.1.9) and reference are commonly used as near-synonyms, citations provide information for human readers and users, while references include the precise location where the referenced resource (3.11.1.1) can be found. References can be machine-readable, and can be configured as actionable given the required criteria.
annotation tier
separate information layer containing comments, notes, explanations, or other types of external remarks that can be attached to a resource (3.11.1.1)
Note 1 to entry: For instance, maps or images can be annotated (3.2.5) with supplemental information, or text corpora can be annotated in either in-line or standoff mode.
3.11.2 Identifiers
identifier
digital identifier
compact sequence of characters associated with digital, non-digital, or abstract entities
Note 1 to entry: Identifiers can apply to entities such as books, images, reports, metadata records (3.12.2.2.1), and events.
PID
persistent identifier
unique identifier (3.11.2.1) that ensures permanent access for a digital object by providing access to it independently of its physical location or current ownership
Note 1 to entry: Unique in this context means that the PID will not be issued again for other resources (3.11.1.1). However, the same PID can reference different representations or resource collection incarnations (3.11.1.5) at the discretion of the resource provider (3.11.3.1).
Uniform Resource Identifier
URI
sequence of characters that identifies a resource (3.11.1.1)
Note 1 to entry: IETF RFC 3986 defines the generic URI syntax and a process for resolving (3.11.4.1) URI references (3.11.1.10) that might be in relative form, along with guidelines (3.18.1.8) and security considerations for the use of URIs on the Internet.
actionable identifier
Uniform Resource Identifier (URI) (3.11.2.1.2) that has a resource-associated identifier (3.11.2.1) that is suitably encoded, such that when the URI is embedded in a web document and “clicked” on, the browser will be redirected to the resource (3.11.1.1), and possibly supplementary services related to the resource
Note 1 to entry: This functionality implies that the URI points to a suitable resolver proxy (3.11.3.8).
Note 2 to entry: In some PID frameworks (3.11.2.2), the PIDs (3.11.2.1.1) are URIs and are automatically actionable.
fragment identifier
identifier (3.11.2.1) used to reference a resource part (3.11.1.6) in a web context
Note 1 to entry: A fragment identifier component as defined in IETF RFC 3986 is indicated by the presence of a number sign (“#”) character and terminated by the end of the URI (3.11.2.1.2). Fragments (3.11.1.7) in the sense of this RFC are resolved (3.11.4.1) and retrieved from the resource (3.11.1.1) by the local client application (3.11.3.4).
Note 2 to entry: There is a W3C draft proposal to change this handling of fragments ([38]).
[SOURCE: Adapted from IETF RFC 3986.]
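The syntax described in Note 1 can be illustrated informally with the Python standard library, which separates the fragment identifier component from the rest of a URI at the number sign. The URI used here is a made-up example under example.org and does not identify a real resource.

```python
from urllib.parse import urldefrag

# Hypothetical URI whose fragment points at a part of the resource.
uri = "https://example.org/corpus/doc1.xml#s42"

# urldefrag splits at the first "#": the part before it is what a client
# would request from the server; the fragment is handled by the client.
base, fragment = urldefrag(uri)
```

Here base holds the URI without the fragment and fragment holds the string after the number sign, which mirrors the division of labour between resource server and local client application described in the note.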
resource part identifier
string of characters that refers to a resource part (3.11.1.6), and which can be identified by some means within a given resource type
EXAMPLE Such means include time for a media file, area for an image, or a record (3.12.2.2) in a data stream.
PID framework
scheme for specifying identifier strings [PID (3.11.2.1.1) scheme] for web-accessible digital objects together with a mechanism that enables the resolution of these identifiers (3.11.2.1) into the object's current Uniform Resource Identifiers (3.11.2.1.2)
Note 1 to entry: A PID framework in the sense of this International Standard facilitates access to both individual objects and to resource parts (3.11.1.6) and fragments (3.11.1.7) contained in such objects. A PID framework can be solely dependent on existing web resolution protocols or it can entail the interaction of proxy-based resolvers (3.11.3.7).
Note 2 to entry: A PID framework in the sense of this International Standard also allows resolution of other information associated with the PID.
URI naming scheme
top level of the Uniform Resource Identifier (URI) (3.11.2.1.2) naming structure
Note 1 to entry: Every scheme specifies its own syntax conventions for URIs.
Note 2 to entry: Typical URI schemes include http, https, ftp, mailto, etc. and are registered with IANA.
3.11.3 Roles, Institutions and Services
resource provider
organization that makes a resource (3.11.1.1) available online
Note 1 to entry: A resource can also be a service.
resource server
computer that ultimately provides access to the object referenced by a specific client application (3.11.3.4) request
archiving institution
institution responsible for maintaining a digital archive (3.11.1.3.1)
client application
software application that accesses a remote service usually on another computer system
web client
client application (3.11.3.4) capable of accessing resources (3.11.1.1) on the web using the HTTP protocol
resolution system
system designed to support the submission of a PID (3.11.2.1.1) to a network service in order to receive in return one or more pieces of current information related to the identified object
Note 1 to entry: The information can include, among others, a location (URI (3.11.2.1.2)) of the object or metadata (3.12.2.1).
PID resolver
resolver
software application that translates a PID (3.11.2.1.1) into another, more suitable identifier, i.e. that translates a resource PID into its Uniform Resource Identifier (3.11.2.1.2) and in this way points a client application (3.11.3.4) to the location of the resource (3.11.1.1)
HTTP resolver proxy
resolver proxy
application that implements a service supporting the use of urlified (3.11.4.2) PIDs (3.11.2.1.1) to access resources (3.11.1.1) or other PID-related information, or both
3.11.4 Actions
resolve
translate an identifier (3.11.2.1) into another name or address suitable for accessing a resource (3.11.1.1)
Note 1 to entry: The resolution process may require multiple steps in order to obtain a suitable address for a resource.
urlify an identifier
encode an identifier (3.11.2.1) as a suitable Uniform Resource Identifier (3.11.2.1.2)
Note 1 to entry: For example, this might be done with the purpose of creating an actionable identifier (3.11.2.1.2.1).
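A minimal Python sketch of urlifying an identifier is given below for illustration. It assumes a Handle-style PID written as "hdl:<prefix>/<suffix>" and the public proxy at hdl.handle.net; the function name urlify, the accepted input form and the sample identifier are assumptions of this sketch, not requirements of this document, and other PID frameworks use other conventions.

```python
from urllib.parse import quote

# Assumption: a Handle-style HTTP resolver proxy.
RESOLVER_PROXY = "https://hdl.handle.net/"


def urlify(pid: str, proxy: str = RESOLVER_PROXY) -> str:
    """Encode a PID of the form 'hdl:<prefix>/<suffix>' as an HTTP URI."""
    scheme, sep, name = pid.partition(":")
    if not sep or scheme.lower() != "hdl" or not name:
        raise ValueError(f"not a hdl: identifier: {pid!r}")
    # Percent-encode characters that are unsafe in a URI path; keep "/".
    return proxy + quote(name, safe="/")
```

For the made-up identifier "hdl:1234/abc-5" the function returns a URI under the proxy, which a web client can then resolve and dereference in the usual way.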
dereference
access the value referred to by a reference (3.11.1.10)
Note 1 to entry: When used within the context of dereferencing a URI (3.11.2.1.2), it means obtaining a representation of the resource (3.11.1.1) to which the URI points.
3.12 Infrastructure for Component Metadata
3.12.1 General Terms
registry
central directory designed for the persistent provision of negotiated information that can be reliably accessed
Note 1 to entry: A registry can be a software service that allows information to be registered and queried.
metadata component registry
component registry
registry (3.12.1.1) of metadata components (3.12.2.6) and metadata profiles (3.12.2.7) for their sharing
semantic registry
directory of (authoritative) definitions of terms (3.12.1.4), concepts (3.12.1.3) or data categories (3.9.2)
Note 1 to entry: These registries generally also provide persistent identifiers (3.11.2.1.1) for their entries.
concept registry
semantic registry (3.12.1.2) maintaining concepts (3.12.1.3)
EXAMPLE The CLARIN Concept Registry ([39]) as used in the CLARIN infrastructure.
concept
unit of knowledge created by a unique combination of characteristics
[SOURCE: ISO 1087, 3.2.7, modified — Note 1 to entry and Note 2 to entry have been deleted.]
term
designation that represents a general concept (3.12.1.3) in a specific domain or subject
EXAMPLE “planet”, “tower”, “pen”, “numeral”, “number”, “square root”, “logarithm”, “unit of measurement”, “base of a logarithm”, “chemical element”, “chemical compound”, “HP Laserjet 1100”, “Nobel Prize in Physics”.
Note 1 to entry: Terms may be partly or wholly verbal.
Note 2 to entry: Terms can include letters and letter symbols, numerals, mathematical symbols, typographical signs and syntactic signs (e.g. punctuation marks, such as hyphens, parentheses, square brackets and other connectors or delimiters), sometimes in character styles (i.e. fonts and bold, italic, bold italic, or other style conventions) governed by domain-, subject-, or language-specific conventions.
[SOURCE: ISO 1087, 3.2.7]
language tag
textual code used to assist in identifying language (3.16.1) in every mode of communication
Note 1 to entry: This includes constructed and artificial languages (3.16.1.5) but excludes languages not intended primarily for human communication, such as programming languages (see IETF BCP 47).
Note 2 to entry: Language tags may be used to assist in the identification of a language in every mode of communication, for example in spoken, written, signed, or otherwise signaled, communication.
concept reference
DEPRECATED:
reference (3.11.1.10) to the definition of a concept (3.12.1.3) in a concept registry (3.12.1.2.1)
concept link
reference (3.11.1.10) from a CMD profile (3.12.3.13), CMD component (3.12.3.3), CMD element (3.12.3.4), CMD attribute (3.12.3.5) or a value in a controlled vocabulary (3.12.2.14) to an entry in a semantic registry (3.12.1.2) via a Uniform Resource Identifier (3.11.2.1.2)
Note 1 to entry: Typically a concept link is provided as a persistent identifier (3.11.2.1.1).
media type
DEPRECATED: MIME type
specification used originally for textual, non-textual, multi-part message bodies of emails and which provides technical format information on data
EXAMPLE image/jpeg, image/svg+xml, text/plain, text/html, text/turtle, video/H264, application/xhtml+xml.
Note 1 to entry: There is a description in IETF RFC 6838.
Note 2 to entry: “MIME type” is the older term for “media type”. It is not used in standardization or technical specifications anymore.
Note 3 to entry: Registry of Internet media types is available at: https://www.iana.org/assignments/media-types.
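The values in the EXAMPLE share a common shape: a top-level type, a subtype and, in some cases, a structured syntax suffix introduced by "+" (as in image/svg+xml). The following Python sketch splits a media type string into these parts. The function name split_media_type is an assumption of this illustration, and the sketch does not handle parameters such as "; charset=utf-8".

```python
from typing import Optional, Tuple


def split_media_type(media_type: str) -> Tuple[str, str, Optional[str]]:
    """Split 'type/subtype[+suffix]' into (type, subtype, suffix)."""
    top, sep, subtype = media_type.strip().partition("/")
    if not sep or not top or not subtype:
        raise ValueError(f"not a media type: {media_type!r}")
    # The structured syntax suffix, if any, follows the last "+".
    base, plus, suffix = subtype.rpartition("+")
    if not plus:
        base, suffix = subtype, None
    # Type and subtype names are compared case-insensitively.
    return top.lower(), base.lower(), suffix.lower() if suffix else None
```

Applied to "image/svg+xml", the function separates the suffix "xml" from the subtype "svg"; for "text/plain" the suffix is absent.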
resource collection
collection
〈CMDI〉 grouping of multiple, different constituting elements, each of which is independent of the others and may be accessed individually
Note 1 to entry: A collection can be a virtual collection if its constituent elements come from other different (virtual) collections, and possibly if the elements are distributed over different digital repositories (3.11.1.3).
metadata
resource (3.11.1.1) that is a description of another resource, usually given as a set of properties in the form of attribute-value pairs
Note 1 to entry: This description can contain information about the resource, aspects or parts of the resource and/or artefacts and actors connected to the resource.
record
structured information that can be read by software services
metadata record
metadata description
DEPRECATED: metadata
record (3.12.2.2) containing a description of a resource (3.11.1.1)
metadata schema
DEPRECATED: schema
specification of a format and structure for a metadata record (3.12.2.2.1)
Note 1 to entry: In the context of ISO 24622-1, a metadata schema is a machine-readable and verifiable format specification, usually defined in an XML Schema (3.12.4.6) language.
metadata element
resource property name that can be used in metadata (3.12.2.1) and that can be given a value
Note 1 to entry: A metadata element is referred to as metadata attribute in other communities.
EXAMPLE The DCMI elements ([42]).
metadata element set
metadata set
resource collection of metadata elements (3.12.2.4) used within a particular discipline, tradition, or practice to describe resources (3.11.1.1)
Note 1 to entry: A metadata set is more general than a metadata schema (3.12.2.3) in that it does not additionally specify the syntax (e.g. the DCMI elements ([42])).
metadata component
grouping of metadata elements (3.12.2.4) and metadata components (3.12.2.6) that can be used to describe a specific aspect of a resource (3.11.1.1)
EXAMPLE The biographical data of a person or the contact information for an organization.
metadata profile
set of metadata components (3.12.2.6) that can be used together to describe a resource (3.11.1.1) and be transformed into a metadata schema (3.12.2.3)
Note 1 to entry: A metadata profile can be transformed into different metadata schemas that are still logically equivalent (i.e. they give logically equivalent resource descriptions).
metadata element value scheme
value scheme
specification of the value domain of a metadata element (3.12.2.4)
cardinality
specification of the number of occurrences of a metadata component (3.12.2.6) or metadata element (3.12.2.4) in an instantiation
metadata editor
actor that creates metadata records (3.12.2.2.1) to describe specific resources (3.11.1.1)
metadata modeler
actor that creates new metadata schemas (3.12.2.3) for new types of resources (3.11.1.1) or new applications
Note 1 to entry: In ISO 24622-1, metadata schemas are created by producing metadata profiles (3.12.2.7), which in turn form specifications for a metadata schema.
metadata provider
〈organization〉 organization that makes metadata (3.12.2.1) available
metadata provider
〈software service〉 software service that makes metadata (3.12.2.1) available
controlled vocabulary
DEPRECATED: closed vocabulary
DEPRECATED: open vocabulary
〈CMDI〉 set of values that can be used either to constrain the set of permissible values or to provide suggestions for applicable values in a given context
open vocabulary
set of items forming part of the value domain of a metadata element (3.12.2.4) on the recommendation of the metadata modeler (3.12.2.11)
closed vocabulary
limited set of items that forms the mandatory value domain of a metadata element (3.12.2.4)
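The practical difference between the two kinds of vocabulary can be sketched in Python as follows. The function name check_value, the three result labels and the sample value domain are assumptions made for this illustration only.

```python
def check_value(value: str, vocabulary: set, closed: bool) -> str:
    """Classify a candidate value against a vocabulary.

    Returns "valid" if the value is listed, "invalid" if it is not listed
    and the vocabulary is closed, and "unlisted" if it is not listed but
    the vocabulary is open (the listed items are only recommendations).
    """
    if value in vocabulary:
        return "valid"
    return "invalid" if closed else "unlisted"


# Hypothetical value domain for a metadata element.
MEDIA = {"audio", "video", "text"}
```

With a closed vocabulary, a value such as "hologram" is rejected; with an open vocabulary, the same value is accepted but flagged as lying outside the recommended items.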
Unified Modeling Language
UML
language (3.16.1) for specifying, visualizing, constructing, and documenting the artifacts of software systems and abstract models in general
3.12.3 Component Metadata Infrastructure (CMDI)
CMDI
component metadata infrastructure
metadata description framework consisting of the CMD model (3.12.3.2) and infrastructure to process instances of parts of the model
CMD model
component metadata model
metadata model that is based on CMD components (3.12.3.3)
Note 1 to entry: For a specification see ISO 24622-1.
CMD component
component
reusable, structured template for the description of (an aspect of) a resource (3.11.1.1), defined by means of a CMD specification (3.12.3.6) document with the potential of including other CMD components, either through reference or inline definition
CMD root component
CMD component (3.12.3.3) that is defined at the highest level within a CMD profile (3.12.3.13) and that may have one or more child CMD components but no siblings
Note 1 to entry: In the CMD instance payload (3.12.3.12), it is instantiated exactly once.
inline CMD component
CMD component (3.12.3.3) that is created and stored within another CMD component and cannot be addressed from other CMD components
CMD element
element definition
unit within a CMD component (3.12.3.3) that describes the level of the CMD instance (3.12.3.9) that can carry atomic values (3.3.1.1) governed by a value scheme (3.12.3.17), and does not contain further levels except for that of the CMD attribute (3.12.3.5)
CMD attribute
unit within a CMD element (3.12.3.4) that describes the level at which properties of a CMD element (3.12.3.4) can be provided by means of value-scheme (3.12.3.17)-constrained atomic values (3.3.1.1)
CMD specification
component specification
component definition
representation of a CMD component (3.12.3.3) or CMD profile (3.12.3.13), expressed using the constructs of the CCSL (3.12.3.18)
CMD specification header
component header
profile header
section of a CMD specification (3.12.3.6) marked as ‘header’, providing information on that CMD specification as such that is not part of the defined structure
CMD component registry
component registry
service where a CMD specification (3.12.3.6) can be registered and accessed
CMD instance
CMDI file
metadata instance
CMDI instance
metadata record
CMD record
file that conforms to the general CMD instance structure and, at the CMD instance payload (3.12.3.12) level, follows the specific structure defined by the CMD profile (3.12.3.13) to which it relates
Note 1 to entry: The general CMD instance structure is described in ISO 24622-2.
CMD instance envelope
section of a CMD instance (3.12.3.9) which is structured uniformly for all instances and contains the CMD instance header (3.12.3.11) and the list of resource proxies (3.12.3.15) which may be referenced from the CMD instance payload (3.12.3.12) section
CMD instance header
section of a CMD instance (3.12.3.9) marked as ‘header’, providing information on that CMD instance as such, not the resource (3.11.1.1) that is described by the metadata file
CMD instance payload
section of a CMD instance (3.12.3.9) that follows the structure defined by the CMD profile (3.12.3.13) it references and contains the description of the resource (3.11.1.1) to which that CMD instance relates
CMD profile
profile
structured template for the description of a class of resources (3.11.1.1) providing the complete structure for a CMD instance payload (3.12.3.12) by means of a hierarchy of CMD components (3.12.3.3)
CMD profile schema
schema definition by which the correctness of a CMD instance (3.12.3.9) with respect to the CMD profile (3.12.3.13) it pertains to can be evaluated
Note 1 to entry: The CMD profile schema can be expressed as an XML Schema (3.12.4.6) but also in other XML schema languages.
resource proxy
CMD resource proxy
DEPRECATED: CMD resource reference
representation of a resource (3.11.1.1) within a CMD instance (3.12.3.9) containing a Uniform Resource Identifier (3.11.2.1.2) as a reference (3.11.1.10) to the resource itself and an indication of its nature
resource proxy reference
reference (3.11.1.10) from any point within the CMD instance payload (3.12.3.12) to any of the resource proxy (3.12.3.15) elements
value scheme
〈CCSL〉 set of constraints governing the range of values allowed for a specific CMD element (3.12.3.4) or CMD attribute (3.12.3.5) in a CMD instance (3.12.3.9), expressed in terms of an XML Schema datatype (3.12.4.9), controlled vocabulary (3.12.2.14), or regular expression (3.12.4.10)
CCSL
CMDI component specification language
XML (3.12.4.1)-based language for describing a CMD component (3.12.3.3) and a CMD profile (3.12.3.13) in accordance with the CMD model (3.12.3.2)
3.12.4 Extensible Markup Language (XML)
XML
markup language for describing hierarchical structures within a text (3.16.9) file
XML document
document represented in XML (3.12.4.1)
XML element
constituent of an XML document (3.12.4.2)
XML container element
XML element (3.12.4.3) that has one or more XML elements as its descendants
XML attribute
property of an XML element (3.12.4.3)
foreign attribute
〈CMDI〉 XML attribute (3.12.4.4) defined in an XML namespace (3.12.4.5) other than those declared in CMDI (3.12.3.1), to be included in a CMD instance (3.12.3.9) as additional information targeted to specific receivers or applications
XML namespace
namespace
method for qualifying element and attribute names used in XML (3.12.4.1)
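As an informal illustration, the Python sketch below parses a small XML document whose element names are qualified by a namespace. The namespace URI, the prefix "md" and the element names are invented for this example and do not correspond to any namespace defined by CMDI.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI bound to the prefix "md".
NS = {"md": "http://example.org/ns/md"}

doc = (
    '<md:record xmlns:md="http://example.org/ns/md">'
    "<md:title>Sample</md:title>"
    "</md:record>"
)
root = ET.fromstring(doc)

# ElementTree expands qualified names to "{namespace-URI}local-name".
qualified_root_name = root.tag

# A prefix map lets the query use the same prefixes as the document.
title = root.find("md:title", NS).text
```

The expanded form of the root element name shows that the prefix is only shorthand: two elements with the same local name but different namespace URIs are different names.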
XML Schema document
XML Schema
document that complies with the XML Schema recommendation, as defined in [44]
XML attribute declaration
constituent of an XML Schema document (3.12.4.6) that constrains the structure and content of a specific XML attribute (3.12.4.4)
XML element declaration
constituent of an XML Schema document (3.12.4.6) that constrains the structure and content of a specific XML element (3.12.4.3)
XML schema datatype
predefined set of permissible content within an XML element (3.12.4.3) or an XML attribute (3.12.4.4) of an XML document (3.12.4.2) used in an XML Schema
regular expression
sequence of characters that denotes a set of strings
Note 1 to entry: When used to constrain a lexical space, a regular expression asserts that only strings in the defined set of strings are valid literals for values of that type.
Note 2 to entry: See also [45], Appendix F.
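The use of a regular expression to constrain a lexical space, as described in Note 1, can be sketched in Python. The pattern (exactly three lower-case ASCII letters) and the function name is_valid_literal are assumptions of this illustration. Python's regular expression syntax also differs in detail from that of XML Schema, so the sketch shows the principle rather than the XML Schema dialect; whole-string matching is used because the entire literal has to belong to the denoted set.

```python
import re

# Hypothetical lexical space: exactly three lower-case ASCII letters.
CODE = re.compile(r"[a-z]{3}")


def is_valid_literal(text: str) -> bool:
    """Return True if the whole string is in the set denoted by CODE."""
    # fullmatch succeeds only if the pattern covers the entire string.
    return CODE.fullmatch(text) is not None
```

A literal such as "deu" is accepted, whereas "de" and "deu1" are rejected because they fall outside the denoted set of strings.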
3.13 Corpus Query Lingua Franca (CQLF)
CQL
corpus query language
formal language (3.16.1.3) designed to retrieve specific information from (large) language data collections, and thereby incorporate certain abstractions over commonly shared data models that make it possible for the end user (3.13.9) (or user agents) to address parts of those data models
Note 1 to entry: A CQL defines a syntactic notation for query expression (3.13.17) and the corresponding search semantics, i.e. an intensional specification of the intended result set. For most current CQLs, semantics are implicitly defined by a particular implementation.
CQLF implementation
query language that has been analysed with respect to the criteria described by the CQLF Metamodel, and thus has been “located” in the proposed feature matrix as “conformant with CQLF”
CQLF class
top-level division in the CQLF data model
Note 1 to entry: The CQLF Metamodel distinguishes two classes: Single-stream (where the annotation structure (3.7.11.1) is built upon a single data stream, typically a character stream) and Multi-stream (corresponding to e.g. multi-modal corpora or parallel corpora).
CQLF level
part of the matrix of QL properties, defined in terms of the general features of the assumed corpus data models, and consequently the set of properties of a corpus query language (3.13.1) that is used to address these features
Note 1 to entry: The CQLF Metamodel distinguishes three levels of complexity within the Single-stream class: Linear, Complex and Concurrent.
CQLF module
subcomponent of the CQLF metamodel, defined with reference to a specified data-model characteristic
Note 1 to entry: The CQLF metamodel currently distinguishes three modules within CQLF Level 1, Linear (plain-text, segmentation and simple annotation (3.2.7.6)), and three modules within CQLF Level 2, Complex (hierarchical, dependency and containment).
Note 2 to entry: In ISO 24623-2, the containment module is formalized by the concept SpanContainment in order to avoid terminological ambiguity.
concurrent annotations
multiple, potentially conflicting annotations (3.2.7) describing, entirely or partly, the same character span (3.13.7) or an overlapping sequence of character spans
Note 1 to entry: Concurrent annotations may be expected to conflict in several ways: content-wise (with different tags for the same character span), structure-wise (assuming different structural arrangements within the targeted character spans), and also in terms of segment edges (which is typically due to structurally conflicting claims concerning the encompassing character spans). Concurrent annotations typically come from different sources (e.g. tools or human annotators) or result from different settings (e.g. different parsing models or segmentation rules) within a single tool. When encoded in XML (3.12.4.1), concurrent annotations are typically expressed by means of stand-off techniques.
character span
sequence of characters, identified by start and end offsets, to which an annotation (3.2.7) may be applied
Note 1 to entry: Cf. region (3.5.3).
character span containment
relation between character spans (3.13.7) of primary data (3.2.4) in which character span A contains character span B if the initial offset of span A is lower than or equal to that of span B, and the final offset of span A is higher than or equal to that of span B
Note 1 to entry: The relation of character span containment is used for stating a relationship between two or more character spans or simple annotation (3.2.7.6), without the need to utilize tree-based concepts and mechanisms. Instead of tree traversal, operators such as contains, in or within are typically used for character span containment queries.
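A containment test on offsets alone, without any tree structure, can be sketched in Python as follows. Span A is taken to contain span B when A starts at or before B and ends at or after B. The class name Span and the function name contains are assumptions of this illustration, and the sketch does not fix whether end offsets are inclusive or exclusive, since the comparison is the same in both conventions.

```python
from typing import NamedTuple


class Span(NamedTuple):
    """Character span identified by start and end offsets."""
    start: int
    end: int


def contains(a: Span, b: Span) -> bool:
    """Return True if span a contains span b (equal spans count)."""
    return a.start <= b.start and b.end <= a.end


sentence = Span(0, 40)   # hypothetical sentence-level span
token = Span(5, 9)       # hypothetical token-level span
```

A query operator such as contains or within can then be evaluated by comparing offsets of the two annotation layers directly, for example contains(sentence, token).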
end user
agent who uses a CQL (3.13.1) to satisfy his or her search needs (3.13.10)
Note 1 to entry: This can be done via an interactive graphical user interface (GUI), a command-line tool, programmatically via some application programming interface (API) or by a software programme developed by the end user.
search need
information pattern that an end user (3.13.9) wants to locate in a corpus (3.18.1.1), based on the primary data stream and/or simple or complex annotation (3.2.7)
CQL capability
facility provided by CQLs (3.13.1) to meet a specific aspect of search needs (3.13.10)
CQLF ontology
ontology for a fine-grained description of the expressive power of CQLs (3.13.1) in terms of search needs (3.13.10), which adheres to the structure specified in ISO 24623-2
functionality
label for a concept (3.12.1.3) in a CQLF ontology (3.13.12) that represents a family of CQL capabilities (3.13.11) contributing to the expressive power of a CQL (3.13.1), formulated at a general level and linked to one or more CQLF modules (3.13.5)
frame
label for a concept (3.12.1.3) in a CQLF ontology (3.13.12) that represents a typical search need (3.13.10) of end users (3.13.9), understood as one facet of the expressive power of a CQL (3.13.1)
Note 1 to entry: Most frames arise from the specialization of a functionality (3.13.13) and/or the combination of multiple functionalities.
use case
label for a concept (3.12.1.3) in a CQLF ontology (3.13.12) that represents a concrete instantiation of a frame (3.13.14), for which it can be determined unambiguously whether a given query expression (3.13.17) satisfies the search need (3.13.10) or not
Note 1 to entry: Use cases are often parameterized, i.e. they contain variable elements. Parameterized use cases are satisfied by parameterized query expressions.
layer
totality of concepts (3.12.1.3) at the same level of abstraction in a CQLF ontology (3.13.12)
EXAMPLE Functionalities (3.13.13), frames (3.13.14), use cases (3.13.15).
query expression
string that is syntactically valid in a given CQL (3.13.1) and can be executed to return a result set
Note 1 to entry: Query expressions are often parameterized with variable elements. No formal specification of the parameter substitution procedure is attempted, but entries for parameterized query expressions in the ontology are required to include informal descriptions of the range of admissible values and any transformations required.
parameter
variable element in a query expression (3.13.17) or in the description of a search need (3.13.10)
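The relationship between a parameter and a parameterized query expression can be sketched as simple template substitution. The bracket syntax of the query, the parameter name lemma and the escaping rule below are hypothetical and are not tied to any particular CQL; they only illustrate the kind of informal transformation that an ontology entry would describe.

```python
from string import Template

# Hypothetical parameterized query expression; "$lemma" is the parameter.
# The bracket syntax is illustrative and not taken from a specific CQL.
QUERY_TEMPLATE = Template('[lemma="$lemma"]')


def instantiate(lemma: str) -> str:
    # Transformation assumed for this sketch: escape double quotes so that
    # the value cannot terminate the quoted string early.
    escaped = lemma.replace('"', '\\"')
    return QUERY_TEMPLATE.substitute(lemma=escaped)
```

Calling instantiate("run") yields a concrete query expression in which the parameter has been replaced by an admissible value.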
positive conformance statement
assertion that a given CQL (3.13.1) supports a given use case (3.13.15) by means of a query expression (3.13.17)
negative conformance statement
assertion that a given CQL (3.13.1) cannot support a given use case (3.13.15), frame (3.13.14) or functionality (3.13.13)
Note 1 to entry: Negative conformance is due to technical unavailability of specific capabilities in the respective CQL or limitations on the complexity of query expressions (3.13.17).
3.14 Word Segmentation of Written Texts
word segmentation
process of splitting text (3.16.9) into a sequence of word segmentation units (3.14.2)
word segmentation unit
WSU
word form (3.1.13) or character string of some other type that is treated as a unit
Note 1 to entry: A character string that is not a word form may consist of numeric characters, foreign characters, punctuation marks or other miscellaneous characters such as Chinese radicals, chemical symbols (e.g. H2O) or a mixture of Latin and numeric characters (e.g. F16).
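A minimal sketch of word segmentation is given below for illustration. It assumes a whitespace-delimited writing system and a deliberately simple rule; the pattern and the function name segment are assumptions of this sketch. Languages written without spaces between words generally require lexicon-based or rule-based segmentation, which is not shown here.

```python
import re

# Assumed rule for this sketch: a WSU is either a run of letters and digits
# or a single punctuation mark. Whitespace only separates units.
WSU_PATTERN = re.compile(r"\w+|[^\w\s]")


def segment(text: str) -> list[str]:
    """Split text into a sequence of word segmentation units."""
    return WSU_PATTERN.findall(text)
```

Under this rule, mixed strings such as F16 and H2O are each kept as a single unit, and the sentence-final full stop forms a unit of its own.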
lexicalization
process of making a linguistic unit function as a word (3.1.9.1)
Note 1 to entry: Such a linguistic unit can be a single morph (3.1.7), e.g. “laugh”, a sequence of morphs, e.g. “apple pie”, or even a phrase (3.1.25), such as “kick the bucket”, that forms an idiomatic phrase.
reduplication
process in which the entire word (3.1.9.1), or part of it, is repeated
3.15 Transcription of Spoken Language
spoken language
oral language (3.16.1) produced by a person’s vocal system
paralinguistic feature
feature of spoken language (3.15.1) beyond the individual sound(s)
EXAMPLE voice quality, pitch, volume, intonation
transcription system
theoretically founded set of principles and rules detailing what spoken language (3.15.1) phenomena are to be transcribed, and how they are to be transcribed
transcriber
person who carries out the transcription (3.1.30)
orthographic transcription
representation or modelling of spoken language (3.15.1) based on the orthography (3.9.4) of the respective language (3.16.1)
phonetic transcription
representation or modelling of spoken language (3.15.1) based on the sound system of the respective language (3.16.1)
dependent annotation
annotation (3.2.7) which does not refer directly to an audio or video recording, but to another annotation
Note 1 to entry: Typically, a dependent annotation refers to an orthographic transcription (3.15.5) or phonetic transcription (3.15.6).
milestone element
empty XML element (3.12.4.3) used to indicate a boundary point
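The following sketch illustrates how milestone elements can mark boundary points inside a transcribed utterance. The element names u and boundary and the attribute n are invented for this example and are not prescribed by this document.

```python
import xml.etree.ElementTree as ET

# Hypothetical transcription fragment; the element name "boundary" and
# its attribute "n" are invented for this sketch.
FRAGMENT = '<u>okay <boundary n="1"/>so we start <boundary n="2"/>now</u>'

root = ET.fromstring(FRAGMENT)

# A milestone element is empty: it has no text content and no children.
milestones = [
    el for el in root.iter("boundary") if el.text is None and len(el) == 0
]

# The text that follows each boundary point is held in the element's tail.
segments = [(el.get("n"), (el.tail or "").strip()) for el in milestones]
```

Here segments pairs each boundary identifier with the stretch of transcription that follows it, without the boundary elements themselves enclosing any content.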
3.16 Controlled Natural Language (CNL) / Controlled Human Communication (CHC)
language
system of signs paired with meanings and thus used as a means of conveying information
natural language
NL
language (3.16.1) of unknown origin that develops continuously, sometimes in idiosyncratic ways, as it is used conventionally for human communication
simplified language
language (3.16.1) generated through a simplification (3.16.15) process
formal language
language (3.16.1) that has been devised for logical inferences or programming applications with a finite list of symbols and a finite set of formation rules based on these symbols that define well-formed sentences (3.1.27) and also with a system that interprets these sentences
special language
special-purpose language
SPL
language (3.16.1) used in a subject-specific field and also characterized by the use of specific linguistic means of expression
Note 1 to entry: The more strictly the conventions of an SPL are systematized and made obligatory, the more they converge with controlled natural language (3.16.2).
artificial language
language (3.16.1) that has been specifically devised for some applications
Note 1 to entry: The grammar of an artificial language is formulated systematically for the specific purposes of its use in practical applications, especially in the area of human or human-machine communication.
controlled natural language
CNL
controlled language
subset of natural languages (3.16.1.1) whose grammars and dictionaries have been restricted in order to reduce or eliminate both ambiguity and complexity
Note 1 to entry: As a generic, CNL is an uncountable noun that refers to the abstract properties of all controlled natural languages and not to a particular natural language or application for a specific purpose. It is engineered (i.e. constructed) with a view to reducing or eliminating ambiguity and complexity and aims both to make it easier for human readers [particularly non-native users, non-experts, and people with limited comprehension (3.16.11)] to read a text (3.16.9) and to improve the computational processing of a text.
Note 2 to entry: CNL is an engineered (i.e. constructed) language that is based on a particular natural language, but is more restrictive as regards lexicon, syntax (3.1.24), or semantics, while at the same time preserving most of its natural properties. Here, CNL is a countable noun.
plain language
communication in which wording, structure and design are so clear that intended readers can easily find what they need, understand what they find, and use that information
[SOURCE: ISO 24495-1, 3.1]
technical communication
process of defining and creating information for use to be delivered as information products for the safe, effective, and efficient use of a supported product throughout its life cycle
[SOURCE: ISO 24183, 3.1.1, modified — Notes 1 to 3 to entry deleted.]
internationalization
process of generalizing a product so that it can handle multiple languages and cultural conventions without the need for re-design
Note 1 to entry: Internationalization takes place at the level of programme design and document development.
localization
process of taking a product and making it linguistically and culturally appropriate to the target locale (country/region and language) where it will be used and sold
Note 1 to entry: The term derives from “locale”: a place where something particular happens or is done. Translation (T9n) is one of the activities in localization.
basic principles and methodology for stylistic guidelines
BSG
guidelines (3.18.1.8) specifying common writing rules applicable to many languages
keyword
word (3.1.9.1) or phrase (3.1.25) used to describe the main content (3.16.10) (nouns and verbs) of a document in a consistent manner
text
data in the form of character arrangements intended to convey a meaning and whose interpretation is essentially based on the knowledge of some natural language (3.16.1.1) or artificial language (3.16.1.5)
[SOURCE: ISO/IEC 2382‑1:1993]
Note 1 to entry: Character arrangements are, among others, characters, symbols, words (3.1.9.1), phrases (3.1.25), paragraphs, sentences (3.1.27) or tables.
content
information content
information contained in or conveyed by a language (3.16.1)
Note 1 to entry: The information can be in written or spoken form or other forms such as images.
comprehension
process of understanding the content (3.16.10) of a document
content management
〈language resource management〉 process of controlling the content (3.16.10) of a text (3.16.9) or the media in general while analysing or revising it
Note 1 to entry: This includes version control of revised documents, contents in versions of similar documents, and the management of relations between items in a document.
authoring
writing a document
Note 1 to entry: Documents include, among others, reports, manuals, articles, or books.
pre-editing
process of modification of a text (3.16.9) before it is submitted to a specific processing
Note 1 to entry: A specific processing can be machine translation.
simplification
process of reducing complexity
Note 1 to entry: Simplified language (3.16.1.2) is the result of a simplification of content (3.16.10).
rewriting
producing a new version of a text (3.16.9) by changing its lexical, sentential, or textual structures while keeping its original content (3.16.10)
re-use
use a document or data for purposes in addition to those for which it was originally designed
Note 1 to entry: Ability to use existing documents for new documents. This includes making a product manual for a new version of the product and one for a similar version.
cooperative work
activity or result of working together to achieve the same goal
Note 1 to entry: Work carried out by more than one person in a collaborative way (e.g. technical communicators and editors putting together a manual).
readability
ease of processing a text (3.16.9) for its comprehension (3.16.11)
tractability
computational tractability
capability of being controlled, analysed, or generated
interoperability
〈language resource management〉 achievement of partial or total compatibility between heterogeneous data models by the mapping of metadata (3.12.2.1)
controlled vocabulary
CV
〈CNL/CHC〉 list of lexical or phrasal items that are selected for the purpose of improving readability (3.16.19) in a particular domain
Note 1 to entry: Most controlled vocabularies target a specific, narrow domain. Unlike controlled natural language (3.16.2), they do not deal with grammatical issues [i.e. how to combine the terms needed to write complete sentences (3.1.27)], but a good number of CNL approaches, especially domain-specific ones, include controlled vocabularies.
synonym
one of a set of different terms (3.12.1.4) that refer to the same entity
[SOURCE: ISO/IEC 2382:2015, 2121523, modified — The Notes to entry have been removed.]
paronym
word (3.1.9.1) for which the writing or pronunciation is very close to another word, but which has a different lexical meaning
distinctive feature
class of phonetically defined components of phonemes (3.1.2) that function to distinguish meaning
Note 1 to entry: In contrast to redundant features, distinctive features constitute relevant phonological features.
Note 2 to entry: See also [49], p. 134.
assimilation
articulatory adaptation of one sound to a nearby sound within a word (3.1.9.1) or at the junction between words with regard to one or more features
Note 1 to entry: See also [49], p. 40.
interference
influence of one linguistic system on another in either the individual speaker (3.7.2.3.1.1) or the speech community
Note 1 to entry: See also [49], p. 235.
3.17 Lexico-Morpho-Syntactic Principles and Methodology for Personal Data Recognition and Protection in Text
seme
Saussure’s signified with its different signifiers (instantiations) in text (3.16.9)
Note 1 to entry: Saussure was the first person to use the terminology “signified” and “signifier”. Saussure offered a “dyadic” or two-part model of the sign. He defined a sign as being composed of a “signifier” (signifiant) and a “signified” (signifié) (see [50] and [51]).
intension
set of characteristics that make up a concept
[SOURCE: ISO 1087-1:2000, 3.2.9, modified]
indicant
significant occurrence of interaction between lexical, morphological and syntactic phenomena or of one of these phenomena across a wide spectrum of languages (3.16.1) or in few languages or in just one language that is suited to identify personal data (3.17.5)
identifiable natural person
data subject
person who can be identified, directly or indirectly, in particular by reference to an identifier
Note 1 to entry: An identifier can be a name, an identification number, location data or an online identifier of a natural person. Further examples which are excluded from the examples in this document are references to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of the natural person.
[SOURCE: [52], Article 4 (1)]
personal data
any information relating to an identified or identifiable natural person (3.17.4)
[SOURCE: [52], Article 4 (1)]
processing
any operation or set of operations which is performed on personal data (3.17.5) or on sets of personal data, whether or not by automated means, such as collection, recording, organization, structuring, storage, adaptation or alteration, retrieval, consultation, use, disclosure by transmission, dissemination or otherwise making available, alignment or combination, restriction, erasure or destruction
[SOURCE: [52], Article 4 (2)]
pseudonymization
processing (3.17.6) of personal data (3.17.5) in such a manner that the personal data can no longer be attributed to a specific identifiable natural person (3.17.4) without the use of additional information, provided that such additional information is kept separately and is subject to technical and organizational measures to ensure that the personal data are not attributed to an identified or identifiable natural person
[SOURCE: [52], Article 4 (5), modified – “data subject” replaced with most preferred term “identifiable natural person” within definition.]
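The idea of keeping the additional information separate can be sketched as follows. This is an illustration under stated assumptions and not a compliance-ready procedure: the function name pseudonymize, the PERSON_ prefix and the use of a keyed hash are choices made for this sketch, and the list of names is assumed to have been identified beforehand.

```python
import hashlib
import hmac


def pseudonymize(text: str, names: list[str], key: bytes) -> tuple[str, dict[str, str]]:
    """Replace each listed name with a keyed pseudonym.

    Returns the pseudonymized text and a lookup table. The table is the
    "additional information" and is meant to be stored separately.
    """
    table: dict[str, str] = {}
    for name in names:
        # Keyed hash, so the pseudonym cannot be recomputed without the key.
        digest = hmac.new(key, name.encode("utf-8"), hashlib.sha256).hexdigest()
        pseudonym = f"PERSON_{digest[:8]}"
        table[pseudonym] = name
        text = text.replace(name, pseudonym)
    return text, table
```

The returned text no longer contains the listed names, and re-attribution requires the separately held table or key.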
3.18 Corpus Annotation Project Management
3.18.1 Corpus Annotation
corpus
collection (3.3.2) of natural language (3.16.1.1) data
[SOURCE: ISO 1087, 3.6.4, modified — The preferred term “text corpus” deleted. Note 1 to entry deleted.]
corpus annotation
action of adding interpretative linguistic or non-linguistic information to a corpus (3.18.1.1)
[SOURCE: [53], modified — “non-linguistic” added.]
annotation scheme
description of the structure of annotation (3.2.7)
annotation layer
layer for annotation (3.2.7) of a corpus
EXAMPLE Syntactic layer, lexical-semantic layer, entity layer.
annotation unit
specific segment of primary data (3.2.4) that is identified and labelled according to an annotation scheme (3.18.1.3)
EXAMPLE Word (3.1.9.1), phrase (3.1.25), clause (3.1.26), sentence (3.1.27), utterance (3.7.2.2).
corpus annotation project
project (3.18.2.1) aimed at enhancing a collection of corpora (3.18.1.1) with metadata (3.12.2.1) or labels that provide additional linguistic, non-linguistic, semantic, or structural information to facilitate analysis, research and the development of natural language processing (3.9.1) tools
resource
〈corpus annotation〉 inputs needed for the establishment, implementation, maintenance and improvement of an organization and its processes
EXAMPLE People, infrastructure, environment, information, knowledge, suppliers, financial means.
[SOURCE: ISO/IEC/IEEE 24765, 3.3461]
guideline
official recommendation or advice that indicates policies, standards or procedures for how something should be accomplished
[SOURCE: ISO/IEC/IEEE 24765, 3.1774]
3.18.2 Project Management
project
temporary endeavour to achieve one or more defined objectives
[SOURCE: ISO 21502, 3.20]
project management
planning, organizing, monitoring, controlling and reporting of all aspects of a project (3.18.2.1), and the motivation of all those involved in it to achieve the project objectives
[SOURCE: ISO 22886, 3.9.7]
project charter
document that states the problem to be solved, the improvement goals, the project scope (3.18.2.4), the project milestones and the project roles and responsibilities
[SOURCE: ISO 13053-2, 2.26]
project scope
authorized work to accomplish agreed objectives
[SOURCE: ISO 21502, 3.25]
work breakdown structure
WBS
decomposition of the defined scope of a project (3.18.2.1) or programme into progressively lower levels consisting of elements of work
[SOURCE: ISO 21502, modified — Abbreviated term “WBS” added.]
schedule management plan
component of the project management (3.18.2.2) plan that establishes the criteria and the activities (3.18.2.8) for developing, monitoring and controlling the schedule
[SOURCE: ISO/IEC/IEEE 24765, 3.3619]
project phase
collection of logically related project activities (3.18.2.8) that culminates in the completion of one or more deliverables (3.18.2.21)
[SOURCE: ISO/IEC/IEEE 24765, 3.3181]
activity
identified piece of work that is required to be undertaken to complete a project (3.18.2.1), programme, portfolio or other related work
[SOURCE: ISO 21506, 3.2, modified — Note 1 to entry deleted.]
work package
group of activities (3.18.2.8) that have a defined scope, deliverable (3.18.2.21), timescale and cost
[SOURCE: ISO 21502, 3.30]
work package leader
work package team leader
role within project management (3.18.2.2) that is responsible for overseeing a specific work package (3.18.2.9)
process
systematic series of activities (3.18.2.8) directed towards causing an end result such that one or more inputs will be acted upon to create one or more outputs (3.18.2.20)
[SOURCE: ISO/IEC/IEEE 24765, 3.3037]
data validation
process (3.18.2.11) of systematically checking and verifying the accuracy, completeness and consistency of annotations (3.2.7) within the corpus (3.18.1.1) to ensure that the data meet predefined quality standards and guidelines (3.18.1.8)
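A minimal sketch of such a check is shown below for illustration. The label set, the representation of an annotation as a (start, end, label) triple and the two checks performed are assumptions of this sketch; an actual project would derive its checks from its own annotation scheme and guidelines.

```python
# Hypothetical annotation scheme: the set of labels annotators may use.
ALLOWED_LABELS = {"NOUN", "VERB", "ADJ"}


def validate(annotations, text_length, allowed=ALLOWED_LABELS):
    """Return a list of (index, problem) pairs; an empty list means no problem was found."""
    problems = []
    for i, (start, end, label) in enumerate(annotations):
        # Completeness/accuracy check: offsets must lie within the primary data.
        if not (0 <= start < end <= text_length):
            problems.append((i, "offsets outside the primary data"))
        # Consistency check: only labels defined in the scheme are accepted.
        if label not in allowed:
            problems.append((i, f"label {label!r} not in the scheme"))
    return problems
```

Each reported pair identifies the position of the offending annotation, so that it can be returned to an annotator for correction.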
process group
collection of related processes (3.18.2.11)
[SOURCE: ISO/IEC/IEEE 24765, 3.3057]
project communications management
set of processes (3.18.2.11) that are required to ensure timely and appropriate planning, collection, creation, distribution, storage, retrieval, management, control (3.18.2.22), monitoring and the ultimate disposition of project information
[SOURCE: ISO/IEC/IEEE 24765, 3.3156]
project cost management
set of processes (3.18.2.11) involved in planning, estimating, budgeting, financing, funding, managing and controlling costs so that the project (3.18.2.1) can be completed within the approved budget
[SOURCE: ISO/IEC/IEEE 24765, 3.3158]
project integration management
set of processes (3.18.2.11) and activities (3.18.2.8) needed to identify, define, combine, unify and coordinate the various processes and project management (3.18.2.2) activities within the project management process group (3.18.2.19)
[SOURCE: ISO/IEC/IEEE 24765, 3.3165]
project procurement management
set of processes (3.18.2.11) necessary to purchase or acquire products, services or results needed from outside the project team
[SOURCE: ISO/IEC/IEEE 24765, 3.3185]
project quality management
set of processes (3.18.2.11) and activities (3.18.2.8) of the performing organization that determine quality policies, objectives and responsibilities so that the project (3.18.2.1) will satisfy the needs for which it was undertaken
[SOURCE: ISO/IEC/IEEE 24765, 3.3186]
project scope management
set of processes (3.18.2.11) required to ensure that the project (3.18.2.1) includes all the work required, and only the work required, to complete the project successfully
[SOURCE: ISO/IEC/IEEE 24765, 3.3194]
project management process group
logical grouping of project management (3.18.2.2) inputs, tools and techniques, and outputs (3.18.2.20)
Note 1 to entry: The project management process groups include initiating processes (3.18.2.11), planning processes, executing processes, monitoring and controlling processes, and closing processes. Project management process groups are not project phases (3.18.2.7).
[SOURCE: ISO/IEC/IEEE 24765, 3.3173]
output
aggregated tangible or intangible deliverables (3.18.2.21) that form the project result
[SOURCE: ISO 21502, 3.14]
deliverable
unique and verifiable element that is required to be produced by a project (3.18.2.1)
[SOURCE: ISO 21502, 3.9]
control
comparison of actual performance with planned performance, analysing variances and taking appropriate corrective and/or preventive action as needed
[SOURCE: ISO 21506, 3.13, modified — “and/or” replaced “and”.]
data consistency
adherence to uniform and standardized guidelines (3.18.1.8) and criteria for annotation (3.2.6) across the entire corpus (3.18.1.1), ensuring that all annotated elements follow the same rules and conventions, which facilitates reliable and reproducible analysis
stakeholder
person, group or organization that has interests in, or can affect, be affected by, or perceive itself to be affected by, any aspect of the project (3.18.2.1), programme or portfolio
[SOURCE: ISO 21502, 3.27]
[1] ISO 10241-1:2011, Terminological entries in standards — Part 1: General requirements and examples of presentation
[2] Bussmann, H. (1996) Routledge dictionary of language and linguistics. London: Routledge.
[3] ISO 19104:2016, Geographic information — Terminology
[4] UD Guidelines (Universal Dependencies). (n.d.). https://universaldependencies.org/guidelines.html
[5] ISO 1087:2019, Terminology work and terminology science — Vocabulary
[6] ISO 15924, Information and documentation — Codes for the representation of names of scripts
[7] ISO/IEC 10646:2020, Information technology — Universal coded character set (UCS)
[8] ISO 24610-1, Language resource management — Feature structures — Part 1: Feature structure representation
[9] ISO 24610-1:2006, Language resource management — Feature structures — Part 1: Feature structure representation
[10] CLAWS-7 tagset. Available at: http://ucrel.lancaster.ac.uk/claws7tags.html
[11] EAGLES guidelines. Available at: https://www.ilc.cnr.it/EAGLES96/browse.html
[12] NKJP tagset. Available at: http://nkjp.pl/poliqarp/help/ense2.html
[13] ISO 15919:2001, Information and documentation — Transliteration of Devanagari and related Indic scripts into Latin characters
[14] Time Ontology in OWL. (2022, November 15). https://www.w3.org/TR/owl-time/
[15] Princeton University "About WordNet." WordNet. Princeton University. 2010. https://wordnet.princeton.edu/
[16] Bunt, H. (1985). Mass terms and model-theoretic semantics. Cambridge University Press.
[17] Hobbs, J. and Pan, F. (2004). An ontology of time for the semantic web. TALIP Special Issue on Spatial and Temporal Information Processing 3 (1) (2004), pp. 66-85
[18] Pustejovsky, J., Saurí, R., Setzer, A. and Ingria, B. (2004). TimeML Annotation Guidelines 1.2, unpublished
[19] ISO 24617-2, Language resource management — Semantic annotation framework (SemAF) — Part 2: Dialogue acts
[20] Goffman E., (1963) Behavior in Public Places. New York: Basic Books
[21] Ahn R. (2001) Agents, Objects, and Events: A computational approach to knowledge, observation, and communication. PhD Thesis, Eindhoven University of Technology.
[22] Speech Act. (2017, February 2). Glossary of Linguistic Terms. https://glossary.sil.org/term/speech-act
[23] Austin J.L, (1962) How to do things with words. Clarendon Press, Oxford, UK.
[24] Bales R.F., (1951) Interaction process analysis: a method for the study of small groups. Addison-Wesley, Cambridge.
[25] Bunt H.C., Palmer M.S. 2013, Conceptual and representational choices in defining an ISO standard for semantic role annotation, In the Proceedings of the ACL-ISO Workshop on Interoperable Semantic Annotation (ISA-9) held in conjunction with the International Workshop on Computational Semantics, Potsdam, Germany, March, 2013.
[26] Kleiber Georges, Patry Richard, Ménard Nathan, 1993. “Anaphore Associative: Dans Quel Sens «roule»-t-Elle?” Revue Québécoise de Linguistique 22 (2): 139–162.
[27] Bunt Harry, Gilmartin Emer, Keizer Simon, Pelachaud Catherine, Petukhova Volha, Prevot Laurent, Theune Mariet, 2018. “Downward Compatible Revision of Dialogue Annotation.” In, 21–34. Santa Fé (New Mexico), USA.
[28] ISO/IEC Guide 99:2007, International vocabulary of metrology — Basic and general concepts and associated terms (VIM)
[29] Abbott, B. (2004) Definiteness and indefiniteness. In: Horn, L., Ward, G. (eds.) Handbook of Pragmatics. Oxford: Blackwell, pp. 122–149.
[30] Winter, Y. Ruys, E. (2011) Scope ambiguities in formal syntax and semantics. In: Gabbay, D., Guenthner, F. (eds.) Handbook of Philosophical Logic (2nd edition). Springer.
[31] Bunt H. (2023). The compositional semantics of QuantML annotations. In: Proceedings 19th Joint ACL-ISO Workshop on Interoperable Semantic Annotation (ISA-19), Nancy, France, pp. 3–13.
[32] ISO 24617-7:2020, Language resource management — Semantic annotation framework — Part 7: Spatial information
[33] ISO 30042:2019, Management of terminology resources — TermBase eXchange (TBX)
[34] Levin, B. English Verb Classes and Alternations: A Preliminary Investigation. University of Chicago Press, 1993.
[35] Sanfilippo, A., Ananiadou, S., Gaizauskas, R., Saint-Dizier, P., Vossen, P., Bel, N., Bontcheva, K. et al. EAGLES LE3-4244 Preliminary Recommendations on Lexical Semantic Encoding Final Report, 1999.
[36] Fielding. R., et al. Hypertext Transfer Protocol — HTTP/1.1, IETF RFC 2616, June 1999.
[37] IETF RFC 3986:2005, Uniform Resource Identifier (URI): Generic Syntax
[38] González R., Suarez Araújo C.P., eds. Proceedings of the 3rd International Conference on Language Resources and Evaluation. Paris: European Language Resource Association. pp. 1321-1326, 2002.
[39] CLARIN Concept Registry. Available at https://www.clarin.eu/ccr/
[40] IETF BCP 47, Tags for Identifying Languages
[41] IETF RFC 6838:2013, Media Type Specifications and Registration Procedures
[42] Dublin Core Metadata Initiative (DCMI). Terminology. http://www.ukoln.ac.uk/metadata/dcmi/abstract-model/2004-12-08/#sect-7
[43] ISO 24622-1, Language resource management — Component Metadata Infrastructure (CMDI) — Part 1: The Component Metadata Model
[44] XML Schema Part 1: Structures Second Edition. (n.d.). https://www.w3.org/TR/xmlschema-1/
[45] W3C XSD, W3C XML Schema Definition Language (XSD) 1.1 Part 1: Structures Gao S., Sperberg-McQueen C. M, Thompson H. S., (eds.), W3C Recommendation 5 April 2012. Available at https://www.w3.org/TR/xmlschema11-1/
[46] ISO 24623-2, Language resource management — Corpus query lingua franca (CQLF) — Part 2: Ontology
[47] ISO 24495-1:2023, Plain language — Part 1: Governing principles and guidelines
[48] ISO 24183:2024, Technical communication — Vocabulary
[49] Gutehrlé N., Atanassova I., Cardey S., Langue contrôlée pour un système de messages et alertes dans un environnement de mobilité: gestion de l’ambiguïté, international workshop FUTURMOB-17, 5-7 September 2017, Montbéliard, France.
[50] Saussure, F. de. Cours de linguistique générale, 1922. Course of General Linguistics, translated and annotated by Roy Harris, 1990. London: Duckworth.
[51] Chandler, D. Semiotics: The Basics. New York: Routledge, 2017.
[52] Official Journal of the European Union Regulation (EU) 2016/679 OF THE EUROPEAN PARLIAMENT AND OF THE COUNCIL of 27 April 2016, General Data Protection Regulation Available at [last viewed 2020-04-22]: https://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=CELEX:32016R0679&from=EN#d1e1374-1-1
[53] Leech G. Adding Linguistic Annotation [online]. In: Wynne, M. (ed.) Developing Linguistic Corpora: A Guide to Good Practice. Oxford: Oxbow Books, 2005 Available at [accessed 2025-03-11]: https://llds.ling-phil.ox.ac.uk/guides/dlc/chapter2.html
[54] ISO/IEC/IEEE 24765:2017, Systems and software engineering — Vocabulary
[55] ISO 21502:2020, Project, programme and portfolio management — Guidance on project management
[56] ISO 22886:2020, Healthcare organization management — Vocabulary
[57] ISO 13053-2:2011, Quantitative methods in process improvement — Six Sigma — Part 2: Tools and techniques
[58] ISO 21506:2024, Project, programme and portfolio management — Vocabulary
A
abbreviated form 3.1.21
abbreviation 3.1.21
abstract resource 3.11.1.1.3
actionable identifier 3.11.2.1.2.1
activity 3.18.2.8
addressee 3.7.2.3.2
adjunct 3.6.6
admissibility constraint 3.3.16
admissible feature 3.3.15
admissible feature value 3.3.1.3
admissible value 3.3.1.3
ADN 3.1.9.1.3
adnoun 3.1.9.1.3
adornment 3.10.1
affix 3.1.8.2.1
affordance 3.7.8.7
affordance structure 3.7.8.7
agglutination 3.1.18
ALINK 3.7.1.9
allo-feedback act 3.7.2.7.1.1
alternation 3.3.31
anaphor 3.7.7.4
anchor 3.5.4
annotate 3.2.5
annotation 3.2.6
annotation 3.2.7
annotation document 3.5.2
annotation layer 3.18.1.4
annotation scheme 3.18.1.3
annotation structure 3.7.11.1
annotation tier 3.11.1.11
annotation unit 3.18.1.5
appropriate feature 3.3.15
archive 3.11.1.3.1
archiving institution 3.11.3.3
argument 3.7.3.1
artificial language 3.16.1.5
assimilation 3.16.26
atomic type 3.3.22.3
atomic value 3.3.1.1
attribute-value matrix 3.3.8
authoring 3.16.13
auto-feedback act 3.7.2.7.1.2
AVM 3.3.8
B
bag 3.3.3
base form 3.1.10
base quantity 3.7.9.1.1
base type 3.3.22.2
base unit 3.7.9.4.1
basic principles and methodology for stylistic guidelines 3.16.7
beginning 3.7.1.7.1
borrowing 3.1.22
bound morpheme 3.1.8.2
boxed label 3.3.19
BSG 3.16.7
built-in 3.3.17
bunsetsu 3.1.25.1
C
canonical form 3.1.10
cardinality 3.12.2.9
CCSL 3.12.3.18
character 3.1.33
character span 3.13.7
character span containment 3.13.8
chunk 3.6.2.1
circumstance 3.7.4.4
citation 3.11.1.9
class 3.7.4.6
clause 3.1.26
client application 3.11.3.4
closed vocabulary 3.12.2.14
closed vocabulary 3.12.2.16
CMD attribute 3.12.3.5
CMD component 3.12.3.3
CMD component registry 3.12.3.8
CMD element 3.12.3.4
CMD instance 3.12.3.9
CMD instance envelope 3.12.3.10
CMD instance header 3.12.3.11
CMD instance payload 3.12.3.12
CMD model 3.12.3.2
CMD profile 3.12.3.13
CMD profile schema 3.12.3.14
CMD record 3.12.3.9
CMD resource proxy 3.12.3.15
CMD resource reference 3.12.3.15
CMD root component 3.12.3.3.1
CMD specification 3.12.3.6
CMD specification header 3.12.3.7
CMDI 3.12.3.1
CMDI component specification language 3.12.3.18
CMDI file 3.12.3.9
CMDI instance 3.12.3.9
CNL 3.16.2
cognate 3.9.9
collection 3.3.2
collection 3.11.1.4
collection 3.12.1.8
communicative function 3.7.2.7.11
communicative segment 3.7.7.1
complex resource 3.11.1.1.2
complex value 3.3.1.2
component 3.12.3.3
component definition 3.12.3.6
component header 3.12.3.7
component metadata infrastructure 3.12.3.1
component metadata model 3.12.3.2
component registry 3.12.1.1.1
component registry 3.12.3.8
component specification 3.12.3.6
compound 3.1.9.1.1
compounding 3.1.20
comprehension 3.16.11
computational tractability 3.16.20
concatenation 3.3.32
concept 3.12.1.3
concept link 3.12.1.6.1
concept reference 3.12.1.6
concept registry 3.12.1.2.1
concurrent annotations 3.13.6
constituent 3.6.2
constraint 3.3.18
content 3.16.10
content management 3.16.12
context 3.7.2.7.10
control 3.18.2.22
controlled language 3.16.2
controlled natural language 3.16.2
controlled vocabulary 3.12.2.14
controlled vocabulary 3.16.22
cooperative work 3.16.18
coreference 3.7.7.5
corpus 3.18.1.1
corpus annotation 3.18.1.2
corpus annotation project 3.18.1.6
corpus query language 3.13.1
CQL 3.13.1
CQL capability 3.13.11
CQLF class 3.13.3
CQLF implementation 3.13.2
CQLF level 3.13.4
CQLF module 3.13.5
CQLF ontology 3.13.12
CV 3.16.22
D
DAG 3.4.2
data category 3.9.2
data consistency 3.18.2.23
data subject 3.17.4
data validation 3.18.2.11.1
DC 3.9.2
dcl 3.7.5.4
default value 3.3.1.4
definite description 3.7.10.3
definiteness 3.7.10.2
deliverable 3.18.2.21
dependency 3.6.4
dependency annotation 3.2.7.4
dependency relation 3.6.4
dependent annotation 3.15.7
dereference 3.11.4.3
derivation 3.1.19
derived quantity 3.7.9.1.2
derived unit 3.7.9.4.2
determinacy 3.7.10.4
dialogue 3.7.2.1
dialogue act 3.7.2.7
digital archive 3.11.1.3.1
digital identifier 3.11.2.1
digital repository 3.11.1.3
digraph 3.4.2
dimension 3.7.2.7.2
directed acyclic graph 3.4.2
directional relation 3.7.5.9
discourse 3.7.4.1
discourse connective 3.7.6.2
discourse entity 3.7.7.2
discourse relation 3.7.2.7.7
discourse relation 3.7.6.3
discourse structure 3.7.4.2
distinctive feature 3.16.25
distribution 3.7.10.5
distributivity 3.7.10.5
document creation location 3.7.5.4
domain 3.6.8
dynamic path 3.7.5.3.2.1
dynamic route 3.7.5.3.2.1
E
edge 3.5.5.2
edge 3.6.1.3
eigenplace 3.7.11.6
eigenspace 3.7.11.6
element definition 3.12.3.4
element name 3.7.5.17
empty feature structure 3.3.7.1
end 3.7.1.7.2
end user 3.13.9
ending 3.1.8.2.1.1
entity 3.7.3.4
eojeol 3.1.9.1.4
etymologizable 3.9.7
etymology 3.9.6
etymon 3.9.8
event 3.7.1.1
event set 3.7.10.1
event-path 3.7.5.3.2.1
eventuality 3.7.1.1
eventuality frame 3.7.3.5
eventuality modifier 3.7.3.6
exhaustivity 3.7.10.6
extension 3.3.13
extent 3.7.5.16
F
feature 3.3.5
feature admissibility constraint 3.3.16
feature specification 3.3.6
feature structure 3.3.7
feature system 3.3.24.1.1
feature system declaration 3.3.25
feature value 3.3.1
feedback act 3.7.2.7.1
feedback dependence relation 3.7.2.7.6
figure 3.7.5.7
finite state automata 3.4.1
first-order logic 3.7.11.4
foreign attribute 3.12.4.4.1
formal language 3.16.1.3
fragment 3.11.1.7
fragment identifier 3.11.2.1.3
frame 3.13.14
free morpheme 3.1.8.1
FSA 3.4.1
FSD 3.3.25
functional dependence relation 3.7.2.7.4
functional segment 3.7.2.7.3
functionality 3.13.13
G
GA 3.7.8.7.1
genericity 3.7.10.7
Gibsonian affordance 3.7.8.7.1
grammatical category 3.1.23
grammatical feature 3.9.3
grammatical function 3.6.3
graph 3.5.5
graph 3.6.1
graph notation 3.3.9
grapheme 3.1.31
graphic character 3.1.33
ground 3.7.5.8
guideline 3.18.1.8
H
habitat 3.7.8.4
head 3.1.25.2.1
head 3.6.2.3
hierarchical annotation 3.2.7.5
homograph 3.1.32
homophone 3.1.3
HTTP resolver proxy 3.11.3.8
hypernode 3.8.2
I
identifiable natural person 3.17.4
identifier 3.11.2.1
IE 3.7.12.1
implicational constraint 3.3.18.1
incarnation 3.11.1.5
incompatibility 3.3.11
indicant 3.17.3
individuation 3.7.10.9
inflected form 3.1.14
inflection 3.1.6.1
information content 3.16.10
information extraction 3.7.12.1
information state 3.7.2.7.10
inline CMD component 3.12.3.3.2
inline code 3.10.2
instant 3.7.1.7
intension 3.17.2
interference 3.16.27
internal part 3.11.1.6.2
internationalization 3.16.5
interoperability 3.16.21
interpretation 3.3.13.1
interpretation 3.7.11.3
inverse linking 3.7.10.13
K
keyword 3.16.8
L
label 3.6.9
landmark 3.7.5.8
language 3.16.1
language resource 3.11.1.1.1
language tag 3.12.1.5
layer 3.13.16
lemma 3.1.10
lemmatization 3.1.11
lemmatized form 3.1.10
lexeme 3.1.9
lexical database 3.2.1
lexical entry 3.2.2
lexical item 3.2.3
lexical resource 3.2.1
lexicalization 3.14.3
lexicon 3.2.1.1
linguistic annotation 3.2.7.2
linguistic structure 3.1.1
localization 3.16.6
location 3.7.5.3
logical form 3.7.11.2
low-level discourse structure 3.7.6.4
M
main clause 3.1.26.1
malmaldi 3.1.9.1.4
markable 3.7.1.8
markup language of measurable quantitative information 3.7.9.3
mass term 3.7.10.12
matrix notation 3.3.8
measurable quantitative information 3.7.9.2.1
measurable quantitative information extraction 3.7.12.2
measurable quantitative information markup language 3.7.9.3
measure 3.7.5.10
measure relation 3.7.5.11
measure word 3.1.23.1
measurement unit 3.7.9.4
media type 3.12.1.7
merge 3.3.27
MES 3.7.8.3
metadata 3.12.2.1
metadata 3.12.2.2.1
metadata component 3.12.2.6
metadata component registry 3.12.1.1.1
metadata description 3.12.2.2.1
metadata editor 3.12.2.10
metadata element 3.12.2.4
metadata element set 3.12.2.5
metadata element value scheme 3.12.2.8
metadata instance 3.12.3.9
metadata modeler 3.12.2.11
metadata profile 3.12.2.7
metadata provider 3.12.2.12
metadata provider 3.12.2.13
metadata record 3.12.2.2.1
metadata record 3.12.3.9
metadata schema 3.12.2.3
metadata set 3.12.2.5
metamodel 3.7.13.1
milestone element 3.15.8
MIME type 3.12.1.7
minimal embedding space 3.7.8.3
MLINK 3.7.1.10
model M 3.7.11.5
modifier 3.6.2.2
morph 3.1.7
morpheme 3.1.8
morpho-syntactic unit 3.1.13
morphology 3.1.6
morphosyntactic feature 3.1.15
morphosyntactic tag 3.4.3
morphosyntactic tagset 3.4.4
motion 3.7.5.12
motion-event 3.7.5.12
movement relation 3.7.5.14
mover 3.7.5.13
moving object 3.7.5.13
MQI 3.7.9.2.1
MQIE 3.7.12.2
multiset 3.3.3
multiword expression 3.1.9.2
MWE 3.1.9.2
N
namespace 3.12.4.5
natural language 3.16.1.1
natural language processing 3.9.1
negation 3.3.28
negative conformance statement 3.13.20
NL 3.16.1.1
NLP 3.9.1
node 3.5.5.1
node 3.6.1.2
non-consuming tag 3.7.5.17.1
non-locational spatial entity 3.7.5.6
non-terminal node 3.6.1.2.2
normalization 3.7.12.3
noun phrase 3.1.25.2
noun phrase head 3.1.25.2.1
NP 3.1.25.2
O
object 3.7.4.5
objectal relation 3.7.7.6
onomasiology 3.9.5
open vocabulary 3.12.2.14
open vocabulary 3.12.2.15
orientation relation 3.7.5.9
orientational relation 3.7.5.9
original artefact 3.5.1
orthographic transcription 3.15.5
orthography 3.9.4
output 3.18.2.20
P
paralinguistic feature 3.15.2
parameter 3.13.18
paronym 3.16.24
part 3.11.1.6
part of speech 3.1.23
partial order 3.3.24
partially ordered set 3.3.24
participant 3.7.2.3
participant set 3.7.10.1.1
particle 3.1.8.2.1.2
path 3.3.10
path 3.7.5.3.2
period 3.7.1.3
persistent identifier 3.11.2.1.1
personal data 3.17.5
phoneme 3.1.2
phoneme confusion 3.1.5
phonetic transcription 3.15.6
phrasal compound 3.1.9.1.2
phrase 3.1.25
PID 3.11.2.1.1
PID framework 3.11.2.2
PID resolver 3.11.3.7
place 3.7.5.2
plain language 3.16.3
point of event 3.7.1.7.3
point of reference 3.7.1.7.4
point of speech 3.7.1.6.1
point of text 3.7.1.7.5
POS 3.1.23
positive conformance statement 3.13.19
pre-editing 3.16.14
predicate 3.7.3.2
predicate argument structure 3.7.3.3
primary data 3.2.4
process 3.18.2.11
process group 3.18.2.12
processing 3.17.6
profile 3.12.3.13
profile header 3.12.3.7
project 3.18.2.1
project charter 3.18.2.3
project communications management 3.18.2.13
project cost management 3.18.2.14
project integration management 3.18.2.15
project management 3.18.2.2
project management process group 3.18.2.19
project phase 3.18.2.7
project procurement management 3.18.2.16
project quality management 3.18.2.17
project scope 3.18.2.4
project scope management 3.18.2.18
pseudonymization 3.17.7
published collection 3.11.1.4.1
Q
QI 3.7.9.2
QML 3.7.9.3
QS 3.7.8.5
qualia 3.7.8.5
qualia structure 3.7.8.5
qualifier 3.7.2.7.12
qualitative spatial relation 3.7.5.5
quantification 3.7.10.8
quantitative information 3.7.9.2
quantitative markup language 3.7.9.3
quantity 3.7.9.1
quasi-homophone 3.1.4
query expression 3.13.17
R
range restriction 3.3.1.3
re-entrancy 3.3.14
re-use 3.16.17
readability 3.16.19
record 3.12.2.2
reduplication 3.14.4
reference 3.7.7.3
reference 3.11.1.10
reference domain 3.7.10.10
reference segment 3.7.2.7.5
referent 3.7.7.2
referring expression 3.7.7.1.1
region 3.5.3
region 3.7.5.1
registry 3.12.1.1
regular expression 3.12.4.10
relational class 3.7.4.7
repository 3.11.1.3
representation 3.2.8
resolution system 3.11.3.6
resolve 3.11.4.1
resolver 3.11.3.7
resolver proxy 3.11.3.8
resource 3.11.1.1
resource 3.18.1.7
resource collection 3.12.1.8
resource collection incarnation 3.11.1.5
resource part 3.11.1.6
resource part identifier 3.11.2.1.4
resource provider 3.11.3.1
resource proxy 3.12.3.15
resource proxy reference 3.12.3.16
resource server 3.11.3.2
responsive communicative function 3.7.2.7.11.1
restrictor 3.1.25.2.2
rewriting 3.16.16
rhetorical relation 3.7.2.7.7
romanization 3.4.8.1
route 3.7.5.3.2
S
schedule management plan 3.18.2.6
schema 3.12.2.3
script 3.1.28
script conversion 3.4.7
search need 3.13.10
segment 3.7.4.3
segment 3.8.1
segmentation annotation 3.2.7.1
semantic annotation 3.2.7.3
semantic argument 3.9.11
semantic authoring 3.8.3
semantic content 3.7.2.7.8
semantic content category 3.7.2.7.9
semantic content type 3.7.2.7.9
semantic form 3.7.11.2
semantic predicate 3.9.12
semantic registry 3.12.1.2
semantic role 3.7.3.7
semantic type 3.3.26
seme 3.17.1
sender 3.7.2.3.1
sentence 3.1.27
sequential representation 3.6.10
simple annotation 3.2.7.6
simplification 3.16.15
simplified language 3.16.1.2
situation 3.7.6.1
SLINK 3.7.1.11
snapshot 3.11.1.8
source domain 3.7.10.11
space 3.7.5.3.1
spatial entity, non-locational 3.7.5.6
spatial relation 3.7.5.15
speaker 3.7.2.3.1.1
speaker role 3.7.2.4
special language 3.16.1.4
special-purpose language 3.16.1.4
speech act 3.7.2.5
SPL 3.16.1.4
spoken language 3.15.1
stakeholder 3.18.2.24
stand-off annotation 3.2.7.7
static path 3.7.5.3.2
stem 3.1.17
structure sharing 3.3.14
subcategorization frame 3.6.7
subordinate clause 3.1.26.2
subsumption 3.3.12
subtitle 3.10.3
subtype 3.3.22.1
supertype 3.3.22.2
synonym 3.16.23
syntactic argument 3.6.5
syntactic behaviour 3.9.10
syntactic edge 3.6.1.3
syntactic graph 3.6.1
syntactic head 3.6.2.3
syntactic node 3.6.1.2
syntactic tree 3.6.1.1
syntax 3.1.24
T
tag 3.7.5.17
technical communication 3.16.4
telic 3.7.8.6
telic affordance 3.7.8.7.2
temporal interval 3.7.1.3
temporal ordering relation 3.7.1.4
temporal unit 3.7.1.6
tense 3.7.1.2
term 3.12.1.4
terminal node 3.6.1.2.1
terminal part 3.11.1.6.1
text 3.16.9
time amount 3.7.1.5
TLINK 3.7.1.12
token 3.4.5
tokenization 3.4.6
topological link 3.7.5.5
tractability 3.16.20
trajectory 3.7.5.3.2.1
transcriber 3.15.4
transcription 3.1.29
transcription 3.1.30
transcription system 3.15.3
transliteration 3.4.8
turn unit 3.7.2.6
type 3.3.22
type 3.3.26
type declaration 3.3.23
type hierarchy 3.3.24.1
typed feature structure 3.3.7.2
typing 3.3.33
U
UML 3.12.2.17
underspecification 3.3.4
unification 3.3.29
Unified Modeling Language 3.12.2.17
Uniform Resource Identifier 3.11.2.1.2
union 3.3.30
unit 3.7.9.4
unit of measurement 3.7.9.4
URI 3.11.2.1.2
URI naming scheme 3.11.2.3
urlify an identifier 3.11.4.2
use case 3.13.15
utterance 3.7.2.2
V
valence 3.6.7
valency 3.6.7
validity 3.3.21
value 3.3.1
value restriction 3.3.1.3
value scheme 3.12.2.8
value scheme 3.12.3.17
version 3.11.1.2
vertex 3.5.5.1
voxeme 3.7.8.2
voxicon 3.7.8.1
W
WBS 3.18.2.5
web client 3.11.3.5
well-formedness 3.3.20
word 3.1.9.1
word class 3.1.23
word compound 3.1.9.1.1.1
word form 3.1.13
word lattice 3.4.9
word segmentation 3.14.1
word segmentation unit 3.14.2
word sense 3.1.12
word structure 3.1.16
word-formation 3.1.6.2
work breakdown structure 3.18.2.5
work package 3.18.2.9
work package leader 3.18.2.10
work package team leader 3.18.2.10
working language 3.10.4
WSU 3.14.2
X
XML 3.12.4.1
XML attribute 3.12.4.4
XML attribute declaration 3.12.4.7
XML container element 3.12.4.3.1
XML document 3.12.4.2
XML element 3.12.4.3
XML element declaration 3.12.4.8
XML namespace 3.12.4.5
XML Schema 3.12.4.6
XML schema datatype 3.12.4.9
XML Schema document 3.12.4.6
1) GoogleDoc is an example of a suitable product available commercially. This information is given for the convenience of users of this document and does not constitute an endorsement by ISO of this product.
