ISO/DIS 9515

ISO/DIS 9515: Language resource management — Vocabulary

ISO/TC 37/SC 4

Secretariat: KATS

Date: 2025-12-18

Language resource management — Vocabulary

Voting begins on: 2026-02-13 Voting terminates on: 2026-05-08

Gestion des ressources linguistiques — Vocabulaire

DIS stage

All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below or ISO’s member body in the country of the requester.

ISO copyright office

CP 401 • Ch. de Blandonnet 8

CH-1214 Vernier, Geneva

Phone: + 41 22 749 01 11

E-mail: copyright@iso.org

Website: www.iso.org

Published in Switzerland

Contents

Foreword

Introduction

1 Scope

2 Normative references

3 Terms and definitions

3.1 General Linguistic Terms

3.2 Language Resource Management

3.3 Feature Structures

3.4 Morphosyntactic Annotation Framework (MAF)

3.5 Linguistic Annotation Framework (LAF)

3.6 Syntactic Annotation Framework (SynAF)

3.7 Semantic Annotation Framework (SemAF)

3.8 Comprehensive Annotation Framework (ComAF)

3.9 Lexical Markup Framework (LMF)

3.10 Multilingual Information Framework

3.11 Persistent Identification and Sustainable Access (PISA)

3.12 Infrastructure for Component Metadata

3.13 Corpus Query Lingua Franca (CQLF)

3.14 Word Segmentation of Written Texts

3.15 Transcription of Spoken Language

3.16 Controlled Natural Language (CNL) / Controlled Human Communication (CHC)

3.17 Lexico-Morpho-Syntactic Principles and Methodology for Personal Data Recognition and Protection in Text

3.18 Corpus Annotation Project Management

Bibliography

Index

Foreword

ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies (ISO member bodies). The work of preparing International Standards is normally carried out through ISO technical committees. Each member body interested in a subject for which a technical committee has been established has the right to be represented on that committee. International organizations, governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.

The procedures used to develop this document and those intended for its further maintenance are described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types of ISO documents should be noted. This document was drafted in accordance with the editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).

Attention is drawn to the possibility that some of the elements of this document may be the subject of patent rights. ISO shall not be held responsible for identifying any or all such patent rights. Details of any patent rights identified during the development of the document will be in the Introduction and/or on the ISO list of patent declarations received (see www.iso.org/patents).

Any trade name used in this document is information given for the convenience of users and does not constitute an endorsement.

For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions related to conformity assessment, as well as information about ISO's adherence to the World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT), see www.iso.org/iso/foreword.html.

This document was prepared by Technical Committee ISO/TC 37, Language and terminology, Subcommittee SC 4, Language resource management.

Any feedback or questions on this document should be directed to the user’s national standards body. A complete listing of these bodies can be found at www.iso.org/members.html.

Introduction

The main purpose of this document is to provide a systematic description of the concepts related to language resources and language resource management and to clarify the use of the terms in this field.

This document is addressed to anyone concerned with language resource management and in particular users of the standards published by ISO/TC 37/SC 4.

The layout follows the directions given in ISO 10241-1. Thus, the elements of an entry appear in the following order:

entry number;
preferred term(s);
admitted term(s);
abbreviated form(s);
definition;
example(s);
note(s).

Language resource management — Vocabulary

1 Scope

This document provides the terms and definitions for the standards of ISO/TC 37/SC 4 Language resource management.

2 Normative references

There are no normative references in this document.

3 Terms and definitions

The terminological entries in this document are presented in a mixed order under headings reflecting the subjects covered by the work of ISO/TC 37/SC 4. Systematic order is applied where possible in such a way that concepts related hierarchically are listed coherently with their entry numbers reflecting the positions in the concept system followed by concepts related associatively. Where there are concept relations across subjects, allocation of a concept to the respective subject is preferred to the coherent display of concept relations/concept systems. Concept relations are indicated by cross-references throughout this document.

ISO and IEC maintain terminology databases for use in standardization at the following addresses:

ISO Online browsing platform: available at https://www.iso.org/obp
IEC Electropedia: available at https://www.electropedia.org/

3.1 General Linguistic Terms

linguistic structure

composition of a language (3.16.1) at the level of sound, word (3.1.9.1), phrase (3.1.25), sentence (3.1.27), meaning, and discourse (3.7.4.1)

Note 1 to entry: The science of language is understood to consist of phonology (sound), morphology (3.1.6) (word units), syntax (3.1.24) (sentential structure), semantics (meaning, information), and pragmatics (discourse, context).

phoneme

smallest sound unit that can be segmented from the acoustic flow of speech and that can function as a semantically distinctive unit

EXAMPLE For English, examples for phonemes are /g/ and /k/ as in gap: cap or /m/ and /t/ in map: tap; these word pairs constitute minimal pairs in English.

[SOURCE: Reference [2]]

homophone

one of two or more words (3.1.9.1) that are pronounced the same but differ in meaning and sometimes in spelling

[SOURCE: ISO 19104, 4.15, modified — Note 1 to entry has been removed.]

quasi-homophone

word (3.1.9.1) which differs from another by one or two phonemes (3.1.2)

Note 1 to entry: There can be one phoneme more or less in one of the two quasi-homophones (e.g.: aft-after), one different phoneme (e.g.: check-deck, feed-feet), or 2 different phonemes (e.g.: flap-slat).
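
As a rough illustration of Note 1, the check below treats two words as quasi-homophones when their phoneme sequences are one or two edits apart (insertion, deletion or substitution of a phoneme). This is only a sketch under assumptions: the function names are hypothetical, the edit-distance criterion is slightly broader than the three cases listed in the note, and the phoneme transcriptions are illustrative rather than normative.

```python
from typing import Sequence


def phoneme_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance computed over phonemes instead of characters."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, 1):
        curr = [i]
        for j, pb in enumerate(b, 1):
            cost = 0 if pa == pb else 1
            curr.append(min(prev[j] + 1,          # delete a phoneme
                            curr[j - 1] + 1,      # insert a phoneme
                            prev[j - 1] + cost))  # substitute a phoneme
        prev = curr
    return prev[-1]


def is_quasi_homophone(a: Sequence[str], b: Sequence[str]) -> bool:
    """Assumed criterion: the sequences differ by one or two phonemes."""
    return phoneme_distance(a, b) in (1, 2)


# Illustrative transcriptions (assumptions, one list element per phoneme).
CHECK, DECK = ["tʃ", "ɛ", "k"], ["d", "ɛ", "k"]
FLAP, SLAT = ["f", "l", "æ", "p"], ["s", "l", "æ", "t"]
AFT, AFTER = ["æ", "f", "t"], ["æ", "f", "t", "ɚ"]

for pair in ((CHECK, DECK), (FLAP, SLAT), (AFT, AFTER)):
    print(pair, is_quasi_homophone(*pair))
```

Working on phoneme lists rather than spellings matters here, because a single phoneme such as /tʃ/ can correspond to several letters.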

phoneme confusion

confusion due to a phoneme (3.1.2) approximately or incorrectly pronounced, and interpreted as another phoneme according to the mother tongue of the receptor

Note 1 to entry: Phonemes can exist in one language (3.16.1) and not in other languages.

Note 2 to entry: Phonemes can be pronounced (spoken, emitted) with multiple accentuations, and be perceived differently by recipients (listeners) not necessarily receptive to the same phonetic and phonological systems.

morphology

description of the structure and formation of words (3.1.9.1)

Note 1 to entry: Morphology is traditionally divided into:

word-formation (3.1.6.2) dealing with the formation of complex lexemes (3.1.9) out of simpler lexemes: by means of derivation (3.1.19) (often signalled by affixation (3.1.8.2.1), i.e. addition of a morpheme (3.1.8)) or by means of compounding (3.1.20) (combining two or more lexemes);
inflection (3.1.6.1) that creates inflected forms (3.1.14).

inflection

branch of morphology (3.1.6), dealing with contextual realizations of lexemes (3.1.9) as inflected forms (3.1.14)

Note 1 to entry: Inflection is a grammatical rather than lexical process.

word-formation

branch of morphology (3.1.6), dealing with the creation of new lexemes (3.1.9) by the processes of derivation (3.1.19) or compounding (3.1.20)

morph

surface form represented by a unique morpheme (3.1.8)

EXAMPLE In English, the morphs of the plural morpheme “-s” include “-s”, “-en”, and “-NULL” (as in “boys”, “oxen”, and “sheep”), where “-NULL” has no unique surface form. Thus, the word “boys” consists of the two morphs, “boy” and “-s”, whereas the morphemes corresponding to the morphs “ox” and “-en” are “ox” and “-s”, respectively.

morpheme

exponent that signals a modification of a lexeme (3.1.9)

Note 1 to entry: There are two sub-types of morphemes: free morphemes (3.1.8.1) and bound morphemes (3.1.8.2).

Note 2 to entry: This definition adheres to a lexeme-based approach to morphology where it is the lexeme, not the morpheme, that encodes the linguistic sign. On this approach, the morpheme is a unit of form (an exponent) that marks various kinds of modifications (e.g. derivation or inflection) of a lexeme.

free morpheme

morpheme (3.1.8) that can be used as a word (3.1.9.1) by itself

EXAMPLE Given the word “goodness,” “good” is a free morpheme, whereas “-ness” is not. The latter is a bound morpheme (3.1.8.2).

bound morpheme

morpheme (3.1.8) that appears only together with one or several other morphemes

EXAMPLE 1 Chinese: 伟 means “great,” but cannot stand by itself as a word in text (3.16.9). Instead, it is used as a constituent element of many words, such as 伟大 (“great”), 伟人 (“giant”), and 雄伟 (“majesty”).

EXAMPLE 2 Korean: the suffix “-e”, which is equivalent to the English preposition “to” — as in “hakkyo-e” (to school) — is a bound morpheme.

affix

bound morpheme (3.1.8.2) which may be added to a stem (3.1.17) or a lexeme (3.1.9)

Note 1 to entry: Affixes can be classified into several sub-types such as prefix, suffix, infix and circumfix. Affixes can be derivational or they can be inflectional or agglutinative.

ending

〈Japanese text〉 agglutinative affix (3.1.8.2.1) of a verb or adjective

Note 1 to entry: Verbs and adjectives end with agglutinative forms, called “endings”. These endings may be a negative form, an adverbial form, a base form, an adnominal form, an assumption form or an imperative form.

particle

〈Japanese text〉 grammatical affix (3.1.8.2.1) agglutinated mostly to nominal forms but sometimes to other free-standing lexical items (3.2.3)

Note 1 to entry: The grammatical category particle can be treated as a part of speech (3.1.23).

EXAMPLE The noun phrase 学校へ(gakkoue) is analysed into a noun 学校 (gakkou) and a particle へ(e). The verb phrase 寒いね (samuine, ‘It is very cold, isn't it?’) is analysed into a verb 寒い (samui) and a particle ね(ne) which corresponds to the tag ‘isn't it?’.

lexeme

abstract, fundamental unit in the lexicon (3.2.1.1) of a language, comprising semantic, formal (phonetic and/or graphemic) and grammatical information

Note 1 to entry: A complex lexeme is the result of word-formation (derivation or compounding) processes; a simple lexeme can be thought of as the base for such processes. In a lexical entry, a lexeme is identified by a lemma (3.1.10). Word forms (3.1.13) are results of the interaction of lexemes with the grammatical system of the given language.

word

lexeme (3.1.9) that has, as a minimal property, a part of speech (3.1.23)

compound

word (3.1.9.1) built from two or more lexemes (3.1.9)

Note 1 to entry: A compound may be endocentric if it has a head (i.e. the fundamental part that contains the basic meaning of the whole compound) and modifiers (which restrict this meaning), or exocentric if it does not have a head. A compound can be long. There are two main sub-types of compound according to their degree of lexicalization (3.14.3): word compound (3.1.9.1.1.1) and phrasal compound (3.1.9.1.2).

word compound

compound (3.1.9.1.1) whose overall meaning is not totally predictable from its constituent parts

EXAMPLE “Hotdog,” “ice-cream,” “blackboard”.

phrasal compound

word (3.1.9.1) consisting of two or more lexemes (3.1.9), the meaning of which is predictable from its constituent elements

EXAMPLE “Apple pie” in English is a phrasal compound composed of two lexemes, “apple” and “pie”, whose meanings are preserved in the meaning of the compound.

Note 1 to entry: Idioms use two or more lexical items (3.2.3), but do not compose a phrasal compound.

Note 2 to entry: A phrasal compound might be thought of as phrases (3.1.25) by some linguists. In practice, however, there is not always a clear distinction between a word compound (3.1.9.1.1.1) and a phrasal compound, or between a phrasal compound and a phrase, due to the fuzziness of semantic predictability and the degree of lexicalization (3.14.3). Lexico-statistics — word frequency in particular — will play an important role in this respect.

adnoun

ADN

non-conjugating word (3.1.9.1) that modifies a noun

Note 1 to entry: Adnouns modify nouns, as adverbs modify verbs.

EXAMPLE 1	<Japanese>
	a. あらゆる国
	arayuru kuni
	ADN N
	‘every country’
	b. 好きな花
	suki+na hana
	ADNst+SX N
	‘favourite flower’
EXAMPLE 2	<Korean>
	a. 새 옷
	sae ot
	ADN noun
	‘new clothes’
	b. 빨간 옷
	bbalga+n ot
	ADJst+GX N
	‘red clothes’

eojeol

malmaldi

〈Korean text〉 word (3.1.9.1) or its variant word form (3.1.13) agglutinated with grammatical affixes (3.1.8.2.1)

Note 1 to entry: White space (space between characters) helps to segment text (3.16.9) into eojeols.

EXAMPLE		내가 사과를 먹었다
	nae+ga sagwa+reul meok+eot+da
	pronoun+GX noun+GX Vst+GX+GX
	‘I’+SBJ ‘apple’+OBJ ‘eat’+PST+DCL = ‘I ate (an) apple’

Note 2 to entry: This sentence consists of three eojeols: 내가, 사과를 and 먹었다, each of which is separated by white space. The acronyms GX, SBJ, OBJ, PST and DCL in the example above stand for grammatical affix, subject, object, past tense and declarative sentential type, respectively. The pronoun 내 is a variant form of the pronoun 나 referring to the speaker. 먹었다 is an eojeol and at the same time is a word form agglutinated with two grammatical affixes 었 and 다 to a verb stem (3.1.17) 먹.
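
The segmentation described in Notes 1 and 2 can be sketched as follows. The code assumes that white space alone delimits eojeols, which is a simplification, and the table of internal analyses is written by hand from the example above; the names `split_eojeols` and `EXAMPLE_ANALYSES` are hypothetical.

```python
def split_eojeols(text: str) -> list[str]:
    """Segment Korean text into eojeols, assuming white space is the only delimiter."""
    return text.split()


# Hand-written analyses taken from the example: stem or pronoun first,
# followed by the agglutinated grammatical affixes (GX).
EXAMPLE_ANALYSES = {
    "내가": ["내", "가"],          # pronoun + GX (subject)
    "사과를": ["사과", "를"],      # noun + GX (object)
    "먹었다": ["먹", "었", "다"],  # verb stem + GX (past) + GX (declarative)
}

sentence = "내가 사과를 먹었다"
for eojeol in split_eojeols(sentence):
    print(eojeol, "=", "+".join(EXAMPLE_ANALYSES[eojeol]))
```

A real morphological analyser would be needed to produce the internal analyses; the lookup table only stands in for one.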

multiword expression

MWE

lexeme (3.1.9) made up of a sequence of two or more lexemes that has properties that are not necessarily predictable from the properties of the individual lexemes or their normal mode of combination

EXAMPLE “To kick the bucket”, an idiomatic expression which means to die rather than to hit a bucket with one's foot. An idiomatic expression is a subtype of MWE whose properties are not predictable from the properties of the individual lexemes.

Note 1 to entry: An MWE can be a compound (3.1.9.1.1), a fragment of a sentence (3.1.27), or a sentence. The group of lexemes making up an MWE can be continuous or discontinuous. It is not always possible to mark an MWE with a part of speech (3.1.23).

lemma

lemmatized form

canonical form

base form

conventional form chosen to represent a lexeme (3.1.9)

Note 1 to entry: In European languages, the lemma is usually the singular if there is a variation in number, the masculine form if there is a variation in gender, and the infinitive for all verbs. In some languages, certain nouns are defective in the singular form; in these cases, the plural is chosen. For verbs in Arabic, the lemma is usually deemed to be the third person singular with the accomplished aspect.

Note 2 to entry: The term “lemma” is most often used in the context of corpora, as a device to capture the identity of tokens (3.4.5) and establish basic correspondence between a token and a lexical entry (3.2.2). The term that corresponds to lemma in the context of lexicons (3.2.1.1) is “headword”. Mismatches between the two are possible due to the varying macro- and microstructure of lexical entries. In order to handle such mismatches, apart from lemmas, direct references to dictionary entries are sometimes added to tokens or word forms (3.1.13) in corpora.

lemmatization

process of determining the lemma (3.1.10) for a given word form (3.1.13) in a context

EXAMPLE Given the word “found” in English, lemmatization results in “find” as its lemma.
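
A minimal sketch of lemmatization as a table lookup is given below. It is an assumption-laden toy: the table `LEMMA_TABLE` and the function `lemmatize` are hypothetical, and the lookup ignores context, whereas the definition above makes the lemma depend on the context (for instance, “found” can also be the base form of a different verb).

```python
from typing import Optional

# Toy lookup table covering the English word forms of "find".
LEMMA_TABLE = {
    "find": "find",
    "finds": "find",
    "found": "find",
    "finding": "find",
}


def lemmatize(word_form: str) -> Optional[str]:
    """Return the lemma for a word form, or None if the form is not in the table."""
    return LEMMA_TABLE.get(word_form.lower())


print(lemmatize("found"))
```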

word sense

meaning associated with a lexeme (3.1.9) in a context

Note 1 to entry: The ‘river bank’ sense of “bank” and the ‘financial institution’ sense of “bank” are considered to be two different word senses, or lexical units, with the same word form (3.1.13), or lexeme (3.1.9). “I called him on the radio” and “Call me a taxi” are associated with different word senses of the lexeme “call”. Unrelated senses, as in “bank”, are called homonyms. Senses of the same word form or lexeme which are clearly related (and can be difficult to distinguish) are called polysemes, e.g. “image” in “coins with an image of the king”, “preoccupied with body image” and “evokes a strong mental image”.

word form

morpho-syntactic unit

abstract instantiation of a lexeme (3.1.9) with the values of morphosyntactic features (3.1.15) fixed in a syntactic context

EXAMPLE In English, the strings “find”, “finds”, “found” and “finding” are word forms of the word “find”.

Note 1 to entry: Word forms can have no acoustic or graphic realization, or can correspond to one or more tokens (3.4.5), not necessarily forming a contiguous sequence.

inflected form

concrete form that a lexeme (3.1.9) can take when used in a sentence (3.1.27) or a phrase (3.1.25)

morphosyntactic feature

feature induced from either the inflected form (3.1.14) of a lexeme (3.1.9) or from its syntactic context, or both

EXAMPLE “grammaticalGender”.

Note 1 to entry: Universal Dependencies (see Reference [4]) offer a set of general and language-specific features and values, designed for pragmatically uniform cross-linguistic grammatical description.

word structure

internal structure of a word (3.1.9.1) resulting from the morphological analysis

Note 1 to entry: In agglutinative languages, such as Korean, Japanese and Turkish, a word may consist of a sequence of morphemes (3.1.8), with a comparatively high morpheme-per-word ratio, where each affix (3.1.8.2.1) involved (both derivational and inflectional) typically expresses a particular grammatical meaning in a clear, one-to-one way. The structure of a word in these languages can be very sophisticated, with free morphemes (3.1.8.1) and separate affixes as its constituent elements.

stem

linguistic unit whose form is smaller than or equal to the form of a single lexeme (3.1.9) and that may be affected by an inflectional, agglutinative, compositional or derivational process

agglutination

process of concatenating one or more affixes (3.1.8.2.1) to a stem (3.1.17)

derivation

change in the form of a word (3.1.9.1) to create a new word

Note 1 to entry: The change is usually done by modifying the stem (3.1.17) or by affixation.

compounding

word formation in which a new word is formed by adjoining at least two lexemes (3.1.9), in their original forms or with slight transformations

abbreviation

abbreviated form

designation that is formed by omitting parts from its full form and that represents the same concept (3.12.1.3)

[SOURCE: ISO 1087, modified — Note 1 to entry has been deleted.]

borrowing

process of word formation in which a linguistic expression is adopted from another language (3.16.1), usually when no term (3.12.1.4) exists for the new object (3.7.4.5) or concept (3.12.1.3)

part of speech

POS

grammatical category

word class

category assigned to a word (3.1.9.1) based on its grammatical and semantic properties

EXAMPLE Noun, verb.

measure word

〈Chinese text〉 part of speech (3.1.23) defining, along with numbers, the quantity (3.7.9.1) of a given object, or identifying specific objects with demonstrative pronouns such as “this” and “that”

Note 1 to entry: Whereas English speakers say “one person” or “this person”, Chinese speakers say respectively 一个人 (yi ge ren; numeral + measure word + noun; one person) or 这个人 (zhe ge ren; demonstrative pronoun + measure word + person; this person), where 个 (ge) is a measure word.

Note 2 to entry: A set of “verbal measure words” is used to count the number of times an action occurs, rather than the number of items. For example, in the sentence 我去过三次北京 (wo qu guo san ci Beijing; pronoun + verb + auxiliary word + numeral + measure word + proper noun; I have been to Beijing three times), 次 (ci) functions as a measure word that combines with the numeral 三 (san) to derive the adverb 三次 (sanci), which modifies the verb 去 (qu).

syntax

way in which word forms (3.1.13) are interrelated and/or grouped together into phrases (3.1.25), thus capturing the relations that exist between those units

phrase

group of words (3.1.9.1) that perform a grammatical function (3.6.3) and that form a conceptual unit within a sentence (3.1.27)

Note 1 to entry: Empty phrases are permitted (being non-realised pronouns, sometimes marked as “pro”, and having the role of subjects in clauses (3.1.26)). A phrase is typically named after its syntactic head (3.6.2.3), for example noun phrases, verb phrases, adjective phrases, adverbial phrases and prepositional phrases. Phrases have been informally described as “bloated words”, in that the parts of the phrase added to the head elaborate and specify the reference of the head. In our model, a phrase is a special case of a constituent (3.6.2).

bunsetsu

〈Japanese text〉 phrase (3.1.25) without internal modifying relations

EXAMPLE	The sentence 私は学校へ早く行きました (I went to school early) consists of four bunsetsus: 私は(watashiwa), 学校へ (gakkoue), 早く(hayaku) and 行きました(ikimashita) in which
	私(watashi)	is a pronoun,
	は(wa)	is a particle (3.1.8.2.1.2),
	学校(gakkou)	is a noun,
	へ(e)	is a particle,
	早く(hayaku)	is an adjective in adverbial usage,
	行き(iki)	is a verbal stem (3.1.17),
	まし(mashi)	is an auxiliary verb denoting politeness, and
	た(ta)	is an auxiliary verb indicating the past tense.

Note 1 to entry: A bunsetsu normally consists of a noun plus its particle(s) or a verb plus its ending(s) (3.1.8.2.1.1), auxiliary verb(s) or particle(s) as shown in the example above.

noun phrase

group of words that function together syntactically as a noun

Note 1 to entry: An NP typically consists of a noun, one or more determiners, and head modifiers. Other cases include NPs consisting of a personal pronoun, a proper name or a conjunction of nouns instead of a single noun.

noun phrase head

DEPRECATED: head

noun or a conjunction of nouns that forms the central element of an NP (3.1.25.2)

restrictor

part of an NP (3.1.25.2) consisting of the noun phrase head (3.1.25.2.1) and modifiers (3.6.2.2) (if present)

clause

group of phrases (3.1.25)

Note 1 to entry: A clause usually contains a predicate.

Note 2 to entry: A clause can be either a main clause (3.1.26.1) or a subordinate clause (3.1.26.2). In languages (3.16.1) distinguishing finiteness, clauses whose predicate is a verb can be either finite or non-finite, depending on the form of the verb. A main clause alone can build a complete sentence (3.1.27). In the SynAF model, a clause is a special case of a constituent (3.6.2).

main clause

clause (3.1.26), which can act on its own as a complete sentence (3.1.27)

Note 1 to entry: In languages (3.16.1) distinguishing finiteness, the main clause is usually finite.

EXAMPLE The train is late.

subordinate clause

clause (3.1.26) which fulfils a grammatical function (3.6.3) in a phrase (3.1.25) or in another clause

EXAMPLE A relative clause modifies the head noun of a nominal phrase.

Note 1 to entry: A subordinate clause usually does not act on its own as a sentence (3.1.27), but is part of a larger sentence.

sentence

related group of word forms (3.1.13) containing a predication

Note 1 to entry: A sentence consists of one or more clauses (3.1.26), usually expressing a complete thought and forming the basic unit of discourse structure (3.7.4.2). When describing speech, it is common to talk about utterances (3.7.2.2) rather than sentences.

script

set of graphic characters (3.1.33) used for the written form of one or more languages (3.16.1)

EXAMPLE Hiragana, Katakana, Latin and Cyrillic.

Note 1 to entry: The description of scripts ranges from a high-level classification such as hieroglyphic or syllabic writing systems versus alphabets to a more precise classification like Roman versus Cyrillic. Scripts are defined by a list of values taken from ISO 15924.

Note 2 to entry: A script, as opposed to an arbitrary subset of characters, is defined in distinction to other scripts; it is possible that readers of one script are unable to read another script easily, even where there is a historic relation between them.

[SOURCE: ISO/IEC 10646, 3.48, modified — Example and Note 1 to entry have been added.]

transcription

〈process〉 modelling of spoken language (3.15.1) by means of written symbols

transcription

〈process result〉 result of the process of transcription (3.1.29)

grapheme

minimal unit in a written language

EXAMPLE Letter, pictogram, ideogram, numeral, punctuation.

homograph

each of two or more word forms (3.1.13) or words (3.1.9.1) with identical spelling but representing different concepts (3.12.1.3) (semantic homography) or syntactic functions (syntactic homography)

graphic character

character

element of a writing system, whether or not alphabetical, that represents a grapheme (3.1.31), a syllable, a word (3.1.9.1) or even prosodic characteristics of the language, by using graphical symbols (letters, diacritical marks, syllabic signs, punctuation marks, prosodic accents, etc.) or a combination of these signs (a letter having an accent or a diacritical mark)

EXAMPLE a, B, ω or Γ are, therefore, characters as well as basic letters.

3.2 Language Resource Management

lexical resource

lexical database

database consisting of one or several lexicons (3.2.1.1)

lexicon

lexical resource (3.2.1) comprising a collection of lexical entries (3.2.2) for a language (3.16.1)

lexical entry

container for managing a set of word forms (3.1.13) and possibly one or more meanings to describe a lexeme (3.1.9)

lexical item

entry in a lexicon (3.2.1.1) that is a lexeme (3.1.9) or one of its variant forms

Note 1 to entry: Headed by a lemma (3.1.10), each lexical item may be either a free-standing word (or one of its variant word forms (3.1.13)) or a bound (non-free-standing) form such as a stem (3.1.17) or an affix (3.1.8.2.1).

primary data

electronic representation of language data

EXAMPLE Digital representations of text (3.16.9), transcription (3.1.29) of speech, gestures or multimodal dialogue (3.7.2.1).

Note 1 to entry: Typically, primary data objects are addressed by “locations” in an electronic file, for example, the span of characters comprising a sentence (3.1.27) or word, or a point at which a given temporal event begins or ends (as in speech annotation). More complex data objects may consist of a list or set of contiguous or non-contiguous locations in primary data.

Note 2 to entry: Semantic annotation (3.2.7.3) may relate to non-verbal or multimodal data, such as stretches of spoken dialogue with accompanying gestures and facial expressions, and even gestures and/or facial expressions without any accompanying speech.

annotate

add information to primary data (3.2.4)

annotation

〈process〉 adding information to primary data (3.2.4), independent of its representation (3.2.8)

annotation

〈markup〉 information added to primary data (3.2.4), independent of its representation (3.2.8)

segmentation annotation

annotation (3.2.7) that delimits linguistic elements that appear in the primary data (3.2.4)

Note 1 to entry: These elements include (1) continuous segments (appearing contiguously in the primary data), (2) super- and sub-segments, where groups of segments comprise the parts of a larger segment (e.g. contiguous word segments typically comprise a sentence segment), (3) discontinuous segments (linking continuous segments), and (4) landmarks (e.g. timestamps) that note a point in the primary data. In current practice, segmental information may or may not appear in the document containing the primary data itself.

linguistic annotation

annotation (3.2.7) that provides linguistic information about the segments in the primary data (3.2.4)

EXAMPLE Morphosyntactic annotation in which a part of speech (3.1.23) and lemma (3.1.10) are associated with each segment in the data.

Note 1 to entry: The identification of a segment as a word, sentence (3.1.27), NP (3.1.25.2), etc. also constitutes linguistic annotation. In current practice, when it is possible to do so, segmentation and identification of the linguistic role or properties of that segment are often combined (e.g. syntactic bracketing, or delimiting each word in the document with an XML element (3.12.4.3) that identifies the segment as a word or sentence).

semantic annotation

annotation (3.2.7) which contains information about the meaning of a segment or region (3.7.5.1) of primary data (3.2.4)

dependency annotation

annotation (3.2.7) that encodes the dependency relations (3.6.4) between character spans (3.13.7)

Note 1 to entry: An example of a dependency relation (see ISO 24615-1:2014, 3.5) is one between a verb and its subject or direct object, between an attributive adjective and its head noun, or between a preposition and the head of its dependent NP (3.1.25.2). Dependency relations may be defined at the word-level alone, or may involve higher-level syntactic constructs, in which case it is possible to speak of mixed hierarchical-dependency annotations.

hierarchical annotation

annotation (3.2.7) that encodes the relationship of dominance (often also precedence) necessary to define syntactic trees (3.6.1.1) over character spans (3.13.7)

Note 1 to entry: Annotating hierarchical relationships requires only the relation of dominance to be indicated. Precedence is typically implicit in the ordering of character spans.

simple annotation

annotation (3.2.7) that constitutes a single information package whose interpretation is not dependent on other annotations

Note 1 to entry: This definition is intended to distinguish the simplest (“tabular”) kind of annotation from more complex relational structures (providing hierarchical, dependency, or alignment information); simple annotations are the only kind of annotations present at the linear level of complexity.

stand-off annotation

annotation (3.2.7) layered over primary data (3.2.4) and serialized in a document separate from that containing the primary data

Note 1 to entry: Stand-off annotations refer to specific locations in the primary data, by addressing character offsets, elements, etc. to which the annotation applies. Multiple stand-off annotation documents for a given type of annotation can refer to the same primary document (e.g. two different part of speech annotations for a given text (3.16.9)).
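
The idea of stand-off annotation can be sketched as follows, assuming character offsets as the addressing mechanism. The class `StandoffAnnotation`, the helper `covered_text` and the two label sets are hypothetical and purely illustrative; in practice each layer would be serialized in its own document, separate from the primary data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StandoffAnnotation:
    start: int  # inclusive character offset into the primary data
    end: int    # exclusive character offset into the primary data
    label: str  # information attached to that span


# Primary data: left untouched by any annotation layer.
primary_data = "The train is late."

SPANS = [(0, 3), (4, 9), (10, 12), (13, 17), (17, 18)]

# Two independent part-of-speech layers over the same primary data
# (label inventories are illustrative assumptions).
layer_a = [StandoffAnnotation(s, e, tag)
           for (s, e), tag in zip(SPANS, ["DET", "NOUN", "VERB", "ADJ", "PUNCT"])]
layer_b = [StandoffAnnotation(s, e, tag)
           for (s, e), tag in zip(SPANS, ["determiner", "noun", "copula", "adjective", "punctuation"])]


def covered_text(primary: str, ann: StandoffAnnotation) -> str:
    """Resolve an annotation to the stretch of primary data it refers to."""
    return primary[ann.start:ann.end]


for ann in layer_a:
    print(covered_text(primary_data, ann), ann.label)
```

Because both layers only point into the primary data, either one can be added, replaced or removed without modifying the text itself.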

representation

format in which the annotation (3.2.7) is rendered, independent of its content

EXAMPLE XML (3.12.4.1), list or bracketed format, tab-delimited text (3.16.9).

3.1.2 Feature Structures

feature value

value

entity or aggregation of entities that characterize some property or aspect of another entity

Note 1 to entry: There are two kinds of feature values: atomic value (3.3.1.1) and complex value (3.3.1.2).

atomic value

feature value (3.3.1) without internal structure

Note 1 to entry: Feature structure (3.3.7) and collection (3.3.2) are not atomic values.

complex value

feature value (3.3.1) represented either as a feature structure (3.3.7) or as a collection (3.3.2)

admissible feature value

admissible value

value restriction

range restriction

feature value (3.3.1) that must subsume the value of an admissible feature (3.3.15) in a feature structure (3.3.7) of a given type (3.3.22)

default value

feature value (3.3.1) assigned to a feature (3.3.5) when no value is specified

EXAMPLE Masculine is the default value of the grammatical gender in Dutch.

Note 1 to entry: A feature structure (3.3.7) may not bear a feature without a corresponding value.

collection

〈Feature structures〉 feature value (3.3.1) consisting of potentially many values, organized as a list, set or bag (3.3.3)

Note 1 to entry: A list is an ordered collection of entities (3.7.3.4) some of which may be identical. A set is an unordered collection of unique entities. A bag (3.3.3) is an unordered collection of entities that may or may not be unique; it is sometimes referred to as a multiset.

bag

multiset

triple of an integer n, a set S and a function that maps the integers in the range, 1 to n, to elements of S

Note 1 to entry: A bag is halfway between a set (in that its elements are unordered) and a list (in that particular elements can occur more than once).
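The difference between list, set and bag can be illustrated informally in Python. In this sketch a bag is modelled with `collections.Counter`, and the helper `bag_from_triple` (a hypothetical name) builds a bag from the triple (n, S, f) used in the definition; this is one possible modelling, not a normative one.

```python
from collections import Counter

items = ["a", "b", "a"]

as_list = list(items)    # ordered, duplicates kept
as_set = set(items)      # unordered, unique entities only
as_bag = Counter(items)  # unordered, duplicates counted


def bag_from_triple(n, s, f):
    """Build a bag from the triple (n, S, f): f maps 1..n to elements of S."""
    values = [f(i) for i in range(1, n + 1)]
    if not all(v in s for v in values):
        raise ValueError("f must map every integer in 1..n into S")
    return Counter(values)


# The function f is given here as a lookup table (an assumption of the sketch).
triple_bag = bag_from_triple(3, {"a", "b"}, {1: "a", 2: "b", 3: "a"}.get)
```

Two bags compare equal when each element occurs the same number of times, regardless of order, whereas the corresponding lists differ if the order differs.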

underspecification

provision of partial information about a feature value (3.3.1)

Note 1 to entry: An underspecification generally subsumes one of a range of candidate values that could be resolved to a single value through subsequent constraint resolution. See subsumption (3.3.12).

feature

property or aspect of an entity that is formally represented as a function mapping the entity to a corresponding feature value (3.3.1)

Note 1 to entry: The combination of a feature and a feature value constitutes a feature specification (3.3.6). For example, number is a feature, singular is a value, and the pair <number, singular> is a feature specification.

feature specification

pairing of a feature (3.3.5) with a feature value (3.3.1) in a feature structure (3.3.7) description

feature structure

record structure that associates one feature value (3.3.1) to each of a collection of features (3.3.5)

Note 1 to entry: Each feature value is either a feature structure or a simpler built-in (3.3.17) such as a string.

Note 2 to entry: Feature structures are partially ordered. The minimal feature structures in this ordering are the empty feature structures (3.3.7.1).

empty feature structure

feature structure (3.3.7) that contains no information

Note 1 to entry: An empty feature structure subsumes all other feature structures.

typed feature structure

feature structure (3.3.7) labelled by a type (3.3.22)

Note 1 to entry: In the graph notation (3.3.9), each node (3.5.5.1) is labelled with a type. In the matrix notation (3.3.8), a type is ordinarily placed at the upper left corner of the inside of the pair of square brackets that represents a typed feature structure. In XML (3.12.4.1) notation, the type is supplied as the feature value (3.3.1) of a type attribute on the <fs> element.
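The XML notation mentioned above can be sketched with the Python standard library. Only the <fs> element and its type attribute are taken from this entry; the child elements <f name="..."> and <symbol value="..."/>, the type label "word" and the helper name `typed_fs` are assumptions made for the sketch.

```python
import xml.etree.ElementTree as ET


def typed_fs(fs_type, features):
    """Build an <fs> element whose type attribute carries the type label."""
    fs = ET.Element("fs", {"type": fs_type})
    for name, value in features.items():
        # One <f> per feature specification; atomic values as <symbol>.
        f = ET.SubElement(fs, "f", {"name": name})
        ET.SubElement(f, "symbol", {"value": value})
    return fs


word = typed_fs("word", {"number": "singular", "gender": "feminine"})
xml_text = ET.tostring(word, encoding="unicode")
```

The serialized string in `xml_text` begins with the <fs> start tag carrying the type, followed by one <f> child per feature.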

matrix notation

attribute-value matrix

AVM

notation that uses square brackets to represent feature structures (3.3.7)

Note 1 to entry: In a matrix notation, each row represents a feature specification (3.3.6), with the feature name and the feature value (3.3.1) separated by a colon (:), space ( ) or the equals sign (=).

graph notation

notation of feature structure (3.3.7) in a single rooted graph (3.5.5)

path

〈feature structures〉 sequence of labelled arcs connecting nodes (3.5.5.1) in a graph (3.5.5)

incompatibility

relation between two feature structures (3.3.7) which have conflicting types (3.3.22) or at least one common feature (3.3.5) with incompatible feature values (3.3.1)

Note 1 to entry: Two feature structures that are incompatible cannot be unified. The empty feature structure (3.3.7.1) is compatible with any other feature structure.

subsumption

relationship between two feature structures (3.3.7) in which one is more specific than the other

Note 1 to entry: A feature structure A is said to subsume a feature structure B if B is at least as informative as A, i.e. B carries all the information in A. Subsumption is a reflexive, antisymmetric, and transitive relation between two feature structures.

extension

relationship between two feature structures (3.3.7) in which one is more general than the other

Note 1 to entry: A feature structure (3.3.7) F extends G if and only if G subsumes F.

Note 2 to entry: Converse of subsumption (3.3.12).
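A minimal sketch of subsumption and extension follows, assuming untyped feature structures modelled as nested Python dictionaries with strings as atomic values. Types, structure sharing and collections are deliberately ignored, and the names `subsumes`, `extends`, `np_sg` and `np_3sg` are hypothetical.

```python
def subsumes(g, f):
    """Return True if g subsumes f, i.e. f carries all the information in g."""
    if isinstance(g, dict) and isinstance(f, dict):
        # Every feature of g must be present in f with a subsumed value.
        return all(k in f and subsumes(v, f[k]) for k, v in g.items())
    return g == f


def extends(f, g):
    """F extends G if and only if G subsumes F."""
    return subsumes(g, f)


empty = {}
np_sg = {"cat": "NP", "agr": {"number": "singular"}}
np_3sg = {"cat": "NP", "agr": {"number": "singular", "person": "3"}}
```

Under this modelling the empty feature structure subsumes every other feature structure, and `np_3sg` extends `np_sg` because it only adds information.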

interpretation

〈feature structures〉 minimally informative (or equivalently, most general) extension (3.3.13) of a feature structure (3.3.7) that is consistent with a set of constraints (3.3.18) declared by a feature system declaration (3.3.25)

structure sharing

re-entrancy

relation between two or more features (3.3.5) within a feature structure (3.3.7) that share a feature value (3.3.1)

admissible feature

appropriate feature

feature (3.3.5) for which any feature structure (3.3.7) of a given type (3.3.22) may bear a feature value (3.3.1)

Note 1 to entry: Elsewhere, this term is often interpreted to mean obligatory, i.e. feature structures of the given type must bear a value for every admissible feature. In this document, the term does not imply that the feature is obligatory.

admissibility constraint

feature admissibility constraint

specification of a set of admissible features (3.3.15) and admissible feature values (3.3.1.3) associated with a specific type (3.3.22)

built-in

non-user-defined element that may appear in place of a feature structure (3.3.7)

Note 1 to entry: A built-in can appear, for example, as a feature value (3.3.1).

Note 2 to entry: Built-ins can be atomic or complex. The atomic built-ins are numeric, string, symbol and binary. The complex built-ins are collections (3.3.2) and applications of the operators, i.e. alternation (3.3.31), negation (3.3.28) and merge (3.3.27).

constraint

unit of specification that identifies some collection of feature structures (3.3.7) as invalid

Note 1 to entry: All constraints are implicational in their syntactic form, although some are distinguished as admissibility constraints (3.3.16). See validity (3.3.21) and ISO 24610-2, 5.4.

Note 2 to entry: A feature structure that has not been so identified by any of the constraints in a feature system (3.3.24.1.1) is considered to be valid.

implicational constraint

constraint (3.3.18) of the form “if G, then H”, where G and H are feature structures (3.3.7)

Note 1 to entry: This identifies as invalid any feature structure F for which G subsumes F, and yet F and H have no valid extension (3.3.13) in common. See subsumption (3.3.12) and ISO 24610-2, 8.5. The term is often used to refer to implicational constraints that are not also admissibility constraints (3.3.16).

boxed label

label (3.6.9) in a box used in a matrix notation (3.3.8) to denote a feature value (3.3.1) shared by several features (3.3.5)

Note 1 to entry: The label may be any alphanumeric symbol.

well-formedness

syntactic conformity of a feature structure (3.3.7) representation to ISO 24610-1

validity

conformity of a typed feature structure (3.3.7.2) to the constraints (3.3.18) of a particular feature system (3.3.24.1.1)

type

name of a class of entities

Note 1 to entry: Feature structures (3.3.7) may be characterized by grouping them into certain classes. Types are used to name such classes.

subtype

type (3.3.22) to which another type confers its constraints (3.3.18) and admissible features (3.3.15)

supertype

base type

type (3.3.22) from which another type inherits constraints (3.3.18) and admissible features (3.3.15)

Note 1 to entry: s is a subtype of t if and only if t is a supertype of s. Every type is a subtype and supertype of itself.

atomic type

user-defined type (3.3.22) with no admissible features (3.3.15) declared or inherited

type declaration

structure that declares the supertypes (3.3.22.2), admissible features (3.3.15), admissible feature values (3.3.1.3), admissibility constraints (3.3.16) and implicational constraints (3.3.18.1) for a given type (3.3.22)

Note 1 to entry: The constraints on a type in the resulting feature system (3.3.24.1.1) are those that have been declared in its declaration, in addition to those that it has inherited from its supertypes.

partial order

partially ordered set

set S equipped with a relation ≤ over S × S that is (1) reflexive (for all s ∈ S, s ≤ s), (2) antisymmetric (for all p, q ∈ S, if p ≤ q and q ≤ p, then p = q), and (3) transitive (for all p, q, r ∈ S, if p ≤ q and q ≤ r, then p ≤ r)

Note 1 to entry: The set of integers Z is partially ordered, but it has an additional property: for every p, q ∈ Z, either p ≤ q or q ≤ p. Not all partial orders have this property. The taxonomical classification of organisms into phyla, genera and species, for example, is a partial order that does not have it. Type hierarchies do not necessarily have it either. The typed feature structures (3.3.7.2) of a feature system (3.3.24.1.1) do not have it unless (a) their type hierarchy (3.3.24.1) does, and (b) either the type hierarchy has exactly one type (3.3.22), or every type is constrained to have exactly one admissible feature (3.3.15).
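The three defining properties, and the additional comparability property discussed in the note, can be checked by brute force on a finite set. The following Python sketch assumes a small range of integers and a hypothetical three-type hierarchy (`sign`, `word`, `phrase`) chosen only for illustration.

```python
from itertools import product


def is_partial_order(elements, leq):
    """Check reflexivity, antisymmetry and transitivity of leq over elements."""
    reflexive = all(leq(s, s) for s in elements)
    antisymmetric = all(
        p == q or not (leq(p, q) and leq(q, p))
        for p, q in product(elements, repeat=2)
    )
    transitive = all(
        leq(p, r) or not (leq(p, q) and leq(q, r))
        for p, q, r in product(elements, repeat=3)
    )
    return reflexive and antisymmetric and transitive


def is_total(elements, leq):
    """Check the additional property: every pair of elements is comparable."""
    return all(leq(p, q) or leq(q, p) for p, q in product(elements, repeat=2))


integers = list(range(-3, 4))


def int_leq(p, q):
    return p <= q


# Hypothetical type hierarchy: 'sign' is the supertype of 'word' and 'phrase'.
types = ["sign", "word", "phrase"]
supertypes = {
    "sign": {"sign"},
    "word": {"word", "sign"},
    "phrase": {"phrase", "sign"},
}


def is_subtype(s, t):
    return t in supertypes[s]
```

In this sketch the integers pass both checks, while the type hierarchy is a partial order in which `word` and `phrase` are not comparable.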

type hierarchy

partial order (3.3.24) over a set of types (3.3.22)

Note 1 to entry: See ISO 24610-1, Annex C, Type inheritance hierarchies.

feature system

type hierarchy (3.3.24.1) in which each type (3.3.22) has been associated with a collection of admissibility constraints (3.3.16) and implicational constraints (3.3.18.1)

Note 1 to entry: Cf. type declaration (3.3.23).

feature system declaration

FSD

specification of a particular feature system (3.3.24.1.1)

semantic type

DEPRECATED: type

referring expression that distinguishes a collection of feature structures (3.3.7) as an identifiable and conceptually significant class

merge

generic operation that includes union (3.3.30) of sets or bags (3.3.3) and concatenation (3.3.32) of lists

negation

(unary) operation on a feature value (3.3.1) denoting any other value incompatible with it

unification

operation that combines two compatible feature structures (3.3.7) into the least informative feature structure that contains the information from the two
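Unification can be sketched informally for untyped feature structures modelled as nested Python dictionaries with strings as atomic values. Types and structure sharing are not handled, and returning `None` to signal incompatibility is a design choice of this sketch rather than part of the definition.

```python
def unify(a, b):
    """Return the unification of a and b, or None if they are incompatible."""
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for feature, value in b.items():
            if feature in result:
                merged = unify(result[feature], value)
                if merged is None:
                    return None  # conflicting values: incompatible
                result[feature] = merged
            else:
                result[feature] = value
        return result
    # Atomic values (or mixed kinds) unify only if they are identical.
    return a if a == b else None


noun = {"cat": "N"}
plural = {"agr": {"number": "plural"}}
singular = {"agr": {"number": "singular"}}
```

The result contains the information from both arguments and nothing more, so unifying `noun` with `plural` yields a structure bearing both features, while `plural` and `singular` cannot be unified.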

union

operation that combines two sets, or bags (3.3.3), into one

Note 1 to entry: The corresponding operation for lists is concatenation (3.3.32).

alternation

operation on feature values (3.3.1) that returns one and only one of the values supplied as its argument

Note 1 to entry: Given a feature specification F: a|b, where a|b denotes the alternation of a and b, F has either the value a or the value b, but not both.

concatenation

operation of combining two lists of feature values (3.3.1) into a single list

typing

assignment of a semantic type (3.3.26) to a built-in (3.3.17) or feature structure (3.3.7), either atomic or complex

Note 1 to entry: Semantic types in feature systems (3.3.24.1.1) are partially ordered, with multiple inheritance.

3.4 Morphosyntactic Annotation Framework (MAF)

FSA

finite state automata

graphs (3.5.5) made up of states with an initial state and a final state, and a finite set of transitions from state to state

Note 1 to entry: See also directed acyclic graph (3.4.2).

directed acyclic graph

DAG

digraph

graph (3.5.5) with directed edges (3.5.5.2) and no cycles

Note 1 to entry: DAGs are a subset of FSA (3.4.1).

morphosyntactic tag

label identifying a feature structure (3.3.7) used to qualify a word form (3.1.13) within an established taxonomy

Note 1 to entry: Morphosyntactic tags can be atomic labels (“N” for “noun”), but very often they are mnemonic representations for the feature structures that they identify (“NNL2” for “plural locative noun” in the CLAWS-7 tagset, see ^[10]). The relevant feature structures can also be encoded by character vectors, as in “N12201” for “common noun, feminine, plural, countable” in the EAGLES intermediate tagset (see ^[11]) or by agglutinated shorthand feature identifiers, as in “subst:pl:gen:m3” for “noun, plural, genitive, masculine, inanimate” in the NKJP tagset (see ^[12]).

morphosyntactic tagset

comprehensive set of morphosyntactic tags (3.4.3) used for the morphosyntactic description of a language (3.16.1)

token

non-empty contiguous sequence of graphic characters (3.1.33) in a document

Note 1 to entry: For editorial reasons, some annotation schemes extend the notion of token to an empty sequence.

tokenization

process that segments a language data stream into individual tokens (3.4.5)
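A very simple tokenization process can be sketched as follows. Splitting on whitespace is an assumption made purely for illustration; real tokenization schemes are language- and project-specific. The function name `tokenize` and the sample sentence are hypothetical.

```python
import re


def tokenize(text):
    """Segment text into (start, end, token) triples of non-whitespace runs."""
    return [(m.start(), m.end(), m.group()) for m in re.finditer(r"\S+", text)]


tokens = tokenize("Time flies  like an arrow.")
```

Each token is a non-empty contiguous character sequence, and the recorded offsets allow annotations to refer back to the data stream.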

script conversion

representation of graphic characters (3.1.33) from a script (3.1.28) by the graphic characters of a target script, most commonly by romanization (3.4.8.1)

Note 1 to entry: The two basic methods of conversion of a system of writing are transliteration and transcription. The use of the terms “source script” and “target script” in transliteration is analogous to the terms “source language” and “target language” in translation.

[SOURCE: ISO 15919, 4.1, modified — “script” used as attribute of the main term.]

transliteration

representation of the graphic characters (3.1.33) of a source script (3.1.28) by the graphic characters of a target script

Note 1 to entry: In transcription, pronunciation conventions are of primary importance, while in transliteration, writing conventions are of primary importance.

[SOURCE: ISO 15919, 4.7]

romanization

conversion of non-Latin graphic characters (3.1.33) into Latin graphic characters, using either transliteration (3.4.8) or transcription (3.1.29)

word lattice

set of possible alternative decompositions of a text or speech segment into word forms (3.1.13)

Note 1 to entry: A word lattice has the algebraic properties of a directed acyclic graph (3.4.2) with an initial node (3.5.5.1) and a final node.

Note 2 to entry: See also DAG (3.4.2) and FSA (3.4.1).
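Reading a word lattice as a directed acyclic graph, each alternative decomposition corresponds to a path from the initial node to the final node. The following Python sketch enumerates those paths; the lattice for the segment "icecream", with character offsets as nodes, is a hypothetical example, as is the function name `decompositions`.

```python
def decompositions(lattice, initial, final):
    """Enumerate all paths from the initial to the final node of a DAG.

    lattice maps a node to a list of (word_form, next_node) edges.
    """
    if initial == final:
        return [[]]
    paths = []
    for word_form, next_node in lattice.get(initial, []):
        for rest in decompositions(lattice, next_node, final):
            paths.append([word_form] + rest)
    return paths


# Hypothetical lattice for the segment "icecream": nodes are character offsets.
lattice = {
    0: [("ice", 3), ("icecream", 8)],
    3: [("cream", 8)],
}
```

Because the graph has no cycles, the recursion terminates, and each returned list is one candidate sequence of word forms.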

3.5 Linguistic Annotation Framework (LAF)

original artefact

artefact or annotation (3.2.7) from which the primary data (3.2.4) is derived

annotation document

XML document (3.12.4.2) containing annotations (3.2.7)

region

〈linguistic annotation framework〉 area in the primary data (3.2.4) defined by a non-empty, ordered list of anchors (3.5.4)

anchor

fixed, immutable position in the primary data (3.2.4) being annotated (3.2.5)

Note 1 to entry: The medium determines how an anchor is described. For example, text (3.16.9) anchors may be character offsets, audio anchors may be time offsets, video anchors may be time offsets or frame indices, image anchors may be coordinates.

graph

set of nodes (3.5.5.1) (vertices) V(G) and a set of edges (3.5.5.2) E(G)

node

vertex

terminal point in a graph (3.5.5) G, or the intersection of edges (3.5.5.2) in G

edge

ordered pair of nodes (3.5.5.1) [u,v] from V(G)

Note 1 to entry: The order of the nodes determines the direction of the edge.

3.6 Syntactic Annotation Framework (SynAF)

syntactic graph

DEPRECATED: graph

connected set of syntactic nodes (3.6.1.2) and syntactic edges (3.6.1.3)

syntactic tree

syntactic graph (3.6.1) in which each syntactic node (3.6.1.2) has a single parent

syntactic node

DEPRECATED: node

word form (3.1.13) or constituent (3.6.2) seen as an elementary syntactic component of a syntactic analysis

terminal node

syntactic node (3.6.1.2) which is a single word form (3.1.13) or an empty element involved in a syntactic relation

non-terminal node

syntactic node (3.6.1.2) which is not a word form (3.1.13)

Note 1 to entry: A non-terminal node has an outgoing constituency syntactic edge (3.6.1.3).

syntactic edge

DEPRECATED: edge

triplet with a syntactic source node (3.6.1.2), a target node, and optional annotations (3.2.7)

Note 1 to entry: Non-terminal nodes (3.6.1.2.2) have an outgoing constituency syntactic edge.

constituent

syntactic grouping of words (3.1.9.1), phrases (3.1.25), or clauses (3.1.26) on the basis of structural (or hierarchical) properties

Note 1 to entry: Words can be grouped into phrases, phrases into clauses or other phrases and clauses into sentences (3.1.27).

chunk

non-recursive constituent (3.6.2)

modifier

part of a constituent (3.6.2) which ascribes a property to the syntactic head (3.6.2.3) of the constituent

Note 1 to entry: A modifier can be placed before or after the head of the phrase (3.1.25) (pre-modifier or post-modifier). Modifiers are optional in a constituent.

syntactic head

DEPRECATED: head

part of a constituent (3.6.2) which determines its distribution and its grammatical properties

Note 1 to entry: The head of a constituent usually cannot be left out.

Note 2 to entry: Distribution here refers to the syntactic environments in which the constituent may appear.

Note 3 to entry: The syntactic head determines the grammatical properties of a constituent in such a way that if the grammatical gender of the head is feminine, then the gender of the entire constituent will be feminine.

grammatical function

grammatical role of a word form (3.1.13) or constituent (3.6.2) within its embedding syntactic environment

Note 1 to entry: For example, a noun phrase (NP) (3.1.25.2) can act as a subject within a sentence (3.1.27), or a noun may act as a subject dependent of a verb in a dependency graph. There is a grammatical relation between the subject – NP and the main verb in a sentence. All grammatical relations (subject – predicate, syntactic head (3.6.2.3) – modifier (3.6.2.2), etc.) are subsumed under the concept of dependency relations (3.6.4), whether between terminal nodes (3.6.1.2.1) or non-terminal nodes (3.6.1.2.2).

dependency relation

dependency

syntactic relation between word forms (3.1.13) or constituents (3.6.2) on the basis of the grammatical functions (3.6.3) that constituents play in relation to each other

syntactic argument

one of the essential and functional constituents (3.6.2) in a clause (3.1.26) that identifies the participants in the process referred to by a lexeme (3.1.9)

EXAMPLE Alfred (syntactic argument) reads a book (syntactic argument) today (adjunct (3.6.6)).

adjunct

non-essential element associated with a verb as opposed to syntactic arguments (3.6.5)

Note 1 to entry: Adverbs are possible adjuncts for a sentence (3.1.27).

subcategorization frame

valency

valence

set of restrictions on a lexeme (3.1.9) indicating the properties of the syntactic arguments (3.6.5) that can or must occur with this given lexeme

domain

class of elements to which a certain set of labels (3.6.9) can be assigned

Note 1 to entry: Domains can refer generally to the set of all syntactic edges (3.6.1.3), terminal nodes (3.6.1.2.1) or non-terminal nodes (3.6.1.2.2).

label

unit of annotation (3.2.7) consisting of the name of a feature (3.3.5) and a feature value (3.3.1), which together can be applied to appropriate model elements and add arbitrary feature-value annotations to such elements

sequential representation

representation (3.2.8) of annotation content where the XML element (3.12.4.3) structure mirrors the sequence of linguistic objects in the primary source

3.7 Semantic Annotation Framework (SemAF)

3.7.1 Time and Events (SemAF-Time, ISO-TimeML)

event

eventuality

something that can be said to obtain or hold true, to happen or to occur

Note 1 to entry: This is a very broad notion of event that includes all kinds of actions, states, processes, etc. It is not to be confused with the narrower notion of event (as opposed to the notion of "state") as something that happens at a certain point in time (e.g. the clock striking two or waking up) or during a short period of time (e.g. laughing). In TimeML, the term “event” is used in a broader sense and is equivalent to the term “eventuality”.

tense

way that languages (3.16.1) express the time at which an event (3.7.1.1) described by a sentence (3.1.27) occurs

Note 1 to entry: Tense is characterized as a property of a verb form. Noun forms are not said to exhibit tense, but rather temporal markers.

temporal interval

period

uninterrupted stretch of time, with internal point structure

Note 1 to entry: Time is often viewed as a straight line from minus infinity to plus infinity. A temporal interval is a part of that line without any holes, containing all the points between its beginning (3.7.1.7.1) and its end (3.7.1.7.2).

Note 2 to entry: In mathematics, an important issue is whether an interval includes its beginning and its end (is “closed”) or not (is “open” or “half-open”). In natural language descriptions of intervals this may also be relevant, as when describing an interval in terms of a number of days, but not with the same granularity as in mathematics. Cf. ^[14].

[SOURCE: Adapted from ^[15].]

temporal ordering relation

relation that determines how objects are ordered in time

EXAMPLE Precedence, simultaneity.

Note 1 to entry: There is a limited number of ways to order objects in time; these are collectively called ordering relations.

time amount

quantity (3.7.9.1) of time, measured in temporal units (3.7.1.6) over temporal intervals (3.7.1.3)

Note 1 to entry: A time amount is a measure of time that can be expressed in terms of a number of temporal units, such as “half an hour” or “30 minutes”.

[SOURCE: Adapted from ^[16].]

temporal unit

element in a time amount (3.7.1.5) that quantifies the length of a temporal interval (3.7.1.3) or a set of temporal intervals

Note 1 to entry: In measurement systems, various units are defined for different purposes. Small units such as seconds and minutes are defined to measure small temporal intervals; as one may want to avoid working with big numbers, for larger temporal intervals, units such as week, year, decade, and century are defined.

Note 2 to entry: The amount of a temporal unit is called a measure (3.7.5.10).

[SOURCE: Adapted from ^[16].]

point of speech

temporal unit (3.7.1.6) at which a given utterance (3.7.2.2) occurs

Note 1 to entry: The notion of point of speech is needed in order to interpret tense (3.7.1.2). This requires the use of anchor points in time, of which the point of speech is one (point of text (3.7.1.7.5) is another one). For example, in “Arthur smiled”, the point of speech is the time that the utterance is made.

Note 2 to entry: For a document as a whole, this may be considered to be the same as the document creation time.

instant

point in time with no interior points

Note 1 to entry: Time is often viewed as a straight line from minus infinity to plus infinity. In this view, time is formed by an infinite sequence of points. An instant can also be seen as an infinitesimally small interval. Cf. ^[14] for "instant."

beginning

instant (3.7.1.7) at which a temporal interval (3.7.1.3) begins

[SOURCE: Adapted from ^[17].]

end

instant (3.7.1.7) at which a temporal interval (3.7.1.3) ends

[SOURCE: Adapted from ^[17].]

point of event

instant (3.7.1.7) at which the event (3.7.1.1) mentioned in a given utterance (3.7.2.2) occurs

Note 1 to entry: Next to a point of speech (3.7.1.6.1), a point of event also needs to be defined in order to interpret tense (3.7.1.2). For example, in “Arthur smiled”, the temporal location of the point of event can be defined as being prior to the point of speech.

point of reference

instant (3.7.1.7) of temporal perspective on the event (3.7.1.1) in a given utterance (3.7.2.2)

Note 1 to entry: In “Arthur will have gone by tomorrow”, the point of speech (3.7.1.6.1) is now and the point of event (3.7.1.7.3) is some time in the future, but before the point of reference referred to by “tomorrow”.

Note 2 to entry: To locate certain labels (3.6.9) in time, a third anchor point is also required, defined as the point of reference (3.7.1.7.4).
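The ordering of the three anchor points in the example “Arthur will have gone by tomorrow” can be made explicit in a small Python sketch. Modelling instants as numbers on a time line, the class name `TensePoints`, its method names and the numeric values are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TensePoints:
    """Three anchor points, modelled as numbers on a time line."""

    speech: float     # point of speech
    event: float      # point of event
    reference: float  # point of reference

    def speech_before_event_before_reference(self):
        """Ordering described for 'Arthur will have gone by tomorrow'."""
        return self.speech < self.event < self.reference

    def event_before_speech(self):
        """Ordering described for 'Arthur smiled'."""
        return self.event < self.speech


# Illustrative values in days relative to now (hypothetical).
will_have_gone = TensePoints(speech=0.0, event=0.5, reference=1.0)
smiled = TensePoints(speech=0.0, event=-1.0, reference=-1.0)
```

The sketch records only the relative order of the points, which is the information that the notes above rely on when interpreting tense.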

point of text

instant (3.7.1.7) at which reported speech is anchored

Note 1 to entry: It is the point in time considered in the text (3.16.9) of the speech. For example, when a person is telling a story, it is not enough to know the point of speech itself (the document creation time); the point at which the speech in the story takes place must also be known.

markable

entity in general, or segment of a text (3.16.9) in particular, that is subject to an annotation (3.2.7)

ALINK

linking tag (3.7.5.17) that represents a phase relation between an aspectual verb (or morpheme (3.1.8)) and a predicate (3.7.3.2) denoting an event (3.7.1.1)

MLINK

linking tag (3.7.5.17) that represents the measurement of the duration of an event (3.7.1.1) or the measurement of the length of a (possibly discontinuous) time span

SLINK

linking tag (3.7.5.17) that represents a subordinating relation between two events (3.7.1.1)

TLINK

linking tag (3.7.5.17) that represents a temporal relation between two temporal entities: namely, between two events (3.7.1.1), two temporal expressions, or a temporal expression and an event

Note 1 to entry: Some ordering relations cannot be expressed by an ordering relation between two events because a signal, like a temporal preposition, complicates the ordering or there is an ordering relation between a temporal signal and an event.

[SOURCE: Adapted from ^[18].]

3.7.2 Dialogue Acts

dialogue

exchange of utterances (3.7.2.2) between two or more persons or artificial agents

utterance

anything said, written, keyed, signed, or otherwise expressed, possibly in multimodal form

Note 1 to entry: An utterance is part of a turn unit (3.7.2.6). In the literature, the term is commonly used in the sense of ‘everything contributed by a sender within a turn unit’.

Note 2 to entry: The term ‘utterance’ is useful in the description of dialogue (3.7.2.1) behaviour, but is not of central importance in ISO 24617-2, since dialogue acts (3.7.2.7) are not assumed to correspond to utterances, but rather to the communicative behaviour in functional segments (3.7.2.7.3).

participant

person or artificial agent involved in a dialogue (3.7.2.1)

Note 1 to entry: Both entity (3.7.3.4) and event (3.7.1.1) can function as participants (3.7.2.3).

sender

participant (3.7.2.3) who performs a dialogue act (3.7.2.7)

speaker

sender (3.7.2.3.1) of a dialogue act (3.7.2.7) in spoken form

Note 1 to entry: A participant (3.7.2.3) can contribute to a dialogue without having the speaker role (3.7.2.4), for example by nodding in agreement to what the other participant says. Therefore, the term ‘speaker' is not synonymous with ‘participant who occupies speaker role'.

Note 2 to entry: A speaker possibly combines speech with nonverbal communicative behaviour.

addressee

participant (3.7.2.3) oriented to by the sender (3.7.2.3.1) in a manner to suggest that his/her utterances (3.7.2.2) are particularly intended for this participant, and that some response is therefore anticipated from this participant, more so than from the other participants

Note 1 to entry: This definition is a de facto standard in the linguistics literature.

[SOURCE: ^[20], modified — ‘speaker' replaced by ‘sender', and use of ambiguous pronouns avoided.]

speaker role

role occupied by a participant (3.7.2.3) who has temporary control of a dialogue (3.7.2.1) and speaks for some period of time

[SOURCE: DAMSL annotation scheme (see ^[21]).]

speech act

act that a speaker (3.7.2.3.1.1) performs when producing an utterance (3.7.2.2)

Note 1 to entry: The notion ‘utterance’ in this definition is commonly interpreted as mentioned in Note 1 to entry of 3.7.2.2.

[SOURCE: ^[22], modified — Note 1 to entry added.]

turn unit

stretch of communicative activity produced by one participant (3.7.2.3) who occupies the speaker role (3.7.2.4), bounded by periods of inactivity of that sender (3.7.2.3.1) or by periods where another participant occupies the speaker role

Note 1 to entry: The term ‘turn unit’ is also closely related to the term ‘turn construction unit’ (TCU), introduced by ^[23]. The TCU seems a rather intuitive and holistic notion, of which the usefulness has been the subject of debate (see e.g. ^[24]).

dialogue act

communicative activity of a participant (3.7.2.3), interpreted as having a certain communicative function (3.7.2.7.11) and semantic content (3.7.2.7.8)

Note 1 to entry: A dialogue act can additionally have certain functional dependence relations (3.7.2.7.4), rhetorical relations (3.7.2.7.7) and feedback dependence relations (3.7.2.7.6) with other units in a dialogue.

feedback act

dialogue act (3.7.2.7) that provides or elicits information about the sender's (3.7.2.3.1) or the addressee's (3.7.2.3.2) processing of something that was uttered in the dialogue (3.7.2.1)

Note 1 to entry: Two classes of feedback are distinguished: allo-feedback acts (3.7.2.7.1.1) and auto-feedback acts (3.7.2.7.1.2).

allo-feedback act

feedback act (3.7.2.7.1) where the sender (3.7.2.3.1) elicits information about the addressee's (3.7.2.3.2) processing of an utterance (3.7.2.2) that the sender contributed to the dialogue (3.7.2.1), or where the sender provides information about his/her perception of the addressee's processing of an utterance that the sender contributed to the dialogue

EXAMPLE	1. A: Now move up.
	2. B: Slightly northeast you mean?
	3. A: Slightly yeah
With utterance 3, A performs an allo-feedback act signalling that he/she thinks B understood utterance 1 correctly.

auto-feedback act

feedback act (3.7.2.7.1) where the sender (3.7.2.3.1) provides information about his/her own processing of an utterance (3.7.2.2) contributed to the dialogue (3.7.2.1) by another participant (3.7.2.3)

EXAMPLE B's utterance in the example dialogue fragment in 3.7.2.7.1.1 signals that he/she is uncertain whether he/she understood the previous utterance correctly.

dimension

class of dialogue acts (3.7.2.7) that are concerned with a particular aspect of communication, corresponding to a particular category of semantic content (3.7.2.7.8)

EXAMPLE 1 Dialogue acts advancing the task or activity that motivates the dialogue (3.7.2.1) (the ‘Task' dimension).

EXAMPLE 2 Dialogue acts providing and eliciting feedback (the auto- and allo-Feedback dimensions).

EXAMPLE 3 Dialogue acts for allocating the speaker role (3.7.2.4) (the turn management dimension).

functional segment

minimal stretch of communicative behaviour that has one or more communicative functions (3.7.2.7.11)

Note 1 to entry: The condition of being ‘minimal' ensures that functional segments do not include material that does not contribute to the expression of a communicative function that identifies the segment.

EXAMPLE The functional segment corresponding to the answer given by S in the following dialogue (3.7.2.1) fragment does not include the parts "Just a moment please" and “.... let me see..." but only the parts “the first train to the airport on Sunday morning is" and “at 5:45”.

U: What time is the first train to the airport on Sunday morning please?
S: Just a moment please... the first train to the airport on Sunday morning is .... let me see... at 5:45.

Note 2 to entry: A consequence of this definition is that functional segments can be discontinuous, can overlap or be embedded, and can contain parts from more than one turn.
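A discontinuous functional segment, such as the answer in the example above, can be represented informally as an ordered list of character spans over the turn. The turn string reproduces the example; the names `find_span`, `answer_segment` and `segment_text` and the span-list representation are assumptions made for this sketch.

```python
turn = (
    "Just a moment please... the first train to the airport on Sunday "
    "morning is .... let me see... at 5:45."
)


def find_span(text, part):
    """Return the (start, end) character offsets of part in text."""
    start = text.index(part)
    return (start, start + len(part))


# A discontinuous functional segment: an ordered list of character spans.
answer_segment = [
    find_span(turn, "the first train to the airport on Sunday morning is"),
    find_span(turn, "at 5:45"),
]


def segment_text(text, spans):
    """Join the parts of a (possibly discontinuous) segment with spaces."""
    return " ".join(text[start:end] for start, end in spans)
```

The material between the two spans (“.... let me see...”) and before the first span (“Just a moment please”) is left outside the segment because it does not contribute to the answer.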

functional dependence relation

relation between a dialogue act (3.7.2.7) with a responsive communicative function (3.7.2.7.11.1) and one or more previous dialogue acts that it responds to

EXAMPLE The relation between an answer and the corresponding question, such as between utterance 3 and utterance 2 in the example in 3.7.2.7.1.1; or the relation between the acceptance of an offer and the corresponding offer.

reference segment

stretch of communicative behaviour that a feedback dependence relation (3.7.2.7.6) refers to and that is not a functional segment (3.7.2.7.3)

feedback dependence relation

relation between a feedback act (3.7.2.7.1) and the stretch of communicative behaviour the processing of which the act provides or elicits information about

EXAMPLE In the example in 3.7.2.7.1.1, both the allo-feedback act (3.7.2.7.1.1) expressed by utterance 3 and the auto-feedback act (3.7.2.7.1.2) expressed by utterance 2 have a feedback dependence relation to utterance 1.

Note 1 to entry: Feedback dependence relations are also used to relate self-corrections, partner corrections, and other speech editing acts, which strictly speaking are not feedback acts, to the segments that they apply to.

rhetorical relation

DEPRECATED: discourse relation

semantic or pragmatic relation between two dialogue acts (3.7.2.7) or their semantic content (3.7.2.7.8)

EXAMPLE 1 In the following example, the statement in the second utterance provides a motivation for the question in the first utterance:

A: Can you tell me what flights there are to Sydney on Saturday? I’d like to attend my mother's 80th birthday.

EXAMPLE 2 A rhetorical relation between the semantic contents of two dialogue acts occurs in the following, where the content of B's statement mentions a cause for the content of A's statement:

A: I can never find these stupid remote controls.

B: That's because they don’t have a fixed location.

Note 1 to entry: Relations such as elaboration, explanation, justification, cause, and concession have been studied extensively in the analysis of (monologue) text (3.16.9), where they are often called ‘rhetorical relations’ or ‘discourse relations’, and are mostly viewed either as relations between text segments or as relations between events (3.7.1.1) or propositions described in text segments. Many of these relations also occur in dialogue (3.7.2.1).

semantic content

information, situation (3.7.6.1), action, event (3.7.1.1), or objects (3.7.4.5) that a stretch of communicative behaviour refers to

semantic content category

semantic content type

type of the semantic content (3.7.2.7.8) of a dialogue act (3.7.2.7)

EXAMPLE The various dimensions (3.7.2.7.2) defined in this document correspond to categories of semantic content (3.7.2.7.8). In particular, the task dimension corresponds to the category of task-specific actions and information; the allo- and auto-feedback dimensions correspond to the categories of information about the processing by the current speaker (3.7.2.3.1.1) or by the addressee (3.7.2.3.2), respectively, of something that was said before; the turn management dimension corresponds to the category of information about the allocation of the speaker role (3.7.2.4), and so forth.

information state

context

totality of a participant's (3.7.2.3) attitudes that may influence the participant's interpretation and generation of communicative behaviour

Note 1 to entry: Attitudes include, among others, beliefs, assumptions, expectations, goals, preferences and hopes.

communicative function

property of certain stretches of communicative behaviour, describing how the behaviour changes the information state (3.7.2.7.10) of an understander of the behaviour

responsive communicative function

communicative function (3.7.2.7.11) of a dialogue act (3.7.2.7) that depends for its semantic content (3.7.2.7.8) on one or more dialogue acts that it responds to

qualifier

predicate (3.7.3.2) that can be associated with a communicative function (3.7.2.7.11)

EXAMPLE A: Would you like to have some coffee?

B: Only if you have it ready.

B's utterance accepts A's offer under a certain condition; this can be described by qualifying the communicative function Accept Offer with the predicate ‘conditional’.

3.1.9 Semantic Roles (SemAF-SR)

argument

formal semantic unit that is an essential element of a predicate argument structure (3.7.3.3) and can have variable instantiations depending on the utterance (3.7.2.2)

Note 1 to entry: An argument corresponds to a participant (3.7.2.3) of an event (3.7.1.1) described by the predicate argument structure.

Note 2 to entry: Arguments typically satisfy certain argument positions and can be described as being syntactico-semantic notions, whereas participants are semantico-conceptual. The standard view is that subsets of the participants associated with an event (3.7.1.1) are selected as arguments by the verb (or nominal or adjective) expressing the event. Other participants are either incorporated or realized as eventuality modifiers (3.7.3.6).

Note 3 to entry: Natural language predicates (3.7.3.2) typically have one, two, or three arguments, although they can have more.

predicate

formal semantic unit that represents a semantic relation between one or more arguments (3.7.3.1) in a predicate argument structure (3.7.3.3)

Note 1 to entry: Predicates are indicated by predicative linguistic elements such as verbs, nouns, and adjectives.

predicate argument structure

formal representation of the core semantic content (3.7.2.7.8) of an utterance (3.7.2.2), consisting of a predicate (3.7.3.2) constant, and its arguments (3.7.3.1)

Note 1 to entry: In classical logic-based semantics, this corresponds to predicate argument structures in first-order predicate logic.

Note 2 to entry: One of the arguments can be a variable uniquely identifying the instance of the predicate argument structure to allow references to it in other predicate argument structures.

Note 3 to entry: The representation of event semantics is subject to many variations; some of them, such as in ^[25], can have separate predicates for each semantic role (3.7.3.7) relation. In this case, the predicate argument structure of an utterance is the sum of the individual predicate semantic role assertions representing the semantic content of the utterance.
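As an informal sketch of Notes 2 and 3, a predicate argument structure can be held as a predicate constant, an instance variable, and a list of role-labelled arguments, from which one assertion per semantic role can be derived. The class names, the method name `role_assertions` and the sample sentence are assumptions made for illustration; the role labels “agent” and “instrument” follow the semantic role entry in this document.

```python
from dataclasses import dataclass


@dataclass
class Argument:
    role: str    # semantic role label, e.g. "agent"
    filler: str  # the entity filling the argument position


@dataclass
class PredicateArgumentStructure:
    instance: str    # variable identifying this structure (Note 2)
    predicate: str   # predicate constant
    arguments: list  # list of Argument objects

    def role_assertions(self):
        """Return one assertion per semantic role (the variation in Note 3)."""
        facts = [f"{self.predicate}({self.instance})"]
        facts += [f"{a.role}({self.instance}, {a.filler})"
                  for a in self.arguments]
        return facts


# Hypothetical utterance: "John broke something with a hammer."
pas = PredicateArgumentStructure(
    instance="e1",
    predicate="break",
    arguments=[Argument("agent", "john"), Argument("instrument", "hammer")],
)

for fact in pas.role_assertions():
    print(fact)
```

The instance variable `e1` is what allows another predicate argument structure to refer to this one, for example as the filler of one of its own arguments.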

entity

conceptual semantic unit that typically functions as a participant (3.7.2.3)

Note 1 to entry: An entity is an individual such as a person, organization, physical object, or logical entity, as well as, on occasion, a number, quantity (3.7.9.1), dimension, or a reification of an event, a property, or a quality, e.g. emotion (anger, love), the value of a colour, etc.

Note 2 to entry: An entity is represented by a node (3.5.5.1) in a content structure.

eventuality frame

generalized abstract specification of the word sense (3.1.12) associated with an event (3.7.1.1) in an utterance (3.7.2.2)

Note 1 to entry: The frame consists of the specification of (a) a predicate (3.7.3.2) that can participate in a class hierarchy if such a hierarchy is specified, and (b) the arguments (3.7.3.1) that this predicate expects along with their semantic roles (3.7.3.7).

eventuality modifier

particular type of participant (3.7.2.3) that completes the description of an event (3.7.1.1) but is optional and not essential

Note 1 to entry: Eventuality modifiers are distinct from other types of participants in that they are used in supplying information that is typically more peripheral and more general, for example, situating the eventuality in time or space (3.7.5.3.1).

Note 2 to entry: In FrameNet, these would be peripheral frame elements and, in PropBank, ArgMs.

Note 3 to entry: Eventuality modifiers typically correspond to syntactic adjuncts.

semantic role

mode of involvement of a participant (3.7.2.3) in an event (3.7.1.1)

Note 1 to entry: Semantic roles for specific events are often associated with prototypical semantic relations, e.g. if John causes a breaking event, he is the agent; if he uses a hammer, it is the instrument; and someone who receives something is a recipient.

3.1.10 Discourse Structure (SemAF-DS)

discourse

process of communication, consisting of one or more sentences (3.1.27) or sentence fragments

Note 1 to entry: From an abstract viewpoint, data (e.g. words (3.1.9.1), phrases (3.1.25), sentences, and paragraphs) representing a communication process is regarded as a discourse. A discourse can be encoded in various media such as text (3.16.9), hypertext, audio, video, and their possible combinations.

discourse structure

structure of discourse (3.7.4.1), comprising segment structure, content structure, and possibly other types of structure

segment

〈semantic annotation framework〉 partial realization of discourse (3.7.4.1)

EXAMPLE Word (3.1.9.1), phrase (3.1.25), subordinate clause (3.1.26.2), sentence (3.1.27), paragraph, section, chapter.

Note 1 to entry: A synonym (3.16.23) is a ‘discourse segment’. A segment references a semantic and/or pragmatic entity, which can be a semantic/pragmatic relation. Intrasentential segments are syntactic constituents such as words, phrases, and clauses. Segments might or might not be continuous: this is discussed in the definition of connectives.

circumstance

entity (3.7.3.4) which is an event (3.7.1.1) (including dialogue act (3.7.2.7)), state, process (3.18.2.11), relation, proposition, or set of these

object

semantic entity (3.7.3.4) other than circumstance (3.7.4.4)

Note 1 to entry: Objects include people, buildings, machines, ideas, and rules.

class

unary predicate, which is a set of entities (3.7.3.4)

relational class

class (3.7.4.6) whose instances are circumstances (3.7.4.4) equivalent to relations

3.1.11 Spatial Information

region

〈semantic annotation framework〉 connected, non-empty point-set defined by a domain and its boundary points

Note 1 to entry: The term "region" as defined does not refer to a political or administrative region such as "the Canary Islands" or "Hong Kong, SAR", where SAR is the acronym of “Special Administrative Region”.

place

geographic or administrative entity that is situated at a location (3.7.5.3)

location

point or finite area that is positioned within a space (3.7.5.3.1) or a series of such points or areas

Note 1 to entry: Places (3.7.5.2), paths (3.7.5.3.2), and event-paths (3.7.5.3.2.1) are subtypes of locations.

space

dimensional extent (3.7.5.16) in which objects (3.7.4.5) and events (3.7.1.1) have a relative position and direction

path

static path

route

〈semantic annotation framework〉 location (3.7.5.3) that consists of a series of locations

Note 1 to entry: A spatial object path is a location where the focus is on the potential for traversal or which functions as a boundary. This includes common nouns like “road”, “coastline”, and “river” and proper names like “Route 66” and “Kangamangus Highway”. Some nouns, such as “valley”, can be ambiguous: “valley” can be understood as a path in “we walked down the valley” or as a place (3.7.5.2) in “we live in the valley”.

Note 2 to entry: A path might be represented as an undirected graph whose nodes (3.5.5.1) are locations and whose edges (3.5.5.2) signify continuity; i.e., unlike an event-path (3.7.5.3.2.1), a path has no inherent directionality.

event-path

dynamic path

trajectory

dynamic route

directed path (3.7.5.3.2) followed by a mover (3.7.5.13) and coincident with a motion (3.7.5.12)

Note 1 to entry: Unlike (static) paths such as roads or circular tracks, event-paths are each triggered by a specific motion, characterized as being finite directed paths each with a start and an end.
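The contrast between a (static) path and an event-path can be sketched informally as follows: a path is held as a set of undirected edges between consecutive locations, whereas an event-path is an ordered sequence with a start and an end, tied to a mover. The names `undirected_path` and `EventPath`, and the location labels, are assumptions made for this sketch only.

```python
from dataclasses import dataclass


def undirected_path(locations):
    """Static path: undirected edges between consecutive locations."""
    return {frozenset(pair) for pair in zip(locations, locations[1:])}


@dataclass(frozen=True)
class EventPath:
    mover: str        # the entity that traverses the path
    locations: tuple  # ordered locations, from start to end

    @property
    def start(self):
        return self.locations[0]

    @property
    def end(self):
        return self.locations[-1]


# Reversing a static path yields the same path ...
same = undirected_path(["a", "b", "c"]) == undirected_path(["c", "b", "a"])

# ... whereas reversing an event-path yields a different one.
outbound = EventPath("John", ("a", "b", "c"))
inbound = EventPath("John", ("c", "b", "a"))

print(same)
print(outbound == inbound)
print(outbound.start, outbound.end)
```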

document creation location

dcl

unique place (3.7.5.2) or set of places associated with a document that represents the location (3.7.5.3) in which the document was created

Note 1 to entry: Some collaboratively written documents, such as GoogleDoc^[1]1) documents and chat logs, might refer not to a single location but to a set of locations spread out across the world. In addition, the place of creation of, for example, the Hebrew Bible, or of each of the books in it, is uncertain. In such cases, the attribute @dcl has the value "false", understood to mean "unspecified", whereas the value "true" is understood to mean "specified".

qualitative spatial relation

topological link

abstract static relation between regions (3.7.5.1) or spaces (3.7.5.3.1), expressing their connectedness or continuity

non-locational spatial entity

DEPRECATED: spatial entity, non-locational

object (3.7.4.5) that is situated at a unique location (3.7.5.3) for some period of time, and typically has the potential to undergo translocation

Note 1 to entry: A non-locational spatial entity, tagged <entity>, as defined, is distinct from genuine spatial entities that consist of three types of locational entities: places (3.7.5.2), paths (3.7.5.3.2), and event-paths (3.7.5.3.2.1). It is an object that participates in a spatial or motional relation. In “John is sitting in a car”, both “John” and “car” could be understood as spatial entities or as being the figure (3.7.5.7) and the ground (3.7.5.8), respectively, of the sitting-in situation.

figure

entity (3.7.3.4) that is considered the focal object (3.7.4.5), which is related to some reference object

ground

landmark

entity (3.7.3.4) that acts as reference for a figure (3.7.5.7)

Note 1 to entry: “landmark” is often used by cognitive semanticists.

orientational relation

orientation relation

directional relation

link that relates one location (3.7.5.3) as a figure (3.7.5.7) to another location as a ground (3.7.5.8) that expresses the spatial disposition or direction of a spatial object within a frame of reference

measure

magnitude of a spatial dimension or relation

measure relation

link that relates a measure (3.7.5.10) to an object (3.7.4.5) that is being measured

Note 1 to entry: The bounds of a measured object are sometimes specified for a measure relation. They can be points or areas like a city, or lines like a river or mountain range.

motion

motion-event

action or process involving the translocation of a spatial object, transformation of some spatial property of an object (3.7.4.5), or change in the conformation of an object

Note 1 to entry: A motion is a particular kind of event (3.7.1.1).

mover

moving object

entity (3.7.3.4) that undergoes a change of its location (3.7.5.3)

Note 1 to entry: A mover can be the agent of a motion (3.7.5.12), such as someone who walks to the station, or an object that is simply caused to move, such as a stone thrown into a well. In the latter case, the thrower is not considered to be the mover in the sense of the term defined.

movement relation

link that relates a mover (3.7.5.13) to an event-path (3.7.5.3.2.1) which the mover traverses

Note 1 to entry: A movement relation is triggered by a motion (3.7.5.12).

spatial relation

segment or series of segments of a text (3.16.9) that refers to qualitative spatial relations (3.7.5.5) or orientational relations (3.7.5.9), or to movement relations (3.7.5.14) indirectly through the specification of the bounds of paths (3.7.5.3.2) or event-paths (3.7.5.3.2.1)

extent

textual segment that is a string of character segments in text (3.16.9) that is being annotated (3.2.5)

EXAMPLE Tokens (3.4.5), words (3.1.9.1), and non-contiguous phrases (3.1.25) (e.g. a complex verb like "look ... up").

tag

element name

name associated with textual segments for annotation (3.2.7) or for a relation between these segments

Note 1 to entry: The following are three kinds of tag for annotation:

extent tag, which is associated with textual segments referring to basic entities (3.7.3.4) or signals;
link tag, for representing spatial relations (3.7.5.15); and
root tag, for the closure of annotations (3.2.7).

non-consuming tag

tag (3.7.5.17) that has no associated extent (3.7.5.16)

EXAMPLE In the sentence “John ate an apple, but Mary a pear”, there are at least two ways of marking up the <event> tag: one with its extent or target filled in with a non-null string of characters, or audio or visual elements, and the other with an empty string:

John ate_e1 an apple, but Mary ∅_e2 a pear;
<event xml:id="e1" target="ate"/>
<event xml:id="e2" target=""/> (non-consuming <event> tag)

Note 1 to entry: The extent of a non-consuming tag is a null string.
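As an informal sketch, non-consuming tags can be detected mechanically by checking whether the target of a tag is empty. The wrapper element `<annotation>` and the function name `non_consuming_ids` are assumptions made for this illustration; only the `<event>` elements and their `xml:id` and `target` attributes come from the EXAMPLE above. A target consisting only of white space is treated as empty here, which is a design choice of the sketch.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# <annotation> is a hypothetical wrapper so the fragment has a single root.
fragment = """<annotation>
  <event xml:id="e1" target="ate"/>
  <event xml:id="e2" target=""/>
</annotation>"""


def non_consuming_ids(xml_text, tag="event"):
    """Return the xml:id of every tag whose target is a null string."""
    root = ET.fromstring(xml_text)
    return [element.get(XML_ID)
            for element in root.iter(tag)
            if not element.get("target", "").strip()]


print(non_consuming_ids(fragment))
```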

3.1.12 Semantic Relations in Discourse, Core Annotation Schema (DR-core)

situation

event (3.7.1.1), fact, proposition, condition, belief or dialogue act (3.7.2.7), that can be realized by a linguistically simple or complex expression

Note 1 to entry: An expression can be, among others, a clause (3.1.26), a nominalization, a sentence (3.1.27)/ utterance (3.7.2.2), or a discourse segment consisting of multiple sentences or utterances.

discourse connective

word (3.1.9.1) or multiword expression (3.1.9.2) expressing a discourse relation (3.7.6.3)

EXAMPLE Single-word discourse connectives include “but”, “since”, “and”, “however”, “because”. Multi-word discourse connectives include “as well as”, “such as”.

Note 1 to entry: Many of the words that can be used as discourse connectives can also be used as intra-clausal conjunctions, as with the use of “and” in “John and Mary are a lovely couple”.

discourse relation

relation between two situations (3.7.6.1) mentioned in a discourse (3.7.4.1)

EXAMPLE 1 “Peter came late to the meeting. He had been in a traffic jam.” The events mentioned in the two sentences are implicitly related through the discourse relation Cause.

EXAMPLE 2 “Peter was in a traffic jam, but he arrived on time for the meeting.” The events mentioned in the two clauses are related by the discourse relation Concession, expressed by the connective “but”.

EXAMPLE 3 “Peter did not manage to come to the meeting; he was held up in a terrible traffic jam.” The causal relation in this example is the same as in Example 1, but the argument expressed by the first clause is not an eventuality, but a proposition, formed by an event description with negative polarity.

Note 1 to entry: Quasi-synonyms for “discourse relation”, with small variations in meaning, are “coherence relation” and “rhetorical relation (3.7.2.7.7)”.

low-level discourse structure

representation of discourse structure (3.7.4.2) that only specifies local dependencies between a discourse relation (3.7.6.3) and its arguments, without further specifying any links or dependencies across these local structures

3.1.13 Reference Annotation Framework (RAF)

communicative segment

elementary portion of a multimodal interaction

referring expression

communicative segment (3.7.7.1) that specifically designates an entity (3.7.3.4) or an event (3.7.1.1), whether concrete or abstract, discourse (3.7.4.1) new or old, real or fictional

referent

discourse entity

extra-linguistic entity (3.7.3.4) which is denoted, or pointed out, by a communicative segment (3.7.7.1)

Note 1 to entry: “discourse entity” is used preferably in the context of the description of the concrete syntax whereas “referent” is used in the abstract syntax, but also when the underlying process is implied by the expression.

reference

〈semantic annotation〉 relation between a referring expression (3.7.7.1.1) and a referent (3.7.7.2) denoted by it

Note 1 to entry: The verb “to refer to” expresses such a relation: if there is a reference relation between an expression x and a discourse entity e, then x is said to refer to e.

anaphor

linguistic mechanism by which the interpretation of a referring expression (3.7.7.1.1) depends on another expression mentioned in the same text (3.16.9) or discourse (3.7.4.1)

Note 1 to entry: The notion of anaphora is more general than that of coreference (3.7.7.5): the interpretation of anaphora is context-dependent, whereas coreference is determined rather rigidly independently of its possible use of context (see ^[26]).

coreference

identity of referents (3.7.7.2) of two referring expressions (3.7.7.1.1)

objectal relation

relation between two referents (3.7.7.2) reflecting their intended association from a referential point of view

Note 1 to entry: The referential association can identify that they are identical, disjoint, or overlapping, or that one includes the other (see ^[27] and ^[26]).

3.1.14 Visual Information

voxicon

lexicon (3.2.1.1) or list of basic visual object concepts of VoxML (visual object concept structure modelling language)

voxeme

basic entry in a voxicon (3.7.8.1)

minimal embedding space

MES

three-dimensional (3D) region (3.7.5.1) within which the state is configured, or the event (3.7.1.1) unfolds

habitat

representation of an object situated within a partial minimal model

qualia

qualia structure

relational forces or aspects of a lexical item (3.2.3) or concept (3.12.1.3)

telic

purpose or function qualia (3.7.8.5) of an object (3.7.4.5)

affordance

affordance structure

set of specific actions, described along with the requisite conditions, that the object (3.7.4.5) can take part in

Gibsonian affordance

set of specific actions that an agent can perform with an object (3.7.4.5) that is presented to the agent

EXAMPLE Hold, grasp, move.

telic affordance

set of goal-oriented or intentionally situated actions of an agent on an object (3.7.4.5) presented to the agent

EXAMPLE An agent eating an apple when it is presented to the agent.

3.1.15 Measurable Quantitative Information (MQI)

quantity

property of a measurable object (3.7.4.5) referring to its magnitude or multitude

base quantity

quantity (3.7.9.1) in a conventionally chosen subset of a given system of quantities, where no quantity in the subset can be expressed in terms of the other quantities within that subset

Note 1 to entry: Kinds of quantities include seven base quantities defined by the International System of Quantities (ISQ).

derived quantity

quantity (3.7.9.1) in a system of quantities, defined in terms of the base quantity (3.7.9.1.1) of that system

EXAMPLE Speed is a derived quantity defined by length (distance) over time (LT^-1), where length (L) and time (T) are base quantities.

[SOURCE: ISO/IEC Guide 99, 1.5, modified — Example replaced.]

quantitative information

measurement associated with the quantity (3.7.9.1) of a measurable object

measurable quantitative information

MQI

quantitative information (3.7.9.2) that can be expressed in unitized numeric terms

quantitative markup language

QML

measurable quantitative information markup language

markup language of measurable quantitative information

specification language for the annotation (3.2.7) of measurable quantitative information (3.7.9.2.1) extractable from text (3.16.9) or other medium types of language (3.16.1)

measurement unit

unit of measurement

unit

scalar basis, defined and adopted by convention, of measuring objects by multiplying their quantitative values expressed in real numbers

Note 1 to entry: The expressions that are used in measurement such as “metre”, “litre”, and “µmol/kg” are units by the definition given above. The multitude expressions such as “bottles”, “boxes”, or “two” as in “two bottles of milk”, “a box of apples”, and “two coffees” sometimes fail to be regarded as units, but they can also be if they are accepted as units by convention or agreement in some communities.

base unit

measurement unit (3.7.9.4) that is adopted by convention for a base quantity (3.7.9.1.1)

Note 1 to entry: There are seven base units chosen by the International System of Units (SI) associated with seven ISQ base quantities to measure quantities, as shown in Table 1.

Table 1 — Base units

SI base unit (unit symbol)	Associated ISQ base quantity (base quantity symbol)
metre (m)	length (L)
kilogram (kg)	mass (M)
second (s)	time (T)
ampere (A)	electric current (I)
kelvin (K)	thermodynamic temperature (Θ)
mole (mol)	amount of substance (N)
candela (cd)	luminous intensity (J)

derived unit

measurement unit (3.7.9.4) for a derived quantity (3.7.9.1.2)

EXAMPLE The unit “newton” (N) is a derived unit for a derived quantity “force” (F), which is defined to be “mass times acceleration” (MLT^-2), where the quantity (3.7.9.1) “acceleration” is a derived quantity defined by “velocity divided by time” (VT^-1) and “velocity” defined by “length (distance) divided by time” (LT^-1).

Note 1 to entry: Table 2 illustrates some of the derived units.

Table 2 — derived units

Derived unit (unit symbol)	Associated derived quantity
kilometre per minute (km/min)	speed = length (L)/time (T)
gram per cubic metre (g/m³)	density = mass (M)/volume (L³)
kilogram metre per second squared (kg·m/s²)	force = mass (M) × length (L)/time squared (T²)
lumen per square metre (lm/m²)	illuminance = luminous intensity (J)/area (L²)
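The way derived quantities are built from base quantities can be checked with simple exponent arithmetic. The following is a minimal sketch, assuming a dimension is stored as a mapping from base quantity symbols (L, M, T) to integer exponents; the function names are hypothetical and not part of this document.

```python
def multiply(a, b):
    """Multiply two dimensions by adding the exponents of their symbols."""
    result = dict(a)
    for symbol, exponent in b.items():
        result[symbol] = result.get(symbol, 0) + exponent
    # Drop symbols whose exponents cancel out.
    return {s: e for s, e in result.items() if e != 0}


def power(a, n):
    """Raise a dimension to an integer power."""
    return {s: e * n for s, e in a.items() if e * n != 0}


def divide(a, b):
    """Divide one dimension by another."""
    return multiply(a, power(b, -1))


# Base quantities: length, mass, time.
L, M, T = {"L": 1}, {"M": 1}, {"T": 1}

speed = divide(L, T)                 # L T^-1
acceleration = divide(speed, T)      # L T^-2
force = multiply(M, acceleration)    # M L T^-2
density = divide(M, power(L, 3))     # M L^-3

print(speed, acceleration, force, density)
```

Dividing a quantity by itself yields an empty mapping, which corresponds to a dimensionless quantity.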

3.1.16 Quantification

event set

aspect of a quantification (3.7.10.8), specifying a set of events (3.7.1.1) in which the members of a certain participant set (3.7.10.1.1) are involved

participant set

set of entities (3.7.3.4) involved in the event set (3.7.10.1) of a quantification (3.7.10.8)

EXAMPLE The parents gave all the teachers a present.

definiteness

language-dependent morphosyntactic feature (3.1.15) of a noun phrase (NP) (3.1.25.2), marked in English and other European languages (3.16.1) by a definite or indefinite article or a nominal suffix, by a demonstrative, or by a possessive expression

Note 1 to entry: The definiteness feature has two possible values: “definite” and “indefinite”. Being definite is often regarded as an indication of determinacy (3.7.10.4), indefinite as an indication of indeterminacy.

Note 2 to entry: In some languages it is only possible to express that an NP is definite (NPs are by default indefinite) or to express that an NP is indefinite (NPs are by default definite).

EXAMPLE al (definite article in Arabic languages), -e (suffix as definite article in Farsi), el/la (definite article in Spanish), a/az (definite article in Hungarian, there is no indefinite article), yī (occasionally indefinite article in Chinese; there is no definite article and the definiteness is definite unless an indefinite article or the context indicates otherwise).

Note 3 to entry: For overviews of definite expressions, see ^[29] and ^[30].

definite description

singular noun phrase (3.1.25.2) with definiteness (3.7.10.2) ‘definite’, interpreted as referring to a (contextually) uniquely determined entity (3.7.3.4)

EXAMPLE Jimmy, the chairperson, my house, this idea.

determinacy

semantic property of referring to some particular and determinate entity or collection of entities (3.7.3.4)

Note 1 to entry: Determinacy can be interpreted as specifying the relation between the reference domain (3.7.10.10) and the source domain (3.7.10.11) of a quantification (3.7.10.8). The reference domain of a determinate quantification is a proper subset of the source domain; for an indeterminate quantification the reference domain coincides with the source domain.

Note 2 to entry: Determinacy and definiteness (3.7.10.2) are not always clearly distinguished in the linguistic literature. For a discussion of this issue, see ^[31].
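The set-theoretic reading of determinacy given in Note 1 can be sketched informally as a comparison of two sets. The function name `classify_determinacy` and the sample domains are assumptions made for this illustration.

```python
def classify_determinacy(reference_domain, source_domain):
    """Classify a quantification from its reference and source domains.

    Following Note 1: a proper subset gives 'determinate'; coinciding
    domains give 'indeterminate'. Anything else is rejected, because the
    reference domain is expected to lie within the source domain.
    """
    reference, source = set(reference_domain), set(source_domain)
    if reference == source:
        return "indeterminate"
    if reference < source:  # proper subset
        return "determinate"
    raise ValueError("reference domain must be contained in source domain")


# Hypothetical source domain: everything denoted by "teachers".
teachers = {"t1", "t2", "t3", "t4"}
# Hypothetical context: only the teachers of one particular school.
teachers_in_context = {"t1", "t2"}

print(classify_determinacy(teachers_in_context, teachers))
print(classify_determinacy(teachers, teachers))
```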

distributivity

distribution

specification of whether the entities (3.7.3.4) of the reference domain (3.7.10.10) of a quantification (3.7.10.8) are individually involved, or as a group (collectively), or as a mixture of the two

Note 1 to entry: Distributivity can be expressed by adverbs, such as “together”, “ensemble” (French) and “samen” (Dutch), or by certain determiners, such as “each” in English, “chaque” in French and “jeder” in German. Some determiners, such as the English “each”, “all” and “both” can also be used as adverbs.

exhaustivity

semantic property of a quantification (3.7.10.8), indicating that no other individuals than the elements of the participant set (3.7.10.1.1) are involved in elements of the event set (3.7.10.1)

genericity

specification of whether the sentence in which a quantification (3.7.10.8) occurs refers to a certain specific event set (3.7.10.1) and participant set (3.7.10.1.1) or expresses a general statement or question

quantification

application of a predicate to a set of entities (3.7.3.4)

Note 1 to entry: A particularly important type of predicate in the context of this document is that of being involved in certain events in a certain semantic role.

individuation

semantic property of the way a nominal expression is used to refer to its denotation as a collection (3.3.2) of individual entities (3.7.3.4), as parts of a homogeneous mass, or as a collection of individual entities and their parts

Note 1 to entry: The distinction between referring to a collection of entities and referring to a part-whole structured domain is expressed in many languages by the distinction between count terms and mass terms (3.7.10.12).

reference domain

contextually determined set of entities (3.7.3.4) to which a quantifying predicate (3.7.3.2) is applied

source domain

explicitly mentioned maximal set of entities (3.7.3.4) to which a quantifying predicate (3.7.3.2) is applicable

Note 1 to entry: For a quantifier expressed by a noun phrase (3.1.25.2), the source domain is the extension of the restrictor (3.1.25.2.2). Adverbial temporal and spatial quantifiers have their source domains (temporal and spatial entities), specified as part of their lexical semantics.

mass term

noun or nominal compound used in such a way that it does not individuate its reference (3.7.7.3)

Note 1 to entry: Typical examples in English are “footwear”, “water”, “cattle”, “music”, “luggage” and “furniture”. By contrast, expressions such as “shoe”, “drop of water”, “cow”, “sonata”, “suitcase” and “chair” are typically used as count terms, i.e. in such a way that it is understood what counts as (for example) one shoe, as two shoes, etc. Some words are commonly used either way, such as “rope” and “stone”. The two possible uses of nouns are also illustrated by: “There’s no chicken in the pen”/“There’s no chicken in the stew.” See also ^[16].

inverse linking

modification of a noun phrase head (3.1.25.2.1) that contains a quantifier with wider scope than the quantification (3.7.10.8) of the noun phrase head

EXAMPLE Two students from every university participated in the meeting.

3.1.17 Spatial Semantics

annotation structure

information structure created by marking up some linguistic expressions with relevant (semantic) information

Note 1 to entry: ISO 24617-7, for instance, creates such annotation structures by marking up place names or motions and their spatial relations with relevant spatial information.

semantic form

logical form

representation of the semantic content (3.7.2.7.8) of an annotation structure (3.7.11.1) of expressions in natural language

Note 1 to entry: The semantic form of an annotation structure a is represented by σ(a), where σ is a function that maps an annotation structure a to a semantic form that carries the semantic content of a.

Note 2 to entry: Semantic forms are often called “logical forms” because semantic forms are represented by a logical language such as first-order logic (3.7.11.4).

interpretation

〈spatial semantics〉 function that maps a semantic form (3.7.11.2) to its denotation

Note 1 to entry: The interpretation function is represented by ⟦ ⟧ and, for each semantic form σ(a), its denotation, or the value of the interpretation, is represented by ⟦σ(a)⟧.

Note 2 to entry: In a model-theoretic semantics, the interpretation function ⟦ ⟧ is constrained by a model M (3.7.11.5) and, for each semantic form σ(a) and model M, such an interpretation is represented by ⟦σ(a)⟧^M.
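The two-step mapping from an annotation structure to a semantic form and then to a denotation can be illustrated with a deliberately small sketch. Everything in it is an assumption made for illustration: the annotation structure is a plain dictionary, the semantic form is an atomic relation written as a tuple, and the model is a mapping from relation names to sets of tuples.

```python
def sigma(annotation):
    """Map a toy annotation structure a to its semantic form sigma(a)."""
    return (annotation["relation"], annotation["figure"], annotation["ground"])


def interpret(semantic_form, model):
    """Denotation of an atomic semantic form relative to a model M.

    The denotation is a truth value: True if the argument tuple belongs
    to the extension that the model assigns to the relation.
    """
    relation, *arguments = semantic_form
    return tuple(arguments) in model.get(relation, set())


# Toy annotation structure for "John is sitting in a car".
a = {"relation": "in", "figure": "john", "ground": "car"}

# Toy model M: the extension of each relation is a set of tuples.
M = {"in": {("john", "car")}}

print(sigma(a))
print(interpret(sigma(a), M))
```

With a different model, the same semantic form can receive a different denotation, which is the sense in which the interpretation is constrained by the model.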

first-order logic

formal language (3.16.1), artificially built for reasoning, with the values of its terms, particularly variables, ranging over individual objects (3.7.4.5) only

Note 1 to entry: Second-order variables such as P, which ranges over properties of an individual, are temporarily introduced to allow the λ-operation in the process of deriving semantic forms (3.7.11.2).

model M

set-theoretical construct that represents part of the real or possible world denoted by a semantic form (3.7.11.2)

eigenplace

eigenspace

region (3.7.5.1) or path (3.7.5.3.2) occupied by an object (3.7.4.5)

Note 1 to entry: A region may be considered as a particular finite path matching to an interval [x,x] such that its start and endpoint match or are identical. In that case, a region is considered as a point.

3.1.18 Measurable Quantitative Information Extraction (MQIE)

information extraction

process of identifying specific structured information from natural language (3.16.1.1), semi-structured text (3.16.9) and/or other electronic text sources

measurable quantitative information extraction

MQIE

process of identifying measurable quantitative information (3.7.9.2.1) from natural language, semi-structured text (3.16.9) and/or other electronic text sources

normalization

process that represents objective information with a formal and/or regular format or converts the information into a consistent value range

Note 1 to entry: The normalization objectives may contain information of entities (3.7.3.4), measure units and quantities (3.7.9.1).
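
A minimal sketch of normalization in this sense, assuming a small set of length units and the metre as the target unit. The conversion table and the function name `normalize_length` are hypothetical and serve only to illustrate conversion into a consistent value range.

```python
import re

# Hypothetical conversion table to a common unit (metre); not from the source.
TO_METRES = {"mm": 0.001, "cm": 0.01, "m": 1.0, "km": 1000.0}


def normalize_length(expression):
    """Parse a string such as '2.5 km' and return (value_in_metres, 'm')."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*(mm|cm|m|km)\s*", expression)
    if match is None:
        raise ValueError(f"not a recognized length expression: {expression!r}")
    value, unit = float(match.group(1)), match.group(2)
    return value * TO_METRES[unit], "m"
```

For instance, `normalize_length("2.5 km")` and `normalize_length("2500 m")` yield the same normalized quantity, so the two surface expressions become directly comparable.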

3.7.13 Metamodel

metamodel

schematic representation of the concepts (3.12.1.3) that are used in the analysis and description of the phenomena covered in annotation (3.2.7) and of the relationships between them

3.8 Comprehensive Annotation Framework (ComAF)

segment

〈comprehensive annotation framework〉 referenceable part of a Diagrammatic Semantic Authoring (DSA)-based document, which is either a graph segment or a data segment (text (3.16.9), image, audio, video, etc.)

hypernode

node (3.5.5.1) which is a graph segment

semantic authoring

composition of documents while making their logical structures explicit

3.9 Lexical Markup Framework (LMF)

natural language processing

NLP

computer science field covering knowledge and techniques involved in the processing and analysis of linguistic data by a computer

data category

class of data items that are closely related from a formal or semantic point of view

EXAMPLE /part of speech/, /subject field/, /definition/.

Note 1 to entry: A data category can be viewed as a generalization of the notion of a field in a database.

Note 2 to entry: In running text (3.16.9), such as this document, data category names are enclosed in forward slashes (e.g. /part of speech/).

[SOURCE: ISO 30042, 3.8, modified — admitted term “DC” added.]

grammatical feature

property associated with a word form (3.1.13) to describe one of its grammatical attributes

EXAMPLE grammaticalGender.

orthography

systematic way of spelling or writing a lexeme (3.1.9) that conforms to a conventionalized use

Note 1 to entry: Usually, the notion of orthography covers standardized spellings of alphabetic languages, such as standard UK or US English, or reformed German spelling, as well as hieroglyphic or syllabic writing systems.

onomasiology

approach to the investigation of word meaning which takes a given concept (3.12.1.3) as a starting point and studies the different lexical items (3.2.3) in a language (3.16.1) or languages that are used to refer to it

etymology

origin and historical development of any aspect of a given lexical item (3.2.3)

etymologizable

meeting the conditions for having an etymology (3.9.6)

Note 1 to entry: "Etymologizable" is a category of lexical elements and usages (encompassing for instance lexical entries (3.2.2), word senses (3.1.12), word forms (3.1.13)).

etymon

lexical entry (3.2.2) from which another lexical entry is derived

Note 1 to entry: An etymon can also be simply an earlier stage of a lexical item (3.2.3).

cognate

form in a related language (3.16.1) which shares a common etymological origin with a form in the language of the lexicon (3.2.1.1)

syntactic behaviour

one of the possible alternations that a lexeme (3.1.9) can show, at the syntactic level

EXAMPLE A verb can have different types of syntactic behaviours for subcategorization frame (3.6.7) alternations, such as the active voice, the passive voice, reflexive, etc.

Note 1 to entry: A syntactic behaviour is described in terms of subcategorization frames (^[34], ^[35]).

semantic argument

formal semantic unit that is an essential constituent of a predicate-argument structure and can have variable instantiations depending on the utterance (3.7.2.2)

semantic predicate

formal semantic unit that represents a semantic relation between one or more semantic arguments (3.9.11) in a predicate-argument structure

3.10 Multilingual Information Framework

adornment

data category (3.9.2) attached to a component of a metamodel (3.7.13.1)

inline code

inline instructions inserted in a source document

Note 1 to entry: Inline code can, for instance, provide presentational instructions (e.g. HTML codes).

subtitle

textual version of the dialogue in films, television programmes, video games, etc.

Note 1 to entry: Subtitles are usually displayed at the bottom of the screen.

working language

language (3.16.1) in which linguistic sequences are expressed

3.11 Persistent Identification and Sustainable Access (PISA)

3.11.1 Resources

resource

〈persistent identification〉 digital object on the web with a specific identity that can be addressed with a Uniform Resource Identifier (3.11.2.1.2)

Note 1 to entry: A resource can have several representations. Depending on the PID framework (3.11.2.2), identification of a specific representation can be encoded in the identifier (3.11.2.1) or be left to the content negotiation process (^[36]) between the web client (3.11.3.5) that uses the resolved PID (3.11.2.1.1) to fetch the resource and the resource server (3.11.3.2).

[SOURCE: Adapted from IETF RFC 3986.]

language resource

resource (3.11.1.1) that provides information about one or more languages (3.16.1)

Note 1 to entry: Language resources cover lexicographical, terminological, morpho-syntactical, corpus-related, or semantic resources or digital resources used to study linguistic phenomena like texts (3.16.9) and multimedia/multimodal recordings. They are created and used by linguists, information specialists, lexicographers and terminologists, among others. They frequently comprise many small records (3.12.2.2) compiled within a larger work, and are often authoritative in nature, such as standardized terminologies and glossaries issued by standards bodies such as ISO, IETF, W3C, etc.

complex resource

resource (3.11.1.1) consisting of multiple constituent parts, each of which can be accessed individually

Note 1 to entry: A complex resource can be a federated resource if its constituent parts are distributed over different digital repositories (3.11.1.3).

abstract resource

non-network-retrievable resource (3.11.1.1) identified by a Uniform Resource Identifier (3.11.2.1.2)

Note 1 to entry: It is common practice, for example in RDFS (RDF Schema) or OWL (Web Ontology Language) ontologies, to identify abstract resources using Uniform Resource Identifiers (URIs) (3.11.2.1.2). Web architecture does not require any information resource to be retrievable with this kind of URI. If an identifier (3.11.2.1) for an abstract resource is not meant to be dereferenced (3.11.4.3), such as can be the case with an XML namespace (3.12.4.5) URI, it is not meaningful to issue a PID (3.11.2.1.1) for this resource.

Note 2 to entry: Abstract resources are usually concepts such as a class or property.

version

particular form or variation of a resource (3.11.1.1) that differs from other instantiations of the resource in at least one aspect or item of information

Note 1 to entry: Versions are often identified in sequential order (e.g. Version 1, 2, etc.), but version identification of dynamic resources subject to frequent change is often achieved by assigning a date-time stamp.

digital repository

repository

facility that provides reliable access to managed digital resources (3.11.1.1)

digital archive

3.11.2 Identifiers

identifier

digital identifier

compact sequence of characters associated with digital, non-digital, or abstract entities

Note 1 to entry: Identifiers can apply to entities such as books, images, reports, metadata records (3.12.2.2.1), and events.

PID

persistent identifier

unique identifier (3.11.2.1) that ensures permanent access for a digital object by providing access to it independently of its physical location or current ownership

Note 1 to entry: Unique in this context means that the PID will not be issued again for other resources (3.11.1.1). However, the same PID can reference different representations or resource collection incarnations (3.11.1.5) at the discretion of the resource provider (3.11.3.1).

Uniform Resource Identifier

URI

sequence of characters that identifies a resource (3.11.1.1)

Note 1 to entry: IETF RFC 3986 defines the generic URI syntax and a process for resolving (3.11.4.1) URI references (3.11.1.10) that might be in relative form, along with guidelines (3.18.1.8) and security considerations for the use of URIs on the Internet.

actionable identifier

Uniform Resource Identifier (URI) (3.11.2.1.2) that has a resource-associated identifier (3.11.2.1) that is suitably encoded, such that when the URI is embedded in a web document and “clicked” on, the browser will be redirected to the resource (3.11.1.1), and possibly supplementary services related to the resource

Note 1 to entry: This functionality implies that the URI points to a suitable resolver proxy (3.11.3.8).

Note 2 to entry: In some PID frameworks (3.11.2.2), the PIDs (3.11.2.1.1) are URIs and are automatically actionable.

fragment identifier

identifier (3.11.2.1) used to reference a resource part (3.11.1.6) in a web context

Note 1 to entry: A fragment identifier component as defined in IETF RFC 3986 is indicated by the presence of a number sign (“#”) character and terminated by the end of the URI (3.11.2.1.2). Fragments (3.11.1.7) in the sense of this RFC are resolved (3.11.4.1) and retrieved from the resource (3.11.1.1) by the local client application (3.11.3.4).

Note 2 to entry: There is a W3C draft proposal to change this handling of fragments (^[38]).

[SOURCE: Adapted from IETF RFC 3986.]
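
The separation of a fragment identifier from the rest of a URI can be illustrated with Python's standard library. The URI below is hypothetical (example.org is reserved for documentation); `urldefrag` is the standard `urllib.parse` function.

```python
from urllib.parse import urldefrag

# Hypothetical URI with a fragment identifier after the number sign.
uri = "https://example.org/corpus/doc42.xml#s3"

# urldefrag returns the URI without the fragment, and the fragment itself.
base, fragment = urldefrag(uri)
```

Here `base` is what a client would request from the resource server, while `fragment` stays with the client application, which applies it to the retrieved representation.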

resource part identifier

string of characters that refers to a resource part (3.11.1.6), and which can be identified by some means within a given resource type

EXAMPLE Such means include a time for a media file, an area for an image, or a record (3.12.2.2) in a data stream.

PID framework

scheme for specifying identifier strings [PID (3.11.2.1.1) scheme] for web-accessible digital objects together with a mechanism that enables the resolution of these identifiers (3.11.2.1) into the object's current Uniform Resource Identifiers (3.11.2.1.2)

Note 1 to entry: A PID framework in the sense of this International Standard facilitates access to both individual objects and to resource parts (3.11.1.6) and fragments (3.11.1.7) contained in such objects. A PID framework can be solely dependent on existing web resolution protocols or it can entail the interaction of proxy-based resolvers (3.11.3.7).

Note 2 to entry: A PID framework in the sense of this International Standard also allows resolution of other information associated with the PID.

URI naming scheme

top level of the Uniform Resource Identifier (URI) (3.11.2.1.2) naming structure

Note 1 to entry: Every scheme specifies its own syntax conventions for URIs.

Note 2 to entry: Typical URI schemes include http, https, ftp, mailto, etc. and are registered with IANA.

3.11.3 Roles, Institutions and Services

resource provider

organization that makes a resource (3.11.1.1) available online

Note 1 to entry: A resource can also be a service.

resource server

computer that ultimately provides access to the object referenced by a specific client application (3.11.3.4) request

archiving institution

institution responsible for maintaining a digital archive (3.11.1.3.1)

client application

software application that accesses a remote service usually on another computer system

web client

client application (3.11.3.4) capable of accessing resources (3.11.1.1) on the web using the HTTP protocol

resolution system

system designed to support the submission of a PID (3.11.2.1.1) to a network service in order to receive in return one or more pieces of current information related to the identified object

Note 1 to entry: The information can include, among others, a location (URI (3.11.2.1.2)) of the object or metadata (3.12.2.1).

PID resolver

resolver

software application that translates a PID (3.11.2.1.1) into another more suitable identifier, that is, a software application that translates a resource PID into its Uniform Resource Identifier (3.11.2.1.2) and in this way points a client application (3.11.3.4) to the location of the resource (3.11.1.1)

HTTP resolver proxy

resolver proxy

application that implements a service supporting the use of urlified (3.11.4.2) PIDs (3.11.2.1.1) to access resources (3.11.1.1) or other PID-related information, or both

3.11.4 Actions

resolve

translate an identifier (3.11.2.1) into another name or address suitable for accessing a resource (3.11.1.1)

Note 1 to entry: The resolution process may require multiple steps in order to obtain a suitable address for a resource.
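
A multi-step resolution can be sketched as a chain of table lookups. The identifiers, the in-memory table and the function name `resolve` are hypothetical; an actual PID framework would consult network services instead.

```python
# Hypothetical in-memory resolution table standing in for a resolution system.
RESOLUTION_TABLE = {
    "pid:1234/abc": "pid:1234/abc@v2",                         # alias to a version
    "pid:1234/abc@v2": "https://example.org/data/abc-v2.xml",  # final address
}


def resolve(identifier, table=RESOLUTION_TABLE, max_steps=10):
    """Follow table entries until an identifier with no further mapping is reached."""
    current = identifier
    for _ in range(max_steps):
        if current not in table:
            return current
        current = table[current]
    raise RuntimeError("resolution did not terminate; possible cycle")
```

In this sketch, resolving `"pid:1234/abc"` takes two steps before a directly usable address is obtained; the `max_steps` guard is a design choice that prevents endless loops on cyclic entries.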

urlify an identifier

encode an identifier (3.11.2.1) as a suitable Uniform Resource Identifier (3.11.2.1.2)

Note 1 to entry: For example, this might be done with the purpose of creating an actionable identifier (3.11.2.1.2.1).
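
A minimal sketch of urlifying an identifier, assuming a handle-style PID and a resolver proxy reachable over HTTPS. The default proxy shown is the public Handle System proxy, used here only as an example; the PID itself and the function name `urlify` are hypothetical.

```python
from urllib.parse import quote


def urlify(pid, proxy="https://hdl.handle.net/"):
    """Encode a PID as an HTTP URI: percent-encode it and prefix a resolver proxy."""
    # "/" is kept unescaped because it separates prefix and suffix in the PID.
    return proxy + quote(pid, safe="/")


# Hypothetical PID containing a space, which must be percent-encoded.
actionable = urlify("21.T12345/example item")
```

The resulting string can be embedded in a web document as a link, which is what makes the identifier actionable.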

dereference

access the value referred to by a reference (3.11.1.10)

Note 1 to entry: When used within the context of dereferencing a URI (3.11.2.1.2), it means obtaining a representation of the resource (3.11.1.1) to which the URI points.

3.12 Infrastructure for Component Metadata

3.12.1 General Terms

registry

central directory designed for the persistent provision of negotiated information that can be reliably accessed

Note 1 to entry: A registry can be a software service that allows information to be registered and queried.

metadata component registry

component registry

registry (3.12.1.1) of metadata components (3.12.2.6) and metadata profiles (3.12.2.7) for their sharing

semantic registry

directory of (authoritative) definitions of terms (3.12.1.4), concepts (3.12.1.3) or data categories (3.9.2)

Note 1 to entry: These registries generally also provide persistent identifiers (3.11.2.1.1) for their entries.

concept registry

semantic registry (3.12.1.2) maintaining concepts (3.12.1.3)

EXAMPLE The CLARIN Concept Registry (^[39]) as used in the CLARIN infrastructure.

concept

unit of knowledge created by a unique combination of characteristics

[SOURCE: ISO 1087, 3.2.7, modified — Note 1 to entry and Note 2 to entry have been deleted.]

term

designation that represents a general concept (3.12.1.3) in a specific domain or subject

EXAMPLE “planet”, “tower”, “pen”, “numeral”, “number”, “square root”, “logarithm”, “unit of measurement”, “base of a logarithm”, “chemical element”, “chemical compound”, “HP Laserjet 1100”, “Nobel Prize in Physics”.

Note 1 to entry: Terms may be partly or wholly verbal.

Note 2 to entry: Terms can include letters and letter symbols, numerals, mathematical symbols, typographical signs and syntactic signs (e.g. punctuation marks, such as hyphens, parentheses, square brackets and other connectors or delimiters), sometimes in character styles (i.e. fonts and bold, italic, bold italic, or other style conventions) governed by domain-, subject-, or language-specific conventions.

[SOURCE: ISO 1087, 3.2.7]

language tag

textual code used to assist in identifying language (3.16.1) in every mode of communication

Note 1 to entry: This includes constructed and artificial languages (3.16.1.5) but excludes languages not intended primarily for human communication, such as programming languages (see IETF BCP 47).

Note 2 to entry: Language tags may be used to assist in the identification of a language in every mode of communication, for example in spoken, written, signed, or otherwise signaled, communication.
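
The internal structure of a language tag can be illustrated with a deliberately simplified parser. The pattern below covers only the sequence language, optional script and optional region; IETF BCP 47 additionally allows extended language, variant, extension and private-use subtags, which this sketch does not handle. The function name `parse_language_tag` is hypothetical.

```python
import re

# Simplified pattern: language[-script][-region] only (an assumption of this sketch).
TAG = re.compile(
    r"(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<region>[A-Za-z]{2}|[0-9]{3}))?"
)


def parse_language_tag(tag):
    """Split a simple language tag into its subtags; absent subtags are None."""
    match = TAG.fullmatch(tag)
    if match is None:
        raise ValueError(f"not covered by this simplified pattern: {tag!r}")
    return match.groupdict()
```

For example, `parse_language_tag("zh-Hant-TW")` separates the language, script and region subtags, while `parse_language_tag("en")` yields a language subtag only.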

concept reference

DEPRECATED:

reference (3.11.1.10) to the definition of a concept (3.12.1.3) in a concept registry (3.12.1.2.1)

concept link

reference (3.11.1.10) from a CMD profile (3.12.3.13), CMD component (3.12.3.3), CMD element (3.12.3.4), CMD attribute (3.12.3.5) or a value in a controlled vocabulary (3.12.2.14) to an entry in a semantic registry (3.12.1.2) via a Uniform Resource Identifier (3.11.2.1.2)

Note 1 to entry: Typically a concept link is provided as a persistent identifier (3.11.2.1.1).

media type

DEPRECATED: MIME type

specification used originally for textual, non-textual, multi-part message bodies of emails and which provides technical format information on data

EXAMPLE image/jpeg, image/svg+xml, text/plain, text/html, text/turtle, video/H264, application/xhtml+xml.

Note 1 to entry: There is a description in IETF RFC 6838.

Note 2 to entry: “MIME type” is the older term for “media type”. It is not used in standardization or technical specifications anymore.

Note 3 to entry: Registry of Internet media types is available at: https://www.iana.org/assignments/media-types.
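
The media types in the EXAMPLE follow a type/subtype pattern, optionally with a structured syntax suffix after a plus sign. A small sketch of splitting such a string is given below; the function name `split_media_type` is hypothetical, and parameters such as `charset` are not handled.

```python
def split_media_type(media_type):
    """Split 'type/subtype+suffix' into (type, subtype, suffix); suffix may be ''."""
    top, slash, rest = media_type.partition("/")
    if not (top and slash and rest):
        raise ValueError(f"not a type/subtype string: {media_type!r}")
    # The structured syntax suffix, if any, follows the last "+".
    subtype, plus, suffix = rest.rpartition("+")
    if not plus:
        subtype, suffix = rest, ""
    return top, subtype, suffix
```

Applied to `"image/svg+xml"`, the function separates the top-level type `image`, the subtype `svg` and the suffix `xml`; for `"text/plain"` the suffix is empty.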

resource collection

collection

〈CMDI〉 grouping of multiple, different constituting elements, each of which is independent of the others and may be accessed individually

Note 1 to entry: A collection can be a virtual collection if its constituent elements come from other different (virtual) collections, and possibly if the elements are distributed over different digital repositories (3.11.1.3).

3.12.2 Metadata

metadata

resource (3.11.1.1) that is a description of another resource, usually given as a set of properties in the form of attribute-value pairs

Note 1 to entry: This description can contain information about the resource, aspects or parts of the resource and/or artefacts and actors connected to the resource.

record

structured information that can be read by software services

metadata record

metadata description

DEPRECATED: metadata

record (3.12.2.2) containing a description of a resource (3.11.1.1)

metadata schema

DEPRECATED: schema

specification of a format and structure for a metadata record (3.12.2.2.1)

Note 1 to entry: In the context of ISO 24622-1, a metadata schema is a machine-readable and verifiable format specification, usually defined in an XML Schema (3.12.4.6) language.

metadata element

resource property name that can be used in metadata (3.12.2.1) and that can be given a value

Note 1 to entry: A metadata element is referred to as metadata attribute in other communities.

EXAMPLE The DCMI elements (^[42]).

metadata element set

metadata set

resource collection of metadata elements (3.12.2.4) used within a particular discipline, tradition, or practice to describe resources (3.11.1.1)

Note 1 to entry: A metadata set is more general than a metadata schema (3.12.2.3) in that it does not additionally specify the syntax (e.g. the DCMI elements (^[42])).

metadata component

grouping of metadata elements (3.12.2.4) and metadata components (3.12.2.6) that can be used to describe a specific aspect of a resource (3.11.1.1)

EXAMPLE The biographical data of a person or the contact information for an organization.

metadata profile

set of metadata components (3.12.2.6) that can be used together to describe a resource (3.11.1.1) and be transformed into a metadata schema (3.12.2.3)

Note 1 to entry: A metadata profile can be transformed into different metadata schemas that are still logically equivalent (i.e. they give logically equivalent resource descriptions).

metadata element value scheme

value scheme

specification of the value domain of a metadata element (3.12.2.4)

cardinality

specification of the number of occurrences of a metadata component (3.12.2.6) or metadata element (3.12.2.4) in an instantiation
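
A cardinality constraint can be sketched as a check of an occurrence count against a minimum and a maximum. The function name and the convention that `None` stands for an unbounded maximum are assumptions of this sketch, not part of the source.

```python
def satisfies_cardinality(count, minimum=1, maximum=1):
    """Return True if `count` occurrences lie within [minimum, maximum].

    maximum=None is used here to mean 'unbounded'.
    """
    if count < minimum:
        return False
    return maximum is None or count <= maximum
```

An optional, repeatable metadata element would correspond to `minimum=0, maximum=None`, whereas a mandatory, non-repeatable one corresponds to the defaults.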

metadata editor

actor that creates metadata records (3.12.2.2.1) to describe specific resources (3.11.1.1)

metadata modeler

actor that creates new metadata schemas (3.12.2.3) for new types of resources (3.11.1.1) or new applications

Note 1 to entry: In ISO 24622-1, metadata schemas are created by producing metadata profiles (3.12.2.7), which in turn form specifications for a metadata schema.

metadata provider

〈organization〉 organization that makes metadata (3.12.2.1) available

metadata provider

〈software service〉 software service that makes metadata (3.12.2.1) available

controlled vocabulary

DEPRECATED: closed vocabulary

DEPRECATED: open vocabulary

〈CMDI〉 set of values that can be used either to constrain the set of permissible values or to provide suggestions for applicable values in a given context

open vocabulary

set of items forming part of the value domain of a metadata element (3.12.2.4) on the recommendation of the metadata modeler (3.12.2.11)

closed vocabulary

limited set of items that forms the mandatory value domain of a metadata element (3.12.2.4)
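
The difference between an open and a closed vocabulary can be sketched as follows. The function name `check_value` and the sample values are hypothetical.

```python
def check_value(value, vocabulary, closed):
    """Return (permitted, listed) for `value` against `vocabulary`.

    closed=True:  only listed items are permitted.
    closed=False: listed items are recommendations; other values are permitted.
    """
    listed = value in vocabulary
    permitted = listed or not closed
    return permitted, listed


# Hypothetical vocabulary for a metadata element describing a resource type.
resource_types = {"audio", "video", "text"}
```

With a closed vocabulary, `check_value("hologram", resource_types, closed=True)` reports the value as not permitted; with an open vocabulary the same value is permitted but flagged as not listed.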

Unified Modeling Language

UML

language (3.16.1) for specifying, visualizing, constructing, and documenting the artifacts of software systems and abstract models in general

3.12.3 Component Metadata Infrastructure (CMDI)

CMDI

component metadata infrastructure

metadata description framework consisting of the CMD model (3.12.3.2) and infrastructure to process instances of parts of the model

CMD model

component metadata model

metadata model that is based on CMD components (3.12.3.3)

Note 1 to entry: For a specification see ISO 24622-1.

CMD component

component

reusable, structured template for the description of (an aspect of) a resource (3.11.1.1), defined by means of a CMD specification (3.12.3.6) document with the potential of including other CMD components, either through reference or inline definition

CMD root component

CMD component (3.12.3.3) that is defined at the highest level within a CMD profile (3.12.3.13) that may have one or more child CMD components but no siblings

Note 1 to entry: In the CMD instance payload (3.12.3.12), it is instantiated exactly once.

inline CMD component

CMD component (3.12.3.3) that is created and stored within another CMD component and cannot be addressed from other CMD components

CMD element

element definition

unit within a CMD component (3.12.3.3) that describes the level of the CMD instance (3.12.3.9) that can carry atomic values (3.3.1.1) governed by a value scheme (3.12.3.17), and does not contain further levels except for that of the CMD attribute (3.12.3.5)

CMD attribute

unit within a CMD element (3.12.3.4) that describes the level at which properties of a CMD element (3.12.3.4) can be provided by means of value-scheme (3.12.3.17)-constrained atomic values (3.3.1.1)

CMD specification

component specification

component definition

representation of a CMD component (3.12.3.3) or CMD profile (3.12.3.13), expressed using the constructs of the CCSL (3.12.3.18)

CMD specification header

component header

profile header

section of a CMD specification (3.12.3.6) marked as ‘header’, providing information on that CMD specification as such that is not part of the defined structure

CMD component registry

component registry

service where a CMD specification (3.12.3.6) can be registered and accessed

CMD instance

CMDI file

metadata instance

CMDI instance

metadata record

CMD record

file that conforms to the general CMD instance structure and, at the CMD instance payload (3.12.3.12) level, follows the specific structure defined by the CMD profile (3.12.3.13) to which it relates

Note 1 to entry: The general CMD instance structure is described in ISO 24622-2.

CMD instance envelope

section of a CMD instance (3.12.3.9) which is structured uniformly for all instances and contains the CMD instance header (3.12.3.11) and the list of resource proxies (3.12.3.15) which may be referenced from the CMD instance payload (3.12.3.12) section

CMD instance header

section of a CMD instance (3.12.3.9) marked as ‘header’, providing information on that CMD instance as such, not the resource (3.11.1.1) that is described by the metadata file

CMD instance payload

section of a CMD instance (3.12.3.9) that follows the structure defined by the CMD profile (3.12.3.13) it references and contains the description of the resource (3.11.1.1) to which that CMD instance relates

CMD profile

profile

structured template for the description of a class of resources (3.11.1.1) providing the complete structure for a CMD instance payload (3.12.3.12) by means of a hierarchy of CMD components (3.12.3.3)

CMD profile schema

schema definition by which the correctness of a CMD instance (3.12.3.9) with respect to the CMD profile (3.12.3.13) it pertains to can be evaluated

Note 1 to entry: The CMD profile schema can be expressed as an XML Schema (3.12.4.6) but also in other XML schema languages.

resource proxy

CMD resource proxy

DEPRECATED: CMD resource reference

representation of a resource (3.11.1.1) within a CMD instance (3.12.3.9) containing a Uniform Resource Identifier (3.11.2.1.2) as a reference (3.11.1.10) to the resource itself and an indication of its nature

resource proxy reference

reference (3.11.1.10) from any point within the CMD instance payload (3.12.3.12) to any of the resource proxy (3.12.3.15) elements

value scheme

〈CCSL〉 set of constraints governing the range of values allowed for a specific CMD element (3.12.3.4) or CMD attribute (3.12.3.5) in a CMD instance (3.12.3.9), expressed in terms of an XML Schema datatype (3.12.4.9), controlled vocabulary (3.12.2.14), or regular expression (3.12.4.10)

CCSL

CMDI component specification language

XML (3.12.4.1)-based language for describing a CMD component (3.12.3.3) and a CMD profile (3.12.3.13) in accordance with the CMD model (3.12.3.2)

3.12.4 Extensible Markup Language (XML)

XML

markup language for describing hierarchical structures within a text (3.16.9) file

XML document

document represented in XML (3.12.4.1)

XML element

constituent of an XML document (3.12.4.2)

XML container element

XML element (3.12.4.3) that has one or more XML elements as its descendants

XML attribute

property of an XML element (3.12.4.3)

foreign attribute

〈CMDI〉 XML attribute (3.12.4.4) defined in an XML namespace (3.12.4.5) other than those declared in CMDI (3.12.3.1), to be included in a CMD instance (3.12.3.9) as additional information targeted to specific receivers or applications

XML namespace

namespace

method for qualifying element and attribute names used in XML (3.12.4.1)
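
The notions of XML element, container element, attribute and namespace can be illustrated with Python's standard `xml.etree.ElementTree` module. The sample document and the namespace URI are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML document: one container element with an attribute and
# one child element whose name is qualified by a namespace.
doc = """<record xmlns:ex="http://example.org/ns" id="r1">
  <ex:title>Sample</ex:title>
</record>"""

root = ET.fromstring(doc)   # the XML container element <record>
child = root[0]             # its descendant <ex:title>

attribute_value = root.attrib["id"]
qualified_name = child.tag  # ElementTree expands the prefix to {namespace-URI}name
```

ElementTree reports the child's name in the form `{http://example.org/ns}title`, which shows how the namespace qualifies the element name independently of the prefix chosen in the document.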

XML Schema document

XML Schema

document that complies with the XML Schema recommendation, as defined in ^[44]

XML attribute declaration

constituent of an XML Schema document (3.12.4.6) that constrains the structure and content of a specific XML attribute (3.12.4.4)

XML element declaration

constituent of an XML Schema document (3.12.4.6) that constrains the structure and content of a specific XML element (3.12.4.3)

XML schema datatype

predefined set of permissible content within an XML element (3.12.4.3) or an XML attribute (3.12.4.4) of an XML document (3.12.4.2) used in an XML Schema

regular expression

sequence of characters that denotes a set of strings

Note 1 to entry: When used to constrain a lexical space, a regular expression asserts that only strings in the defined set of strings are valid literals for values of that type.

Note 2 to entry: See also ^[45], Appendix F.
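
A sketch of a regular expression used to constrain a set of valid literals is given below. It uses Python's `re` module, whose syntax differs in detail from the XML Schema regular expression syntax; `re.fullmatch` is used because the whole literal has to belong to the denoted set. The pattern and the function name are hypothetical.

```python
import re

# Hypothetical constraint: two or three lowercase letters, optionally followed
# by a hyphen and two uppercase letters.
PATTERN = re.compile(r"[a-z]{2,3}(-[A-Z]{2})?")


def is_valid_literal(literal):
    """Return True if the entire literal is in the set denoted by PATTERN."""
    return PATTERN.fullmatch(literal) is not None
```

Under this constraint `"nl-BE"` is a valid literal, whereas `"english"` is not, because the string as a whole falls outside the denoted set.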

3.13 Corpus Query Lingua Franca (CQLF)

CQL

corpus query language

formal language (3.16.1.3) designed to retrieve specific information from (large) language data collections, incorporating certain abstractions over commonly shared data models that make it possible for the end user (3.13.9) (or user agents) to address parts of those data models

Note 1 to entry: A CQL defines a syntactic notation for query expression (3.13.17) and the corresponding search semantics, i.e. an intensional specification of the intended result set. For most current CQLs, semantics are implicitly defined by a particular implementation.

CQLF implementation

query language that has been analysed with respect to the criteria described by the CQLF Metamodel, and thus has been “located” in the proposed feature matrix as “conformant with CQLF”

CQLF class

top-level division in the CQLF data model

Note 1 to entry: The CQLF Metamodel distinguishes two classes: Single-stream (where the annotation structure (3.7.11.1) is built upon a single data stream, typically a character stream) and Multi-stream (corresponding to e.g. multi-modal corpora or parallel corpora).

CQLF level

part of the matrix of QL properties, defined in terms of the general features of the assumed corpus data models, and consequently the set of properties of a corpus query language (3.13.1) that is used to address these features

Note 1 to entry: The CQLF Metamodel distinguishes three levels of complexity within the Single-stream class: Linear, Complex and Concurrent.

CQLF module

subcomponent of the CQLF metamodel, defined with reference to a specified data-model characteristic

Note 1 to entry: The CQLF metamodel currently distinguishes three modules within CQLF Level 1, Linear (plain-text, segmentation and simple annotation (3.2.7.6)), and three modules within CQLF Level 2, Complex (hierarchical, dependency and containment).

Note 2 to entry: In ISO 24623-2, the containment module is formalized by the concept SpanContainment in order to avoid terminological ambiguity.

concurrent annotations

multiple, potentially conflicting annotation (3.2.7) describing, entirely or partly, the same character span (3.13.7) or an overlapping sequence of character spans

Note 1 to entry: Concurrent annotations may be expected to conflict in several ways: content-wise (with different tags for the same character span), structure-wise (assuming different structural arrangements within the targeted character spans), and also in terms of segment edges (which is typically due to structurally conflicting claims concerning the encompassing character spans). Concurrent annotations typically come from different sources (e.g. tools or human annotators) or result from different settings (e.g. different parsing models or segmentation rules) within a single tool. When encoded in XML (3.12.4.1), concurrent annotations are typically expressed by means of stand-off techniques.

character span

sequence of characters, identified by start and end offsets, to which an annotation (3.2.7) may be applied

Note 1 to entry: Cf. region (3.5.3).

character span containment

relation between character spans (3.13.7) of primary data (3.2.4) in which character span A contains character span B if the initial offset of span A is lower than or equal to that of span B, and the final offset of span A is greater than or equal to that of span B

Note 1 to entry: The relation of character span containment is used for stating a relationship between two or more character spans or simple annotation (3.2.7.6), without the need to utilize tree-based concepts and mechanisms. Instead of tree traversal, operators such as contains, in or within are typically used for character span containment queries.
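
Character span containment can be sketched as a comparison of offsets, on the reading that the containing span starts at or before, and ends at or after, the contained span. The `Span` type, the half-open offset convention and the function name `contains` are assumptions of this sketch.

```python
from typing import NamedTuple


class Span(NamedTuple):
    start: int  # offset of the first character
    end: int    # offset just past the last character (half-open, by assumption)


def contains(a, b):
    """Return True if span `a` contains span `b`."""
    return a.start <= b.start and b.end <= a.end


sentence = Span(0, 40)
token = Span(12, 17)
```

A query operator such as contains or within can then be evaluated with offset comparisons alone, for example `contains(sentence, token)`, without traversing any tree structure.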

end user

agent who uses a CQL (3.13.1) to satisfy his or her search needs (3.13.10)

Note 1 to entry: This can be done via an interactive graphical user interface (GUI), a command-line tool, programmatically via some application programming interface (API) or by a software programme developed by the end user.

search need

information pattern that an end user (3.13.9) wants to locate in a corpus (3.18.1.1), based on the primary data stream and/or simple or complex annotation (3.2.7)

CQL capability

facility provided by CQLs (3.13.1) to meet a specific aspect of search needs (3.13.10)

CQLF ontology

ontology for a fine-grained description of the expressive power of CQLs (3.13.1) in terms of search needs (3.13.10), which adheres to the structure specified in ISO 24623-2

functionality

label for a concept (3.12.1.3) in a CQLF ontology (3.13.12) that represents a family of CQL capabilities (3.13.11) contributing to the expressive power of a CQL (3.13.1), formulated at a general level and linked to one or more CQLF modules (3.13.5)

frame

label for a concept (3.12.1.3) in a CQLF ontology (3.13.12) that represents a typical search need (3.13.10) of end users (3.13.9), understood as one facet of the expressive power of CQL (3.13.1)

Note 1 to entry: Most frames arise from the specialization of a functionality (3.13.13) and/or the combination of multiple functionalities.

use case

label for a concept (3.12.1.3) in a CQLF ontology (3.13.12) that represents a concrete instantiation of a frame (3.13.14), for which it can be determined unambiguously whether a given query expression (3.13.17) satisfies the search need (3.13.10) or not

Note 1 to entry: Use cases are often parameterized, i.e. they contain variable elements. Parameterized use cases are satisfied by parameterized query expressions.

layer

totality of concepts (3.12.1.3) at the same level of abstraction in a CQLF ontology (3.13.12)

EXAMPLE Functionalities (3.13.13), frames (3.13.14), use cases (3.13.15).

query expression

string that is syntactically valid in a given CQL (3.13.1) and can be executed to return a result set

Note 1 to entry: Query expressions are often parameterized with variable elements. No formal specification of the parameter substitution procedure is attempted, but entries for parameterized query expressions in the ontology are required to include informal descriptions of the range of admissible values and any transformations required.
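
As an informal illustration only, a parameterized query expression and its instantiation can be sketched in Python. The bracket syntax of the template, the parameter name lemma and the rule that admissible values consist of lowercase letters are all assumptions made for this sketch. They do not reproduce the syntax of any specific CQL or any substitution procedure defined elsewhere.

```python
import re
from string import Template

# Hypothetical parameterized query expression; the syntax is illustrative only.
QUERY_TEMPLATE = Template('[lemma="$lemma"]')

# Assumed range of admissible values for the parameter: lowercase letters only.
ADMISSIBLE_LEMMA = re.compile(r"[a-z]+")


def instantiate(lemma: str) -> str:
    """Substitute a parameter value after checking that it is admissible."""
    if not ADMISSIBLE_LEMMA.fullmatch(lemma):
        raise ValueError(f"inadmissible parameter value: {lemma!r}")
    return QUERY_TEMPLATE.substitute(lemma=lemma)
```

Checking the value before substitution corresponds to the informal description of admissible values that accompanies a parameterized query expression.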

parameter

variable element in a query expression (3.13.17) or in the description of a search need (3.13.10)

positive conformance statement

assertion that a given CQL (3.13.1) supports a given use case (3.13.15) by means of a query expression (3.13.17)

negative conformance statement

assertion that a given CQL (3.13.1) cannot support a given use case (3.13.15), frame (3.13.14) or functionality (3.13.13)

Note 1 to entry: Negative conformance is due to technical unavailability of specific capabilities in the respective CQL or limitations on the complexity of query expressions (3.13.17).

3.14 Word Segmentation of Written Texts

word segmentation

process of splitting text (3.16.9) into a sequence of word segmentation units (3.14.2)

word segmentation unit

WSU

word form (3.1.13) or character string of some other type that is treated as a unit

Note 1 to entry: A character string that is not a word form may consist of numeric characters, foreign characters, punctuation marks or other miscellaneous characters, such as Chinese radicals or chemical symbols (e.g. H₂O), or of a mixture of Latin and numeric characters, such as F16.
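
As an informal illustration only, word segmentation into word segmentation units can be sketched in Python for a text in a whitespace-delimited script. The regular expression and the function name segment are assumptions made for this sketch. Scripts without explicit word delimiters need other methods, which are not shown here.

```python
import re

# Assumed pattern: a run of letters and/or digits is one unit (e.g. "F16"),
# and every other non-space character (e.g. punctuation) is a unit of its own.
WSU_PATTERN = re.compile(r"\w+|[^\w\s]")


def segment(text: str) -> list[str]:
    """Split a text into a sequence of word segmentation units."""
    return WSU_PATTERN.findall(text)


units = segment("The F16 landed at 9.")
```

Here the mixed string "F16", the numeral "9" and the full stop are each treated as a unit although none of them is a word form.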

lexicalization

process of making a linguistic unit function as a word (3.1.9.1)

Note 1 to entry: Such a linguistic unit can be a single morph (3.1.7), e.g. “laugh”, a sequence of morphs, e.g. “apple pie”, or even a phrase (3.1.25), such as “kick the bucket”, that forms an idiomatic phrase.

reduplication

process in which the entire word (3.1.9.1), or part of it, is repeated

3.15 Transcription of Spoken Language

spoken language

oral language (3.16.1) produced by a person’s vocal system

paralinguistic feature

feature of spoken language (3.15.1) beyond the individual sound(s)

EXAMPLE voice quality, pitch, volume, intonation

transcription system

theoretically founded set of principles and rules detailing what spoken language (3.15.1) phenomena are to be transcribed, and how they are to be transcribed

transcriber

person who carries out the transcription (3.1.30)

orthographic transcription

representation or modelling of spoken language (3.15.1) based on the orthography (3.9.4) of the respective language (3.16.1)

phonetic transcription

representation or modelling of spoken language (3.15.1) based on the sound system of the respective language (3.16.1)

dependent annotation

annotation (3.2.7) which does not refer directly to an audio or video recording, but to another annotation

Note 1 to entry: Typically, a dependent annotation refers to an orthographic transcription (3.15.5) or phonetic transcription (3.15.6).

milestone element

empty XML element (3.12.4.3) used to indicate a boundary point
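
As an informal illustration only, a milestone element can be recognized in Python with the standard xml.etree.ElementTree module. The element names u and anchor and the attribute synch are assumptions made for this sketch and are not prescribed by this entry.

```python
import xml.etree.ElementTree as ET

# Hypothetical transcription fragment; element and attribute names are illustrative.
fragment = '<u>good <anchor synch="T1"/>morning</u>'

utterance = ET.fromstring(fragment)
milestone = utterance.find("anchor")


def is_empty(element: ET.Element) -> bool:
    """Return True if the element has no child elements and no text content."""
    return len(element) == 0 and not (element.text or "").strip()
```

The milestone carries no content of its own; it only marks the boundary point between "good " and "morning" in the surrounding text.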

3.16 Controlled Natural Language (CNL) / Controlled Human Communication (CHC)

language

system of signs paired with meanings that is used as a means of conveying information

natural language

language (3.16.1) of unknown origin that develops continuously, sometimes in idiosyncratic ways, as it is used conventionally for human communication

simplified language

language (3.16.1) generated through a simplification (3.16.15) process

formal language

language (3.16.1) that has been devised for logical inferences or programming applications with a finite list of symbols and a finite set of formation rules based on these symbols that define well-formed sentences (3.1.27) and also with a system that interprets these sentences

special language

special-purpose language

SPL

language (3.16.1) used in a subject-specific field and also characterized by the use of specific linguistic means of expression

Note 1 to entry: The more strictly the conventions of an SPL are systematized and made obligatory, the more they converge with controlled natural language (3.16.2).

artificial language

language (3.16.1) that has been specifically devised for some applications

Note 1 to entry: The grammar of an artificial language is formulated systematically for the specific purposes of its use in practical applications, especially in the area of human or human-machine communication.

controlled natural language

CNL

controlled language

subset of natural languages (3.16.1.1) whose grammars and dictionaries have been restricted in order to reduce or eliminate both ambiguity and complexity

Note 1 to entry: As a generic term, CNL is an uncountable noun that refers to the abstract properties of all controlled natural languages and not to a particular natural language or application for a specific purpose. It is engineered (i.e. constructed) with a view to reducing or eliminating ambiguity and complexity and aims both to make it easier for human readers [particularly non-native users, non-experts, and people with limited comprehension (3.16.11)] to read a text (3.16.9) and to improve the computational processing of a text.

Note 2 to entry: CNL is an engineered (i.e. constructed) language that is based on a particular natural language, but is more restrictive as regards lexicon, syntax (3.1.24), or semantics, while at the same time preserving most of its natural properties. Here, CNL is a countable noun.

plain language

communication in which wording, structure and design are so clear that the intended readers can easily

find what they need,
understand what they find, and
use that information

[SOURCE: ISO 24495-1, 3.1]

technical communication

process of defining and creating information for use to be delivered as information products for the safe, effective, and efficient use of a supported product throughout its life cycle

[SOURCE: ISO 24183, 3.1.1, modified — Notes 1 to 3 to entry deleted.]

internationalization

process of generalizing a product so that it can handle multiple languages and cultural conventions without the need for re-design

Note 1 to entry: Internationalization takes place at the level of programme design and document development.

localization

process of taking a product and making it linguistically and culturally appropriate to the target locale (country/region and language) where it will be used and sold

Note 1 to entry: The term derives from “locale”: a place where something particular happens or is done. Translation (T9n) is one of the activities in localization.

basic principles and methodology for stylistic guidelines

BSG

guidelines (3.18.1.8) specifying common writing rules applicable to many languages

keyword

word (3.1.9.1) or phrase (3.1.25) used to describe the main content (3.16.10) (nouns and verbs) of a document in a consistent manner

text

data in the form of character arrangements intended to convey a meaning and whose interpretation is essentially based on the knowledge of some natural language (3.16.1.1) or artificial language (3.16.1.5)

[SOURCE: ISO/IEC 2382‑1:1993]

Note 1 to entry: Character arrangements are, among others, characters, symbols, words (3.1.9.1), phrases (3.1.25), paragraphs, sentences (3.1.27) or tables.

content

information content

information contained in or conveyed by a language (3.16.1)

Note 1 to entry: The information can be in written or spoken form or other forms such as images.

comprehension

process of understanding the content (3.16.10) of a document

content management

〈language resource management〉 process of controlling the content (3.16.10) of a text (3.16.9) or the media in general while analysing or revising it

Note 1 to entry: This includes version control of revised documents, contents in versions of similar documents, and the management of relations between items in a document.

authoring

writing a document

Note 1 to entry: Documents include, among others, reports, manuals, articles, or books.

pre-editing

process of modification of a text (3.16.9) before it is submitted to a specific processing

Note 1 to entry: A specific processing can be machine translation.

simplification

process of reducing complexity

Note 1 to entry: Simplified language (3.16.1.2) is the result of a simplification of content (3.16.10).

rewriting

producing a new version of a text (3.16.9) by changing its lexical, sentential, or textual structures while keeping its original content (3.16.10)

re-use

use of a document or data for purposes in addition to those for which it was originally designed

Note 1 to entry: Ability to use existing documents for new documents. This includes making a product manual for a new version of the product and one for a similar version.

cooperative work

activity or result of working together to achieve the same goal

Note 1 to entry: Work carried out by more than one person in a collaborative way (e.g. technical communicators and editors putting together a manual).

readability

ease of processing a text (3.16.9) for its comprehension (3.16.11)

tractability

computational tractability

capability of being controlled, analysed, or generated

interoperability

〈language resource management〉 achievement of partial or total compatibility between heterogeneous data models by the mapping of metadata (3.12.2.1)

controlled vocabulary

〈CNL/CHC〉 list of lexical or phrasal items that are selected for the purpose of improving readability (3.16.19) in a particular domain

Note 1 to entry: Most controlled vocabularies target a specific, narrow domain. Unlike controlled natural language (3.16.2), they do not deal with grammatical issues (i.e. how to combine the terms needed to write complete sentences (3.1.27)), but a good number of CNL approaches, especially domain-specific ones, include controlled vocabularies.

synonym

one of a set of different terms (3.12.1.4) that refer to the same entity

[SOURCE: ISO/IEC 2382:2015, 2121523, modified — The Notes to entry have been removed.]

paronym

word (3.1.9.1) for which the writing or pronunciation is very close to another word, but which has a different lexical meaning

distinctive feature

class of phonetically defined components of phonemes (3.1.2) that function to distinguish meaning

Note 1 to entry: In contrast to redundant features, distinctive features constitute relevant phonological features.

Note 2 to entry: See also ^[49], p. 134.

assimilation

articulatory adaptation of one sound to a nearby sound within a word (3.1.9.1) or at the junction between words with regard to one or more features

Note 1 to entry: See also ^[49], p. 40.

interference

influence of one linguistic system on another in either the individual speaker (3.7.2.3.1.1) or the speech community

Note 1 to entry: See also ^[49], p. 235.

3.17 Lexico-Morpho-Syntactic Principles and Methodology for Personal Data Recognition and Protection in Text

seme

Saussure’s signified with its different signifiers (instantiations) in text (3.16.9)

Note 1 to entry: Saussure was the first person to use the terminology “signified” and “signifier”. Saussure offered a “dyadic” or two-part model of the sign. He defined a sign as being composed of a “signifier” (signifiant) and a “signified” (signifié) (see ^[50] and ^[51]).

intension

set of characteristics that make up a concept

[SOURCE: ISO 1087-1:2000, 3.2.9, modified]

indicant

significant occurrence of interaction between lexical, morphological and syntactic phenomena, or of one of these phenomena, across a wide spectrum of languages (3.16.1), in a few languages or in just one language, that is suited to identify personal data (3.17.5)

identifiable natural person

data subject

person who can be identified, directly or indirectly, in particular by reference to an identifier

Note 1 to entry: An identifier can be a name, an identification number, location data or an online identifier of a natural person. Further examples which are excluded from the examples in this document are references to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of the natural person.

[SOURCE: ^[52], Article 4 (1)]

personal data

any information relating to an identified or identifiable natural person (3.17.4)

[SOURCE: ^[52], Article 4 (1)]

processing

any operation or set of operations which is performed on personal data (3.17.5) or on sets of personal data, whether or not by automated means, such as collection, recording, organization, structuring, storage, adaptation or alteration, retrieval, consultation, use, disclosure by transmission, dissemination or otherwise making available, alignment or combination, restriction, erasure or destruction

[SOURCE: ^[52], Article 4 (2)]

pseudonymization

processing (3.17.6) of personal data (3.17.5) in such a manner that the personal data can no longer be attributed to a specific identifiable natural person (3.17.4) without the use of additional information, provided that such additional information is kept separately and is subject to technical and organizational measures to ensure that the personal data are not attributed to an identified or identifiable natural person

[SOURCE: ^[52], Article 4 (5), modified – “data subject” replaced with most preferred term “identifiable natural person” within definition.]
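
As an informal illustration only, the separation between pseudonymized text and the additional information needed for re-identification can be sketched in Python. The function name pseudonymize, the placeholder format PERSON_n and the assumption that the identifiers are already known are all choices made for this sketch. It does not recognize personal data by itself and is not a statement of what any regulation requires.

```python
import re


def pseudonymize(text: str, identifiers: list[str]) -> tuple[str, dict[str, str]]:
    """Replace each listed identifier with a placeholder.

    Returns the pseudonymized text and the re-identification table. The table
    is the additional information that has to be kept separately.
    """
    table: dict[str, str] = {}
    for number, identifier in enumerate(identifiers, start=1):
        pseudonym = f"PERSON_{number}"
        table[pseudonym] = identifier
        # Whole-word match so that substrings of longer words stay untouched.
        text = re.sub(rf"\b{re.escape(identifier)}\b", pseudonym, text)
    return text, table


safe_text, lookup = pseudonymize(
    "Ada Lovelace met Alan Turing.", ["Ada Lovelace", "Alan Turing"]
)
```

Without the table, the placeholders in the returned text can no longer be attributed to the persons named in the original text.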

3.18 Corpus Annotation Project Management

3.18.1 Corpus Annotation

corpus

collection (3.3.2) of natural language (3.16.1.1) data

[SOURCE: ISO 1087, 3.6.4, modified — The preferred term “text corpus” deleted. Note 1 to entry deleted.]

corpus annotation

action of adding interpretative linguistic or non-linguistic information to a corpus (3.18.1.1)

[SOURCE: ^[53], modified — “non-linguistic” added.]

annotation scheme

description of the structure of annotation (3.2.7)

annotation layer

layer for annotation (3.2.7) of a corpus

EXAMPLE Syntactic layer, lexical-semantic layer, entity layer.

annotation unit

specific segment of primary data (3.2.4) that is identified and labelled according to an annotation scheme (3.18.1.3)

EXAMPLE Word (3.1.9.1), phrase (3.1.25), clause (3.1.26), sentence (3.1.27), utterance (3.7.2.2).

corpus annotation project

project (3.18.2.1) aimed at enhancing a collection of corpora (3.18.1.1) with metadata (3.12.2.1) or labels that provide additional linguistic, non-linguistic, semantic, or structural information to facilitate analysis, research and the development of natural language processing (3.9.1) tools

resource

〈corpus annotation〉 inputs needed for the establishment, implementation, maintenance and improvement of an organization and its processes

EXAMPLE People, infrastructure, environment, information, knowledge, suppliers, financial means.

[SOURCE: ISO/IEC/IEEE 24765, 3.3461]

guideline

official recommendation or advice that indicates policies, standards or procedures for how something should be accomplished

[SOURCE: ISO/IEC/IEEE 24765, 3.1774]

3.18.2 Project Management

project

temporary endeavour to achieve one or more defined objectives

[SOURCE: ISO 21502, 3.20]

project management

planning, organizing, monitoring, controlling and reporting of all aspects of a project (3.18.2.1), and the motivation of all those involved in it to achieve the project objectives

[SOURCE: ISO 22886, 3.9.7]

project charter

document that states the problem to be solved, the improvement goals, the project scope (3.18.2.4), the project milestones and the project roles and responsibilities

[SOURCE: ISO 13053-2, 2.26]

project scope

authorized work to accomplish agreed objectives

[SOURCE: ISO 21502, 3.25]

work breakdown structure

WBS

decomposition of the defined scope of a project (3.18.2.1) or programme into progressively lower levels consisting of elements of work

[SOURCE: ISO 21502, modified — Abbreviated term “WBS” added.]

schedule management plan

component of the project management (3.18.2.2) plan that establishes the criteria and the activities (3.18.2.8) for developing, monitoring and controlling the schedule

[SOURCE: ISO/IEC/IEEE 24765, 3.3619]

project phase

collection of logically related project activities (3.18.2.8) that culminates in the completion of one or more deliverables (3.18.2.21)

[SOURCE: ISO/IEC/IEEE 24765, 3.3181]

activity

identified piece of work that is required to be undertaken to complete a project (3.18.2.1), programme, portfolio or other related work

[SOURCE: ISO 21506, 3.2, modified — Note 1 to entry deleted.]

work package

group of activities (3.18.2.8) that have a defined scope, deliverable (3.18.2.21), timescale and cost

[SOURCE: ISO 21502, 3.30]

work package leader

work package team leader

role within project management (3.18.2.2) that is responsible for overseeing a specific work package (3.18.2.9)

process

systematic series of activities (3.18.2.8) directed towards causing an end result such that one or more inputs will be acted upon to create one or more outputs (3.18.2.20)

[SOURCE: ISO/IEC/IEEE 24765, 3.3037]

data validation

process (3.18.2.11) of systematically checking and verifying the accuracy, completeness and consistency of annotations (3.2.7) within the corpus (3.18.1.1) to ensure that the data meet predefined quality standards and guidelines (3.18.1.8)

process group

collection of related processes (3.18.2.11)

[SOURCE: ISO/IEC/IEEE 24765, 3.3057]

project communications management

set of processes (3.18.2.11) that are required to ensure timely and appropriate planning, collection, creation, distribution, storage, retrieval, management, control (3.18.2.22), monitoring and the ultimate disposition of project information

[SOURCE: ISO/IEC/IEEE 24765, 3.3156]

project cost management

set of processes (3.18.2.11) involved in planning, estimating, budgeting, financing, funding, managing and controlling costs so that the project (3.18.2.1) can be completed within the approved budget

[SOURCE: ISO/IEC/IEEE 24765, 3.3158]

project integration management

set of processes (3.18.2.11) and activities (3.18.2.8) needed to identify, define, combine, unify and coordinate the various processes and project management (3.18.2.2) activities within the project management process group (3.18.2.19)

[SOURCE: ISO/IEC/IEEE 24765, 3.3165]

project procurement management

set of processes (3.18.2.11) necessary to purchase or acquire products, services or results needed from outside the project team

[SOURCE: ISO/IEC/IEEE 24765, 3.3185]

project quality management

set of processes (3.18.2.11) and activities (3.18.2.8) of the performing organization that determine quality policies, objectives and responsibilities so that the project (3.18.2.1) will satisfy the needs for which it was undertaken

[SOURCE: ISO/IEC/IEEE 24765, 3.3186]

project scope management

set of processes (3.18.2.11) required to ensure that the project (3.18.2.1) includes all the work required, and only the work required, to complete the project successfully

[SOURCE: ISO/IEC/IEEE 24765, 3.3194]

project management process group

logical grouping of project management (3.18.2.2) inputs, tools and techniques, and outputs (3.18.2.20)

Note 1 to entry: The project management process groups include initiating processes (3.18.2.11), planning processes, executing processes, monitoring and controlling processes, and closing processes. Project management process groups are not project phases (3.18.2.7).

[SOURCE: ISO/IEC/IEEE 24765, 3.3173]

output

aggregated tangible or intangible deliverables (3.18.2.21) that form the project result

[SOURCE: ISO 21502, 3.14]

deliverable

unique and verifiable element that is required to be produced by a project (3.18.2.1)

[SOURCE: ISO 21502, 3.9]

control

comparison of actual performance with planned performance, analysing variances and taking appropriate corrective and/or preventive action as needed

[SOURCE: ISO 21506, 3.13, modified — “and/or” replaced “and”.]

data consistency

adherence to uniform and standardized guidelines (3.18.1.8) and criteria for annotation (3.2.6) across the entire corpus (3.18.1.1), ensuring that all annotated elements follow the same rules and conventions, which facilitates reliable and reproducible analysis

stakeholder

person, group or organization that has interests in, or can affect, be affected by, or perceive itself to be affected by, any aspect of the project (3.18.2.1), programme or portfolio

[SOURCE: ISO 21502, 3.27]

Bibliography

[1] ISO 10241-1:2011, Terminological entries in standards — Part 1: General requirements and examples of presentation

[2] Bussmann, H. (1996) Routledge dictionary of language and linguistics. London: Routledge.

[3] ISO 19104:2016, Geographic information — Terminology

[4] UD Guidelines (Universal Dependencies). (n.d.). https://universaldependencies.org/guidelines.html

[5] ISO 1087:2019, Terminology work and terminology science — Vocabulary

[6] ISO 15924, Information and documentation — Codes for the representation of names of scripts

[7] ISO/IEC 10646:2020, Information technology — Universal coded character set (UCS)

[8] ISO 24610-1, Language resource management — Feature structures — Part 1: Feature structure representation

[9] ISO 24610-1:2006, Language resource management — Feature structures — Part 1: Feature structure representation

[10] CLAWS-7 tagset. Available at: http://ucrel.lancaster.ac.uk/claws7tags.html

[11] EAGLES guidelines. Available at: https://www.ilc.cnr.it/EAGLES96/browse.html

[12] NKJP tagset. Available at: http://nkjp.pl/poliqarp/help/ense2.html

[13] ISO 15919:2001, Information and documentation — Transliteration of Devanagari and related Indic scripts into Latin characters

[14] Time Ontology in OWL. (2022, November 15). https://www.w3.org/TR/owl-time/

[15] Princeton University "About WordNet." WordNet. Princeton University. 2010. https://wordnet.princeton.edu/

[16] Bunt, H. (1985). Mass terms and model-theoretic semantics. Cambridge University Press.

[17] Hobbs, J. and Pan, F. (2004). An ontology of time for the semantic web. TALIP Special Issue on Spatial and Temporal Information Processing 3 (1) (2004), pp. 66-85

[18] Pustejovsky, J., Saurí, R., Setzer, A. and Ingria, B. (2004). TimeML Annotation Guidelines 1.2, unpublished

[19] ISO 24617-2, Language resource management — Semantic annotation framework (SemAF) — Part 2: Dialogue acts

[20] Goffman E., (1963) Behavior in Public Places. New York: Basic Books

[21] Ahn R. (2001) Agents, Objects, and Events: A computational approach to knowledge, observation, and communication. PhD Thesis, Eindhoven University of Technology.

[22] Speech Act. (2017, February 2). Glossary of Linguistic Terms. https://glossary.sil.org/term/speech-act

[23] Austin J.L, (1962) How to do things with words. Clarendon Press, Oxford, UK.

[24] Bales R.F., (1951) Interaction process analysis: a method for the study of small groups. Addison-Wesley, Cambridge.

[25] Bunt H.C., Palmer M.S. 2013, Conceptual and representational choices in defining an ISO standard for semantic role annotation, In the Proceedings of the ACL-ISO Workshop on Interoperable Semantic Annotation (ISA-9) held in conjunction with the International Workshop on Computational Semantics, Potsdam, Germany, March, 2013.

[26] Kleiber Georges, Patry Richard, Ménard Nathan, 1993. “Anaphore Associative: Dans Quel Sens «roule»-t-Elle?” Revue Québécoise de Linguistique 22 (2): 139–162.

[27] Bunt Harry, Gilmartin Emer, Keizer Simon, Pelachaud Catherine, Petukhova Volha, Prevot Laurent, Theune Mariet, 2018. “Downward Compatible Revision of Dialogue Annotation.” In, 21–34. Santa Fé (New Mexico), USA.

[28] ISO/IEC Guide 99:2007, International vocabulary of metrology — Basic and general concepts and associated terms (VIM)

[29] Abbott, B. (2004) Definiteness and indefiniteness. In: Horn, L., Ward, G. (eds.) Handbook of Pragmatics. Oxford: Blackwell, pp. 122–149.

[30] Winter, Y., Ruys, E. (2011) Scope ambiguities in formal syntax and semantics. In: Gabbay, D., Guenthner, F. (eds.) Handbook of Philosophical Logic (2nd edition). Springer.

[31] Bunt H. (2023). The compositional semantics of QuantML annotations. In: Proceedings 19th Joint ACL-ISO Workshop on Interoperable Semantic Annotation (ISA-19), Nancy, France, pp. 3–13.

[32] ISO 24617-7:2020, Language resource management — Semantic annotation framework — Part 7: Spatial information

[33] ISO 30042:2019, Management of terminology resources — TermBase eXchange (TBX)

[34] Levin, B. English Verb Classes and Alternations: A Preliminary Investigation. University of Chicago Press, 1993.

[35] Sanfilippo, A., Ananiadou, S., Gaizauskas, R., Saint-Dizier, P., Vossen, P., Bel, N., Bontcheva, K. et al. EAGLES LE3-4244 Preliminary Recommendations on Lexical Semantic Encoding Final Report, 1999.

[36] Fielding, R., et al. Hypertext Transfer Protocol — HTTP/1.1, IETF RFC 2616, June 1999.

[37] IETF RFC 3986:2005, Uniform Resource Identifier (URI): Generic Syntax

[38] González R., Suarez Araújo C.P., eds. Proceedings of the 3rd International Conference on Language Resources and Evaluation. Paris: European Language Resource Association. pp. 1321-1326, 2002.

[39] CLARIN Concept Registry. Available at https://www.clarin.eu/ccr/

[40] IETF BCP 47, Tags for Identifying Languages

[41] IETF RFC 6838:2013, Media Type Specifications and Registration Procedures

[42] Dublin Core Metadata Initiative (DCMI). Terminology. http://www.ukoln.ac.uk/metadata/dcmi/abstract-model/2004-12-08/#sect-7

[43] ISO 24622-1, Language resource management — Component Metadata Infrastructure (CMDI) — Part 1: The Component Metadata Model

[44] XML Schema Part 1: Structures Second Edition. (n.d.). https://www.w3.org/TR/xmlschema-1/

[45] W3C XSD, W3C XML Schema Definition Language (XSD) 1.1 Part 1: Structures Gao S., Sperberg-McQueen C. M, Thompson H. S., (eds.), W3C Recommendation 5 April 2012. Available at https://www.w3.org/TR/xmlschema11-1/

[46] ISO 24623-2, Language resource management — Corpus query lingua franca (CQLF) — Part 2: Ontology

[47] ISO 24495-1:2023, Plain language — Part 1: Governing principles and guidelines

[48] ISO 24183:2024, Technical communication — Vocabulary

[49] Gutehrlé N., Atanassova I., Cardey S., Langue contrôlée pour un système de messages et alertes dans un environnement de mobilité: gestion de l’ambiguïté, international workshop FUTURMOB-17, 5-7 September 2017, Montbéliard, France.

[50] Saussure, F. de. Cours de linguistique générale, 1922. Course of General Linguistics, translated and annotated by Roy Harris, 1990. London: Duckworth.

[51] Chandler, D. Semiotics: The Basics. New York: Routledge, 2017.

[52] Official Journal of the European Union Regulation (EU) 2016/679 OF THE EUROPEAN PARLIAMENT AND OF THE COUNCIL of 27 April 2016, General Data Protection Regulation Available at [last viewed 2020-04-22]: https://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=CELEX:32016R0679&from=EN#d1e1374-1-1

[53] Leech G. Adding Linguistic Annotation [online]. In: Wynne, M. (ed.) Developing Linguistic Corpora: A Guide to Good Practice. Oxford: Oxbow Books, 2005 Available at [accessed 2025-03-11]: https://llds.ling-phil.ox.ac.uk/guides/dlc/chapter2.html

[54] ISO/IEC/IEEE 24765:2017, Systems and software engineering — Vocabulary

[55] ISO 21502:2020, Project, programme and portfolio management — Guidance on project management

[56] ISO 22886:2020, Healthcare organization management — Vocabulary

[57] ISO 13053-2:2011, Quantitative methods in process improvement — Six Sigma — Part 2: Tools and techniques

[58] ISO 21506:2024, Project, programme and portfolio management — Vocabulary

Index

3.12.1.6

abbreviated form 3.1.21

abbreviation 3.1.21

abstract resource 3.11.1.1.3

actionable identifier 3.11.2.1.2.1

activity 3.18.2.8

addressee 3.7.2.3.2

adjunct 3.6.6

admissibility constraint 3.3.16

admissible feature 3.3.15

admissible feature value 3.3.1.3

admissible value 3.3.1.3

ADN 3.1.9.1.3

adnoun 3.1.9.1.3

adornment 3.10.1

affix 3.1.8.2.1

affordance 3.7.8.7

affordance structure 3.7.8.7

agglutination 3.1.18

ALINK 3.7.1.9

allo-feedback act 3.7.2.7.1.1

alternation 3.3.31

anaphor 3.7.7.4

anchor 3.5.4

annotate 3.2.5

annotation 3.2.6

annotation 3.2.7

annotation document 3.5.2

annotation layer 3.18.1.4

annotation scheme 3.18.1.3

annotation structure 3.7.11.1

annotation tier 3.11.1.11

annotation unit 3.18.1.5

appropriate feature 3.3.15

archive 3.11.1.3.1

archiving institution 3.11.3.3

argument 3.7.3.1

artificial language 3.16.1.5

assimilation 3.16.26

atomic type 3.3.22.3

atomic value 3.3.1.1

attribute-value matrix 3.3.8

authoring 3.16.13

auto-feedback act 3.7.2.7.1.2

AVM 3.3.8

bag 3.3.3

base form 3.1.10

base quantity 3.7.9.1.1

base type 3.3.22.2

base unit 3.7.9.4.1

basic principles and methodology for stylistic guidelines 3.16.7

beginning 3.7.1.7.1

borrowing 3.1.22

bound morpheme 3.1.8.2

boxed label 3.3.19

BSG 3.16.7

built-in 3.3.17

bunsetsu 3.1.25.1

canonical form 3.1.10

cardinality 3.12.2.9

CCSL 3.12.3.18

character 3.1.33

character span 3.13.7

character span containment 3.13.8

chunk 3.6.2.1

circumstance 3.7.4.4

citation 3.11.1.9

class 3.7.4.6

clause 3.1.26

client application 3.11.3.4

closed vocabulary 3.12.2.14

closed vocabulary 3.12.2.16

CMD attribute 3.12.3.5

CMD component 3.12.3.3

CMD component registry 3.12.3.8

CMD element 3.12.3.4

CMD instance 3.12.3.9

CMD instance envelope 3.12.3.10

CMD instance header 3.12.3.11

CMD instance payload 3.12.3.12

CMD model 3.12.3.2

CMD profile 3.12.3.13

CMD profile schema 3.12.3.14

CMD record 3.12.3.9

CMD resource proxy 3.12.3.15

CMD resource reference 3.12.3.15

CMD root component 3.12.3.3.1

CMD specification 3.12.3.6

CMD specification header 3.12.3.7

CMDI 3.12.3.1

CMDI component specification language 3.12.3.18

CMDI file 3.12.3.9

CMDI instance 3.12.3.9

CNL 3.16.2

cognate 3.9.9

collection 3.3.2

collection 3.11.1.4

collection 3.12.1.8

communicative function 3.7.2.7.11

communicative segment 3.7.7.1

complex resource 3.11.1.1.2

complex value 3.3.1.2

component 3.12.3.3

component definition 3.12.3.6

component header 3.12.3.7

component metadata infrastructure 3.12.3.1

component metadata model 3.12.3.2

component registry 3.12.1.1.1

component registry 3.12.3.8

component specification 3.12.3.6

compound 3.1.9.1.1

compounding 3.1.20

comprehension 3.16.11

computational tractability 3.16.20

concatenation 3.3.32

concept 3.12.1.3

concept link 3.12.1.6.1

concept reference 3.12.1.6

concept registry 3.12.1.2.1

concurrent annotations 3.13.6

constituent 3.6.2

constraint 3.3.18

content 3.16.10

content management 3.16.12

context 3.7.2.7.10

control 3.18.2.22

controlled language 3.16.2

controlled natural language 3.16.2

controlled vocabulary 3.12.2.14

controlled vocabulary 3.16.22

cooperative work 3.16.18

coreference 3.7.7.5

corpus 3.18.1.1

corpus annotation 3.18.1.2

corpus annotation project 3.18.1.6

corpus query language 3.13.1

CQL 3.13.1

CQL capability 3.13.11

CQLF class 3.13.3

CQLF implementation 3.13.2

CQLF level 3.13.4

CQLF module 3.13.5

CQLF ontology 3.13.12

CV 3.16.22

DAG 3.4.2

data category 3.9.2

data consistency 3.18.2.23

data subject 3.17.4

data validation 3.18.2.11.1

DC 3.9.2

dcl 3.7.5.4

default value 3.3.1.4

definite description 3.7.10.3

definiteness 3.7.10.2

deliverable 3.18.2.21

dependency 3.6.4

dependency annotation 3.2.7.4

dependency relation 3.6.4

dependent annotation 3.15.7

dereference 3.11.4.3

derivation 3.1.19

derived quantity 3.7.9.1.2

derived unit 3.7.9.4.2

determinacy 3.7.10.4

dialogue 3.7.2.1

dialogue act 3.7.2.7

digital archive 3.11.1.3.1

digital identifier 3.11.2.1

digital repository 3.11.1.3

digraph 3.4.2

dimension 3.7.2.7.2

directed acyclic graph 3.4.2

directional relation 3.7.5.9

discourse 3.7.4.1

discourse connective 3.7.6.2

discourse entity 3.7.7.2

discourse relation 3.7.2.7.7

discourse relation 3.7.6.3

discourse structure 3.7.4.2

distinctive feature 3.16.25

distribution 3.7.10.5

distributivity 3.7.10.5

document creation location 3.7.5.4

domain 3.6.8

dynamic path 3.7.5.3.2.1

dynamic route 3.7.5.3.2.1

edge 3.5.5.2

edge 3.6.1.3

eigenplace 3.7.11.6

eigenspace 3.7.11.6

element definition 3.12.3.4

element name 3.7.5.17

empty feature structure 3.3.7.1

end 3.7.1.7.2

end user 3.13.9

ending 3.1.8.2.1.1

entity 3.7.3.4

eojeol 3.1.9.1.4

etymologizable 3.9.7

etymology 3.9.6

etymon 3.9.8

event 3.7.1.1

event set 3.7.10.1

event-path 3.7.5.3.2.1

eventuality 3.7.1.1

eventuality frame 3.7.3.5

eventuality modifier 3.7.3.6

exhaustivity 3.7.10.6

extension 3.3.13

extent 3.7.5.16

feature 3.3.5

feature admissibility constraint 3.3.16

feature specification 3.3.6

feature structure 3.3.7

feature system 3.3.24.1.1

feature system declaration 3.3.25

feature value 3.3.1

feedback act 3.7.2.7.1

feedback dependence relation 3.7.2.7.6

figure 3.7.5.7

finite state automata 3.4.1

first-order logic 3.7.11.4

foreign attribute 3.12.4.4.1

formal language 3.16.1.3

fragment 3.11.1.7

fragment identifier 3.11.2.1.3

frame 3.13.14

free morpheme 3.1.8.1

FSA 3.4.1

FSD 3.3.25

functional dependence relation 3.7.2.7.4

functional segment 3.7.2.7.3

functionality 3.13.13

GA 3.7.8.7.1

genericity 3.7.10.7

Gibsonian affordance 3.7.8.7.1

grammatical category 3.1.23

grammatical feature 3.9.3

grammatical function 3.6.3

graph 3.5.5

graph 3.6.1

graph notation 3.3.9

grapheme 3.1.31

graphic character 3.1.33

ground 3.7.5.8

guideline 3.18.1.8

habitat 3.7.8.4

head 3.1.25.2.1

head 3.6.2.3

hierarchical annotation 3.2.7.5

homograph 3.1.32

homophone 3.1.3

HTTP resolver proxy 3.11.3.8

hypernode 3.8.2

identifiable natural person 3.17.4

identifier 3.11.2.1

IE 3.7.12.1

implicational constraint 3.3.18.1

incarnation 3.11.1.5

incompatibility 3.3.11

indicant 3.17.3

individuation 3.7.10.9

inflected form 3.1.14

inflection 3.1.6.1

information content 3.16.10

information extraction 3.7.12.1

information state 3.7.2.7.10

inline CMD component 3.12.3.3.2

inline code 3.10.2

instant 3.7.1.7

intension 3.17.2

interference 3.16.27

internal part 3.11.1.6.2

internationalization 3.16.5

interoperability 3.16.21

interpretation 3.3.13.1

interpretation 3.7.11.3

inverse linking 3.7.10.13

keyword 3.16.8

label 3.6.9

landmark 3.7.5.8

language 3.16.1

language resource 3.11.1.1.1

language tag 3.12.1.5

layer 3.13.16

lemma 3.1.10

lemmatization 3.1.11

lemmatized form 3.1.10

lexeme 3.1.9

lexical database 3.2.1

lexical entry 3.2.2

lexical item 3.2.3

lexical resource 3.2.1

lexicalization 3.14.3

lexicon 3.2.1.1

linguistic annotation 3.2.7.2

linguistic structure 3.1.1

localization 3.16.6

location 3.7.5.3

logical form 3.7.11.2

low-level discourse structure 3.7.6.4

main clause 3.1.26.1

malmaldi 3.1.9.1.4

markable 3.7.1.8

markup language of measurable quantitative information 3.7.9.3

mass term 3.7.10.12

matrix notation 3.3.8

measurable quantitative information 3.7.9.2.1

measurable quantitative information extraction 3.7.12.2

measurable quantitative information markup language 3.7.9.3

measure 3.7.5.10

measure relation 3.7.5.11

measure word 3.1.23.1

measurement unit 3.7.9.4

media type 3.12.1.7

merge 3.3.27

MES 3.7.8.3

metadata 3.12.2.1

metadata 3.12.2.2.1

metadata component 3.12.2.6

metadata component registry 3.12.1.1.1

metadata description 3.12.2.2.1

metadata editor 3.12.2.10

metadata element 3.12.2.4

metadata element set 3.12.2.5

metadata element value scheme 3.12.2.8

metadata instance 3.12.3.9

metadata modeler 3.12.2.11

metadata profile 3.12.2.7

metadata provider 3.12.2.12

metadata provider 3.12.2.13

metadata record 3.12.2.2.1

metadata record 3.12.3.9

metadata schema 3.12.2.3

metadata set 3.12.2.5

metamodel 3.7.13.1

milestone element 3.15.8

MIME type 3.12.1.7

minimal embedding space 3.7.8.3

MLINK 3.7.1.10

model M 3.7.11.5

modifier 3.6.2.2

morph 3.1.7

morpheme 3.1.8

morpho-syntactic unit 3.1.13

morphology 3.1.6

morphosyntactic feature 3.1.15

morphosyntactic tag 3.4.3

morphosyntactic tagset 3.4.4

motion 3.7.5.12

motion-event 3.7.5.12

movement relation 3.7.5.14

mover 3.7.5.13

moving object 3.7.5.13

MQI 3.7.9.2.1

MQIE 3.7.12.2

multiset 3.3.3

multiword expression 3.1.9.2

MWE 3.1.9.2

namespace 3.12.4.5

natural language 3.16.1.1

natural language processing 3.9.1

negation 3.3.28

negative conformance statement 3.13.20

NL 3.16.1.1

NLP 3.9.1

node 3.5.5.1

node 3.6.1.2

non-consuming tag 3.7.5.17.1

non-locational spatial entity 3.7.5.6

non-terminal node 3.6.1.2.2

normalization 3.7.12.3

noun phrase 3.1.25.2

noun phrase head 3.1.25.2.1

NP 3.1.25.2

object 3.7.4.5

objectal relation 3.7.7.6

onomasiology 3.9.5

open vocabulary 3.12.2.14

open vocabulary 3.12.2.15

orientation relation 3.7.5.9

orientational relation 3.7.5.9

original artefact 3.5.1

orthographic transcription 3.15.5

orthography 3.9.4

output 3.18.2.20

paralinguistic feature 3.15.2

parameter 3.13.18

paronym 3.16.24

part 3.11.1.6

part of speech 3.1.23

partial order 3.3.24

partially ordered set 3.3.24

participant 3.7.2.3

participant set 3.7.10.1.1

particle 3.1.8.2.1.2

path 3.3.10

path 3.7.5.3.2

period 3.7.1.3

persistent identifier 3.11.2.1.1

personal data 3.17.5

phoneme 3.1.2

phoneme confusion 3.1.5

phonetic transcription 3.15.6

phrasal compound 3.1.9.1.2

phrase 3.1.25

PID 3.11.2.1.1

PID framework 3.11.2.2

PID resolver 3.11.3.7

place 3.7.5.2

plain language 3.16.3

point of event 3.7.1.7.3

point of reference 3.7.1.7.4

point of speech 3.7.1.6.1

point of text 3.7.1.7.5

POS 3.1.23

positive conformance statement 3.13.19

pre-editing 3.16.14

predicate 3.7.3.2

predicate argument structure 3.7.3.3

primary data 3.2.4

process 3.18.2.11

process group 3.18.2.12

processing 3.17.6

profile 3.12.3.13

profile header 3.12.3.7

project 3.18.2.1

project charter 3.18.2.3

project communications management 3.18.2.13

project cost management 3.18.2.14

project integration management 3.18.2.15

project management 3.18.2.2

project management process group 3.18.2.19

project phase 3.18.2.7

project procurement management 3.18.2.16

project quality management 3.18.2.17

project scope 3.18.2.4

project scope management 3.18.2.18

pseudonymization 3.17.7

published collection 3.11.1.4.1

QI 3.7.9.2

QML 3.7.9.3

QS 3.7.8.5

qualia 3.7.8.5

qualia structure 3.7.8.5

qualifier 3.7.2.7.12

qualitative spatial relation 3.7.5.5

quantification 3.7.10.8

quantitative information 3.7.9.2

quantitative markup language 3.7.9.3

quantity 3.7.9.1

quasi-homophone 3.1.4

query expression 3.13.17

range restriction 3.3.1.3

re-entrancy 3.3.14

re-use 3.16.17

readability 3.16.19

record 3.12.2.2

reduplication 3.14.4

reference 3.7.7.3

reference 3.11.1.10

reference domain 3.7.10.10

reference segment 3.7.2.7.5

referent 3.7.7.2

referring expression 3.7.7.1.1

region 3.5.3

region 3.7.5.1

registry 3.12.1.1

regular expression 3.12.4.10

relational class 3.7.4.7

repository 3.11.1.3

representation 3.2.8

resolution system 3.11.3.6

resolve 3.11.4.1

resolver 3.11.3.7

resolver proxy 3.11.3.8

resource 3.11.1.1

resource 3.18.1.7

resource collection 3.12.1.8

resource collection incarnation 3.11.1.5

resource part 3.11.1.6

resource part identifier 3.11.2.1.4

resource provider 3.11.3.1

resource proxy 3.12.3.15

resource proxy reference 3.12.3.16

resource server 3.11.3.2

responsive communicative function 3.7.2.7.11.1

restrictor 3.1.25.2.2

rewriting 3.16.16

rhetorical relation 3.7.2.7.7

romanization 3.4.8.1

route 3.7.5.3.2

schedule management plan 3.18.2.6

schema 3.12.2.3

script 3.1.28

script conversion 3.4.7

search need 3.13.10

segment 3.7.4.3

segment 3.8.1

segmentation annotation 3.2.7.1

semantic annotation 3.2.7.3

semantic argument 3.9.11

semantic authoring 3.8.3

semantic content 3.7.2.7.8

semantic content category 3.7.2.7.9

semantic content type 3.7.2.7.9

semantic form 3.7.11.2

semantic predicate 3.9.12

semantic registry 3.12.1.2

semantic role 3.7.3.7

semantic type 3.3.26

seme 3.17.1

sender 3.7.2.3.1

sentence 3.1.27

sequential representation 3.6.10

simple annotation 3.2.7.6

simplification 3.16.15

simplified language 3.16.1.2

situation 3.7.6.1

SLINK 3.7.1.11

snapshot 3.11.1.8

source domain 3.7.10.11

space 3.7.5.3.1

spatial entity, non-locational 3.7.5.6

spatial relation 3.7.5.15

speaker 3.7.2.3.1.1

speaker role 3.7.2.4

special language 3.16.1.4

special-purpose language 3.16.1.4

speech act 3.7.2.5

SPL 3.16.1.4

spoken language 3.15.1

stakeholder 3.18.2.24

stand-off annotation 3.2.7.7

static path 3.7.5.3.2

stem 3.1.17

structure sharing 3.3.14

subcategorization frame 3.6.7

subordinate clause 3.1.26.2

subsumption 3.3.12

subtitle 3.10.3

subtype 3.3.22.1

supertype 3.3.22.2

synonym 3.16.23

syntactic argument 3.6.5

syntactic behaviour 3.9.10

syntactic edge 3.6.1.3

syntactic graph 3.6.1

syntactic head 3.6.2.3

syntactic node 3.6.1.2

syntactic tree 3.6.1.1

syntax 3.1.24

tag 3.7.5.17

technical communication 3.16.4

telic 3.7.8.6

telic affordance 3.7.8.7.2

temporal interval 3.7.1.3

temporal ordering relation 3.7.1.4

temporal unit 3.7.1.6

tense 3.7.1.2

term 3.12.1.4

terminal node 3.6.1.2.1

terminal part 3.11.1.6.1

text 3.16.9

time amount 3.7.1.5

TLINK 3.7.1.12

token 3.4.5

tokenization 3.4.6

topological link 3.7.5.5

tractability 3.16.20

trajectory 3.7.5.3.2.1

transcriber 3.15.4

transcription 3.1.29

transcription 3.1.30

transcription system 3.15.3

transliteration 3.4.8

turn unit 3.7.2.6

type 3.3.22

type 3.3.26

type declaration 3.3.23

type hierarchy 3.3.24.1

typed feature structure 3.3.7.2

typing 3.3.33

UML 3.12.2.17

underspecification 3.3.4

unification 3.3.29

Unified Modeling Language 3.12.2.17

Uniform Resource Identifier 3.11.2.1.2

union 3.3.30

unit 3.7.9.4

unit of measurement 3.7.9.4

URI 3.11.2.1.2

URI naming scheme 3.11.2.3

urlify an identifier 3.11.4.2

use case 3.13.15

utterance 3.7.2.2

valence 3.6.7

valency 3.6.7

validity 3.3.21

value 3.3.1

value restriction 3.3.1.3

value scheme 3.12.2.8

value scheme 3.12.3.17

version 3.11.1.2

vertex 3.5.5.1

voxeme 3.7.8.2

voxicon 3.7.8.1

WBS 3.18.2.5

web client 3.11.3.5

well-formedness 3.3.20

word 3.1.9.1

word class 3.1.23

word compound 3.1.9.1.1.1

word form 3.1.13

word lattice 3.4.9

word segmentation 3.14.1

word segmentation unit 3.14.2

word sense 3.1.12

word structure 3.1.16

word-formation 3.1.6.2

work breakdown structure 3.18.2.5

work package 3.18.2.9

work package leader 3.18.2.10

work package team leader 3.18.2.10

working language 3.10.4

WSU 3.14.2

XML 3.12.4.1

XML attribute 3.12.4.4

XML attribute declaration 3.12.4.7

XML container element 3.12.4.3.1

XML document 3.12.4.2

XML element 3.12.4.3

XML element declaration 3.12.4.8

XML namespace 3.12.4.5

XML Schema 3.12.4.6

XML schema datatype 3.12.4.9

XML Schema document 3.12.4.6

1) GoogleDoc is an example of a suitable product available commercially. This information is given for the convenience of users of this document and does not constitute an endorsement by ISO of this product.
