ISO/IEC DIS 23001-19
ISO/IEC DIS 23001-19: Information technology — MPEG systems technologies — Part 19: Carriage of green metadata

ISO/IEC DIS 23001-19:2025(en)

ISO/IEC JTC 1/SC 29/WG 3

Date: 2025-08-12

Secretariat: JISC

Information technology — MPEG systems technologies — Part 19: Carriage of green metadata

© ISO/IEC 2025

All rights reserved. Unless otherwise specified, or required in the context of its implementation, no part of this publication may be reproduced or utilized otherwise in any form or by any means, electronic or mechanical, including photocopying, or posting on the internet or an intranet, without prior written permission. Permission can be requested from either ISO at the address below or ISO’s member body in the country of the requester.

ISO copyright office

CP 401 • Ch. de Blandonnet 8

CH-1214 Vernier, Geneva

Phone: +41 22 749 01 11

Email: copyright@iso.org

Website: www.iso.org

Published in Switzerland

Contents

Foreword

Introduction

1 Scope

2 Normative references

3 Terms, definitions, and abbreviated terms

3.1 Terms and definitions

3.2 Abbreviated terms

4 Overview

4.1 Overall architecture for carriage of green metadata

4.2 Referenceable code points

4.2.1 Uniform resource names

4.2.2 Restricted scheme types

4.2.3 Sample entry types

4.2.4 Track reference types

4.2.5 Box types

5 Carriage of green metadata in ISO Base Media File Format

5.1 General

5.2 Decoder-power indication metadata

5.2.1 Definition

5.2.2 Syntax

5.2.3 Semantics

5.3 Display-power reduction metadata

5.3.1 Display power indication metadata

5.3.2 Display fine control metadata

5.3.3 Display attenuation map metadata

6 Encapsulation and signalling in MPEG-DASH

6.1 General

6.2 Decoder-power indication

6.2.1 Metadata signalling in the MPD manifest file

6.3 Display-power indication

6.3.1 Metadata signalling in the MPD manifest file

6.4 Display attenuation map information

6.4.1 Metadata signalling in the MPD manifest file

Annex A (normative) Green metadata MPEG-DASH schema

Annex B (informative) MPEG-DASH MPD examples

B.1 Example MPD with decoder power indication metadata

B.2 Example MPD with display power indication metadata

B.3 Examples of MPD with display attenuation map metadata

B.3.1 Example 1

B.3.2 Example 2

B.3.3 Example 3

B.3.4 Example 4

Annex C (informative) Conformance and reference software

C.1 Display power reduction using display adaptation

C.1.1 Conformance test vectors

C.1.2 Reference software

C.2 Energy-efficient media selection

C.2.1 Conformance test vectors

C.2.2 Reference software

Annex D (informative) Generation and use of green metadata

D.1 Decoder and display power indication

D.1.1 Metadata generation at the server side

D.1.2 Use of metadata at the client

D.2 Display attenuation maps

D.2.1 Metadata generation at the server side

D.2.2 Use of metadata at the client

Bibliography

Foreword

ISO (the International Organization for Standardization) is a worldwide federation of national standards bodies (ISO member bodies). The work of preparing International Standards is normally carried out through ISO technical committees. Each member body interested in a subject for which a technical committee has been established has the right to be represented on that committee. International organizations, governmental and non-governmental, in liaison with ISO, also take part in the work. ISO collaborates closely with the International Electrotechnical Commission (IEC) on all matters of electrotechnical standardization.

The procedures used to develop this document and those intended for its further maintenance are described in the ISO/IEC Directives, Part 1. In particular, the different approval criteria needed for the different types of ISO documents should be noted. This document was drafted in accordance with the editorial rules of the ISO/IEC Directives, Part 2 (see www.iso.org/directives).

ISO draws attention to the possibility that the implementation of this document may involve the use of (a) patent(s). ISO takes no position concerning the evidence, validity or applicability of any claimed patent rights in respect thereof. As of the date of publication of this document, ISO [had/had not] received notice of (a) patent(s) which may be required to implement this document. However, implementers are cautioned that this may not represent the latest information, which may be obtained from the patent database available at www.iso.org/patents. ISO shall not be held responsible for identifying any or all such patent rights.

Any trade name used in this document is information given for the convenience of users and does not constitute an endorsement.

For an explanation of the voluntary nature of standards, the meaning of ISO specific terms and expressions related to conformity assessment, as well as information about ISO's adherence to the World Trade Organization (WTO) principles in the Technical Barriers to Trade (TBT), see www.iso.org/iso/foreword.html.

This document was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology, Subcommittee SC 29, Coding of audio, picture, multimedia, and hypermedia information.

A list of all parts in the ISO/IEC 23001 series can be found on the ISO website.

Any feedback or questions on this document should be directed to the user’s national standards body. A complete listing of these bodies can be found at www.iso.org/members.html.

Introduction

Green metadata represents information that helps to reduce the energy consumption in the video distribution chain, towards a more sustainable industry.

This document addresses technologies defining the carriage of green metadata for storage and delivery purposes. This document includes (but is not limited to):

— Storage and carriage of green metadata using the ISO Base Media File Format (ISOBMFF) as specified in ISO/IEC 14496-12;

— Encapsulation, signalling, and streaming of green metadata in a media streaming system, e.g. dynamic adaptive streaming over HTTP (DASH) as specified in ISO/IEC 23009-1.

Information technology — MPEG systems technologies — Part 19: Carriage of green metadata

1 Scope

This part of ISO/IEC 23001 defines a storage and delivery format for green metadata as defined in ISO/IEC 23001-11. The green metadata are timed metadata which can be associated with other tracks in the ISO Base Media File Format. Timed metadata such as power consumption information and their metrics are defined in this part for carriage in files based on the ISO Base Media File Format (ISO/IEC 14496-12).

In the context of DASH delivery, the green metadata representations and their association to the media representations are also defined using the signalling mechanisms specified in ISO/IEC 23009‑1:2022 and ISO/IEC 23009‑3[1].

These metadata can be used for multiple purposes, including optimizing power consumption during playback and supporting dynamic adaptive streaming.

2 Normative references

The following documents are referred to in the text in such a way that some or all of their content constitutes requirements of this document. For dated references, only the edition cited applies. For undated references, the latest edition of the referenced document (including any amendments) applies.

ISO/IEC 14496‑12, Information technology — Coding of audio-visual objects — Part 12: ISO base media file format

ISO/IEC 23001‑11:2023, Information technology — MPEG systems technologies — Part 11: Energy-efficient media consumption (green metadata)

ISO/IEC 23009‑1:2022, Information technology — Dynamic adaptive streaming over HTTP (DASH) — Part 1: Media presentation description and segment formats

ISO/IEC 23001‑10, Information technology — MPEG systems technologies — Part 10: Carriage of timed metadata metrics of media in ISO base media file format

ISO/IEC 14496‑5, Information technology — Coding of audio-visual objects — Part 5: Reference software | Rec. ITU-T H.264.2: Reference software for ITU-T H.264 advanced video coding

ISO/IEC 14496‑15:2022, Information technology — Coding of audio-visual objects — Part 15: Carriage of network abstraction layer (NAL) unit structured video in the ISO base media file format

3 Terms, definitions, and abbreviated terms

3.1 Terms and definitions

For the purposes of this document, the terms and definitions given in ISO/IEC 23001-11 apply.

ISO and IEC maintain terminology databases for use in standardization at the following addresses:

— ISO Online browsing platform: available at https://www.iso.org/obp

— IEC Electropedia: available at https://www.electropedia.org/

3.2 Abbreviated terms

DASH      dynamic adaptive streaming over HTTP (specified in ISO/IEC 23009-1)

ISOBMFF   ISO base media file format (specified in ISO/IEC 14496-12)


4 Overview

4.1 Overall architecture for carriage of green metadata

[Editor’s Note: To be completed.]

4.2 Referenceable code points

4.2.1 Uniform resource names

The URNs specified in this document are listed in Table 1.

Table 1 — URNs specified in this document.

| URN | Clause | Informative description |
|---|---|---|
| urn:mpeg:mpegI:green:2025 | 6.4.1.1 | Namespace for the XML elements and attributes specified in this document |
| urn:mpeg:mpegI:green:2025:ami | 6.4.1.1 | Scheme identifier for the display attenuation map DASH MPD descriptor |
| urn:mpeg:mpegI:green:role:2025 | TBD | Scheme identifier for a DASH MPD role descriptor for green metadata |

4.2.2 Restricted scheme types

The restricted scheme types specified in this document are listed in Table 2.

Table 2 — Restricted scheme types specified in this document.

| Restricted scheme type | Clause | Informative description |
|---|---|---|
| gmat | 5.3.3.2 | Restricted scheme type for a display attenuation map track |

4.2.3 Sample entry types

The sample entry types specified in this document are listed in Table 3.

Table 3 — Sample entry types specified in this document.

| Sample entry type | Clause | Informative description |
|---|---|---|
| depi | 5.2.1 | Sample entry for a track carrying Decoder-Power Indication metadata |
| dipi | 5.3.1.1 | Sample entry for a track carrying Display-Power Indication metadata |
| dfce | 5.3.2.1 | Sample entry for a track carrying Display Fine Control metadata |

4.2.4 Track reference types

The track reference types specified in this document are listed in Table 4.

Table 4 — Track reference types specified in this document.

| Track reference type | Clause | Informative description |
|---|---|---|
| gmam | 5.3.3.2.1 | Referenced track is a video track to which the display attenuation map applies |

4.2.5 Box types

The box types specified in this document are listed in Table 5 together with references to the corresponding clauses in this specification. Related container boxes specified in ISOBMFF are marked "ISOBMFF" instead of a clause reference. Non-related ISOBMFF boxes are not included in the table. Mandatory boxes are, as in ISOBMFF, marked with an asterisk. Box types without a four-character code are marked with '-' in the structure.

Table 5 — Box types specified in this document.

Box structure (indentation shows containment)   Clause     Informative description
moov *                                          ISOBMFF    container for all the metadata
  trak *                                        ISOBMFF    container for an individual track or stream
    mdia *                                      ISOBMFF    container for the media information in a track
      minf *                                    ISOBMFF    media information container
        stbl *                                  ISOBMFF    sample table box, container for the time/space map
          stsd *                                ISOBMFF    sample descriptions (codec types, initialization etc.)
            -                                   ISOBMFF    sample entry or restricted sample entry
              rinf                              ISOBMFF    restricted scheme info box
                frma                            ISOBMFF    original format box
                schm                            ISOBMFF    scheme type box
                schi                            ISOBMFF    scheme information box
                  amid                          5.3.3.1    attenuation map information box
            depi                                5.2.1      decoder power indication metadata sample entry
            dipi                                5.3.1      display power indication metadata sample entry
            dfce                                5.3.2.1    display fine control metadata sample entry
              dfcC                              5.3.2.1    display fine control configuration box

5 Carriage of green metadata in ISO Base Media File Format

5.1 General

When green metadata is carried in a file based on the ISO Base Media File Format, it shall be carried in metadata tracks within that file. Different green metadata types and their corresponding storage formats are identified by unique sample entry codes.

A metadata track carrying green metadata is linked to the track it describes by means of a ‘cdsc’ (content describes) track reference.

5.2 Decoder-power indication metadata

5.2.1 Definition

Sample Entry Type: ‘depi’
Container: Sample Description Box (‘stsd’)
Mandatory: No
Quantity: 0 or 1

The Decoder-Power Indication metadata is defined in ISO/IEC 23001-11. It provides decoder complexity reduction ratios for the media track to which the metadata track refers by means of ‘cdsc’ reference.

5.2.2 Syntax

The Decoder-Power Indication metadata sample entry shall be as follows.

class DecoderPowerIndicationMetaDataSampleEntry()
   extends MetaDataSampleEntry ('depi') {
}

 

The Decoder-Power Indication metadata sample shall conform to the following syntax:

aligned(8) class DecoderPowerIndicationMetaDataSample(){

   unsigned int(8) Dec_ops_reduction_ratio_from_max;

   signed int(16) Dec_ops_reduction_ratio_from_prev;

}
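As an informative illustration (not part of this specification), the sample defined above is a fixed three-byte record: one unsigned 8-bit field followed by one signed 16-bit field. The following Python sketch serializes and parses it; the helper names are invented for this example, and network (big-endian) byte order is assumed, as is conventional in ISOBMFF.

```python
import struct

# Assumed layout: unsigned int(8) Dec_ops_reduction_ratio_from_max,
# signed int(16) Dec_ops_reduction_ratio_from_prev, big-endian.
_SAMPLE_FMT = ">Bh"

def pack_depi_sample(ratio_from_max, ratio_from_prev):
    """Serialize one Decoder-Power Indication metadata sample (illustrative)."""
    return struct.pack(_SAMPLE_FMT, ratio_from_max, ratio_from_prev)

def unpack_depi_sample(data):
    """Parse the two reduction-ratio fields from a 3-byte sample (illustrative)."""
    return struct.unpack(_SAMPLE_FMT, data)

raw = pack_depi_sample(40, -5)   # 40 below max; 5 more than the previous sample
assert len(raw) == 3
assert unpack_depi_sample(raw) == (40, -5)
```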

5.2.3 Semantics

Semantics are defined in ISO/IEC 23001-11.

5.3 Display-power reduction metadata

The Display-Power Reduction metadata is defined in ISO/IEC 23001-11. Display-Power Reduction metadata provides frame statistics and quality indicators for the media track that the metadata track refers to by means of ‘cdsc’ reference. These metadata allow the client to attain a specified quality level by scaling frame-buffer pixels and to reduce power correspondingly by decreasing the display backlight or OLED voltage.

Display-Power Reduction metadata is of three types:

1. metadata that indicates power saving at different quality levels over the sample duration. This metadata shall use the ‘dipi’ (Display Power Indication) sample entry type.

2. metadata that allows fine control of the display to achieve power reduction at a specified quality level. This metadata shall use the ‘dfce’ (Display Fine Control) sample entry type.

3. metadata that conveys pixel-wise information that can be applied to frames in the original video sequence to reduce the total energy consumption resulting from rendering the frames of that video. This metadata shall use the ‘amid’ (Display Attenuation Map) sample entry type.

Static metadata for the display fine control is stored in the sample entry. Dynamic metadata is stored in the samples.

5.3.1 Display power indication metadata

5.3.1.1 Definition

Sample Entry Type: ‘dipi’
Container: Sample Description Box (‘stsd’)
Mandatory: No
Quantity: 0 or 1

This metadata indicates potential power savings at different quality levels over the sample duration.

5.3.1.2 Syntax

Display Power Indication metadata shall use the following sample entry:

aligned(8) class DisplayPowerIndicationMetaDataSampleEntry()
   extends MetaDataSampleEntry ('dipi') {
}

 

The Display Power Indication metadata sample shall use the following syntax:

class QualityLevels (num_quality_levels) {

   unsigned int(8) rgb_component_for_infinite_psnr;

   for (i = 1; i <= num_quality_levels; i++) {

      unsigned int(8) max_rgb_component;

      unsigned int(8) scaled_psnr_rgb;

   }

}

 

aligned(8) class DisplayPowerIndicationMetaDataSample() {
   unsigned int(4) num_quality_levels;
   unsigned int(4) reserved = 0;
   QualityLevels(num_quality_levels);
}

 

NOTE The PSNR variables appearing in the syntax above are as defined in ISO/IEC 23001-11 and should not be confused with the PSNR metric defined in clause 4.2 of ISO/IEC 23001-10.
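As an informative sketch (with invented helper names, not part of this specification), the variable-length sample above can be serialized as one header byte carrying num_quality_levels in its upper nibble, followed by the QualityLevels payload:

```python
def pack_dipi_sample(rgb_inf_psnr, levels):
    """Serialize one Display Power Indication metadata sample (illustrative).

    levels: list of (max_rgb_component, scaled_psnr_rgb) pairs; its length is
    num_quality_levels, a 4-bit field, so at most 15 entries.
    """
    assert 0 <= len(levels) <= 15
    out = bytearray()
    out.append(len(levels) << 4)   # num_quality_levels | 4 reserved zero bits
    out.append(rgb_inf_psnr)       # rgb_component_for_infinite_psnr
    for max_rgb, psnr in levels:
        out += bytes((max_rgb, psnr))
    return bytes(out)

sample = pack_dipi_sample(255, [(230, 98), (210, 94)])
assert sample[0] >> 4 == 2         # two quality levels signalled
assert len(sample) == 2 + 2 * 2    # header byte + QualityLevels payload
```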

5.3.1.3 Semantics

Semantics are defined in ISO/IEC 23001-11.

5.3.2 Display fine control metadata

5.3.2.1 Definition

Sample Entry Type: ‘dfce’
Container: Sample Description Box (‘stsd’)
Mandatory: No
Quantity: 0 or 1

The dynamic Display Fine Control metadata is stored in the samples and is associated with one or more video frames.

The Decoding Time to Sample box provides the decoding time for the sample so that the metadata contained therein is made available to the display with sufficient lead time relative to the video composition time. Note that the video composition time and metadata composition time are identical. The lead time is required because display settings must be adjusted in advance of presentation time for correct operation. If num_constant_backlight_voltage_time_intervals > 1, then the lead time should be larger than the largest constant_backlight_voltage_time_interval.

5.3.2.2 Syntax

The Display Fine Control metadata sample entry shall store static metadata as follows.

class DisplayFineControlMetaDataSampleEntry()
   extends MetaDataSampleEntry ('dfce') {
   DisplayFineControlConfigurationBox();
}

 

aligned(8) class DisplayFineControlConfigurationBox
   extends FullBox('dfcC', version = 0, flags = 0) {
   unsigned int(2) num_constant_backlight_voltage_time_intervals;
   unsigned int(6) reserved = 0;
   unsigned int(16) constant_backlight_voltage_time_interval[
                      num_constant_backlight_voltage_time_intervals ];
   unsigned int(2) num_max_variations;
   unsigned int(6) reserved = 0;
   unsigned int(16) max_variation[ num_max_variations ];
}

 

The Display-Fine Control metadata sample shall use the following syntax:

class QualityLevels (num_quality_levels) {

   unsigned int(8) rgb_component_for_infinite_psnr;

   for (i = 1; i <= num_quality_levels; i++) {

      unsigned int(8) max_rgb_component;

      unsigned int(8) scaled_psnr_rgb;

   }

}

 

class MetadataSet (num_quality_levels) {

   unsigned int(8) lower_bound;

   if (lower_bound > 0)

      unsigned int(8) upper_bound;

   QualityLevels(num_quality_levels);

}

 

class DisplayPowerReductionMetaDataSample {

   unsigned int(4) num_quality_levels;

   unsigned int(4) reserved = 0;

 

   for (k=0; k<num_constant_backlight_voltage_time_intervals; k++)

      for (j = 0; j < num_max_variations; j++)

         MetadataSet(num_quality_levels);

}
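An informative Python sketch (helper names invented for this example) illustrates the conditional upper_bound field of MetadataSet and the nesting order of the two loops in the sample syntax:

```python
def pack_metadata_set(lower, upper, rgb_inf, levels):
    """Serialize one MetadataSet; upper_bound is present only when lower_bound > 0."""
    out = bytearray([lower])
    if lower > 0:
        out.append(upper)
    out.append(rgb_inf)                # rgb_component_for_infinite_psnr
    for max_rgb, psnr in levels:       # one pair per quality level
        out += bytes((max_rgb, psnr))
    return bytes(out)

def pack_dfce_sample(metadata_sets, num_quality_levels):
    """Serialize one Display Fine Control metadata sample (illustrative).

    metadata_sets must contain num_constant_backlight_voltage_time_intervals
    * num_max_variations entries, in the nesting order of the k/j loops.
    """
    out = bytearray([num_quality_levels << 4])   # 4-bit count + 4 reserved bits
    for args in metadata_sets:
        out += pack_metadata_set(*args)
    return bytes(out)

# One interval, one variation, one quality level, lower_bound == 0
s = pack_dfce_sample([(0, None, 255, [(230, 98)])], num_quality_levels=1)
assert len(s) == 1 + (1 + 1 + 2)   # header + MetadataSet without upper_bound
```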

5.3.2.3 Semantics

Semantics are defined in ISO/IEC 23001-11.

5.3.3 Display attenuation map metadata

5.3.3.1 General

A display attenuation map video is a 2D video where each frame conveys pixel-wise information that can be applied to a corresponding frame in the original video sequence through a processing operation to reduce the total energy consumption resulting from rendering the frames of that video.

5.3.3.2 Attenuation map information box

5.3.3.2.1 Definition

An AttenuationMapInformationBox contains information about the characteristics of the display attenuation map data stream carried by the track in which it is signalled. It is identified by the 4CC ‘amid’. The information in the payload of the track may include pre-processing operations that should be applied to the samples of the display attenuation map as well as other information that may distinguish one display attenuation map track from another for the same content, which enables a player to select the most suitable track based on its energy reduction strategy.

5.3.3.2.2 Syntax

aligned(8) class AMIApproximationModel() {

   bit(4) reserved = 0;

   unsigned int(4) ami_map_approx_model;

}

 

aligned(8) class AMIWindowInfo() {

   unsigned int(8) ami_window_x;

   unsigned int(8) ami_window_y;

   unsigned int(8) ami_window_width;

   unsigned int(8) ami_window_height;

}

 

aligned(8) class AMIVideoQualityInfo() {

   bit(5) reserved = 0;

   unsigned int(3) ami_quality_metric;

   unsigned int(8) ami_quality_reduction;

}

 

aligned(8) class AMIPreprocessingInfo() {

   bit(6) reserved = 0;

   unsigned int(2) ami_preprocessing_type;

   unsigned int(8) ami_max_value;

   unsigned int(8) ami_preprocessing_scale;

}

 

aligned(8) class AttenuationMapInformationBox() extends FullBox('amid', version = 0, flags) {

 

   bit(3) reserved = 0;

   unsigned int(1) ami_preprocessing_info_present_flag;

   unsigned int(1) ami_approx_model_present_flag;

   unsigned int(1) ami_window_info_present_flag;

   unsigned int(1) ami_video_quality_info_present_flag;

 

   unsigned int(5) ami_energy_reduction_rate;

   unsigned int(4) ami_display_model;

   unsigned int(4) ami_attenuation_use_idc;

   unsigned int(4) ami_attenuation_component_idc;

   

   if (ami_preprocessing_info_present_flag) {

      AMIPreprocessingInfo();

   }

 

   if (ami_approx_model_present_flag) {

      AMIApproximationModel();

   }

 

   if (ami_window_info_present_flag) {

      AMIWindowInfo();

   }

 

   if (ami_video_quality_info_present_flag) {

      AMIVideoQualityInfo();

   }

 

}
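The bit layout above packs three reserved bits, four presence flags, and the four fixed fields into the first three bytes, followed by the optional byte-aligned structures. As an informative illustration (names invented, not part of this specification), a small Python parser for the box payload after the FullBox header:

```python
class BitReader:
    """Minimal MSB-first bit reader over a bytes object."""
    def __init__(self, data):
        self.data, self.pos = data, 0
    def read(self, n):
        v = 0
        for _ in range(n):
            byte = self.data[self.pos // 8]
            v = (v << 1) | ((byte >> (7 - self.pos % 8)) & 1)
            self.pos += 1
        return v

def parse_amid_payload(data):
    """Parse AttenuationMapInformationBox fields after the FullBox header (illustrative)."""
    r = BitReader(data)
    r.read(3)                                   # reserved
    flags = {name: r.read(1) for name in
             ("preprocessing", "approx_model", "window", "video_quality")}
    box = {
        "energy_reduction_rate": r.read(5),
        "display_model": r.read(4),
        "attenuation_use_idc": r.read(4),
        "attenuation_component_idc": r.read(4),
    }
    if flags["preprocessing"]:                  # AMIPreprocessingInfo
        r.read(6)                               # reserved
        box["preprocessing_type"] = r.read(2)
        box["max_value"] = r.read(8)
        box["preprocessing_scale"] = r.read(8)
    if flags["approx_model"]:                   # AMIApproximationModel
        r.read(4)                               # reserved
        box["map_approx_model"] = r.read(4)
    if flags["window"]:                         # AMIWindowInfo: x, y, width, height
        box["window"] = tuple(r.read(8) for _ in range(4))
    if flags["video_quality"]:                  # AMIVideoQualityInfo
        r.read(5)                               # reserved
        box["quality_metric"] = r.read(3)
        box["quality_reduction"] = r.read(8)
    return box
```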

5.3.3.2.3 Semantics

The semantics of the fields defined in AttenuationMapInformationBox are as follows:

ami_preprocessing_info_present_flag is a flag indicating whether preprocessing information is present. Value 1 indicates that the box contains preprocessing information.

ami_approx_model_present_flag is a flag indicating whether approximation model information is present. Value 1 indicates that the box contains approximation model information.

ami_window_info_present_flag is a flag indicating whether window information is present. Value 1 indicates that the box contains window information.

ami_video_quality_info_present_flag is a flag indicating whether video quality information is present. Value 1 indicates that the box contains video quality information.

ami_energy_reduction_rate indicates the expected energy saving rate (percentage) when the associated video is displayed after applying the display attenuation map sample values on the sample values of the associated video.

ami_display_model indicates the display models on which the display attenuation map sample values may be used. The semantics of the bits of this field are described in Table 6.

Table 6 — Semantics of the bits of the ami_display_model field.

| Bit number | Display model |
|---|---|
| 0 | Transmissive pixel |
| 1 | Emissive pixel |
| 2..3 | Reserved for future use |

ami_attenuation_use_idc indicates which operation shall be used to apply the display attenuation map sample values to the corresponding frame in the associated video before rendering the frame on the display. The semantics of the values assigned to this field are described in Table 7.

Table 7 — Semantics of the values assigned to ami_attenuation_use_idc.

| Value | Description |
|---|---|
| 0 | The display attenuation map sample values shall be added to the associated video frame sample values. |
| 1 | The display attenuation map sample values shall be subtracted from the associated video frame sample values. |
| 2 | The associated video frame sample values shall be multiplied by the display attenuation map sample values. |
| 3 | The display attenuation map sample values shall be applied to the associated video frame according to a proprietary user-defined process. |
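The operations of Table 7 can be illustrated per sample value with a hedged Python sketch (function name and the 8-bit clipping convention are assumptions of this example, not requirements of this specification):

```python
def apply_attenuation(pixel, att, use_idc):
    """Apply one attenuation map sample to one 8-bit video sample value
    according to ami_attenuation_use_idc (illustrative)."""
    if use_idc == 0:      # add
        out = pixel + att
    elif use_idc == 1:    # subtract
        out = pixel - att
    elif use_idc == 2:    # multiply; att is assumed already scaled to [0, 1],
                          # e.g. by the 1/255 scaling of ami_preprocessing_scale
        out = pixel * att
    else:                 # value 3: proprietary user-defined process
        raise NotImplementedError("user-defined attenuation process")
    return max(0, min(255, round(out)))   # clip to the 8-bit sample range

assert apply_attenuation(200, 30, 1) == 170    # subtraction
assert apply_attenuation(200, 0.8, 2) == 160   # multiplication after scaling
```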

ami_attenuation_component_idc indicates on which color component(s) of the associated video the display attenuation map shall be applied using the operation defined by ami_attenuation_use_idc. It also specifies the number of components of the display attenuation map. The semantics of the values assigned to this field are described in Table 8.

Table 8 — Semantics of the values assigned to ami_attenuation_component_idc.

| Value | Description |
|---|---|
| 0 | The display attenuation map contains only one component, and this component shall be applied to the luma component of the associated video. |
| 1 | The display attenuation map contains two components; the first component shall be applied to the luma component of the associated video, and the second component should be applied to both chroma components of the associated video. |
| 2 | The display attenuation map contains only one component, and this component shall be applied to the luma component and the chroma components of the associated video. |
| 3 | The display attenuation map contains only one component, and this component shall be applied to the RGB components (after YUV to RGB conversion) of the associated video. |
| 4 | The display attenuation map contains three components, and these components shall be applied respectively to the luma and chroma components of the associated video. |
| 5 | The display attenuation map contains three components, and these components shall be applied respectively to the RGB components (after YUV to RGB conversion) of the associated video. |
| 6 | The mapping between the components of the display attenuation map and the components of the associated video to which the display attenuation map is applied is based on some proprietary user-defined process. |
| 7..15 | Reserved for future use |

Key: YUV = colour space with luma (Y) and chroma (U, V) components

ami_preprocessing_type indicates which type of pre-processing interpolation model should be used to re-sample the display attenuation map sample values at the same resolution as the associated video before applying it to the associated video frame. The semantics of the values assigned to this field are described in Table 9.

Table 9 — Semantics of the values assigned to ami_preprocessing_type.

| Value | Description |
|---|---|
| 0 | Bicubic interpolation |
| 1 | Bilinear interpolation |
| 2 | Lanczos interpolation |
| 3 | User-defined interpolation process |

ami_max_value indicates the maximum value of the display attenuation map. This value may be used to further adjust the dynamic range of the encoded display attenuation map in the scaling process.

ami_preprocessing_scale indicates which scaling shall be applied to obtain the display attenuation map sample values before applying them on the sample values of the associated video. The semantics of the values assigned to this field are described in Table 10.

Table 10 — Semantics of the values assigned to ami_preprocessing_scale.

| Value | Description |
|---|---|
| 0 | A scaling of 1/255 shall be applied. |
| 1 | User-defined scaling operation. |
| 2..7 | Reserved for future use. |

ami_map_approx_model specifies which model should be used to extrapolate a display attenuation map associated with one energy reduction rate to a display attenuation map with a different energy reduction rate. The semantics of the values assigned to this field are described in Table 11.

Table 11 — Semantics of the values assigned to ami_map_approx_model.

| Value | Description |
|---|---|
| 0 | Extrapolation to another target energy reduction rate should be applied through a linear scaling of the display attenuation map sample values, given its ami_energy_reduction_rate value and the target energy reduction rate. |
| 1 | User-defined extrapolation process. |
| 2..3 | Reserved for future use. |
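The linear model (ami_map_approx_model equal to 0) can be sketched as follows; this is informative, the function name is invented, and clipping to the 8-bit sample range is an assumption of the example:

```python
def extrapolate_map(att_values, current_rate, target_rate):
    """Linearly rescale attenuation map sample values from the signalled
    ami_energy_reduction_rate to a target energy reduction rate (illustrative)."""
    scale = target_rate / current_rate
    # Clip each rescaled sample to the assumed 8-bit range of the map.
    return [max(0, min(255, round(v * scale))) for v in att_values]

# Halving the target energy reduction rate halves the attenuation samples.
assert extrapolate_map([100, 50], current_rate=20, target_rate=10) == [50, 25]
```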

ami_window_x indicates the x-coordinate of the top-left corner of the bounding window defining a region of the associated video to which the display attenuation map carried by the display attenuation map track shall be applied.

ami_window_y indicates the y-coordinate of the top-left corner of the bounding window defining a region of the associated video to which the display attenuation map carried by the display attenuation map track shall be applied.

ami_window_width indicates the width, in number of pixels, of the bounding window defining a region of the associated video to which the display attenuation map carried by the display attenuation map track shall be applied.

ami_window_height indicates the height, in number of pixels, of the bounding window defining a region of the associated video to which the display attenuation map carried by the display attenuation map track shall be applied.

ami_quality_metric indicates the type of the objective quality metric used for the measured quality reduction value resulting from applying the display attenuation map to the associated video and assigned to the ami_quality_reduction field. The semantics of the values assigned to this field are described in Table 12.

Table 12 — Semantics of the values assigned to ami_quality_metric.

| Value | Quality metric |
|---|---|
| 0 | PSNR |
| 1 | SSIM |
| 2 | wPSNR |
| 3 | WS-PSNR |
| 4 | User-defined |

ami_quality_reduction specifies the percentage of quality reduction that can be expected in the associated video as a result of applying the display attenuation map to it.

5.3.3.3 Display attenuation map tracks

5.3.3.3.1 General

A display attenuation map track is a restricted video track with the sample entry type 'resv'. The original sample entry type, which is based on the video codec used for encoding the stream, is stored within the OriginalFormatBox in the RestrictedSchemeInfoBox. The scheme_type field in SchemeTypeBox shall be set to 'gmat', indicating a display attenuation map restricted scheme. The SchemeInformationBox shall include an AttenuationMapInformationBox, as defined in subclause 5.3.3.2. In the track header, the track_in_movie flag shall be set to 0 to indicate that this track should not be presented alone.

5.3.3.3.2 Association with video tracks

A TrackReferenceTypeBox with the reference type 'gmam' shall be added to a TrackReferenceBox within the TrackBox of the track carrying the display attenuation map data. The TrackReferenceTypeBox shall contain an array of track_IDs designating the identifiers for the referenced video tracks.
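A TrackReferenceTypeBox has the standard ISOBMFF box layout: a 32-bit size, the 4CC reference type, then the array of 32-bit track_IDs. An informative Python sketch (function name invented) parses such a box:

```python
import struct

def parse_tref_type_box(data):
    """Parse one TrackReferenceTypeBox: 32-bit size, 4CC reference type,
    then (size - 8) / 4 unsigned 32-bit track_IDs (illustrative)."""
    size, fourcc = struct.unpack(">I4s", data[:8])
    n_ids = (size - 8) // 4
    track_ids = struct.unpack(">%dI" % n_ids, data[8:size])
    return fourcc.decode("ascii"), list(track_ids)

# A 'gmam' reference pointing at the video tracks with track_ID 1 and 2
box = struct.pack(">I4sII", 16, b"gmam", 1, 2)
assert parse_tref_type_box(box) == ("gmam", [1, 2])
```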

5.3.3.3.3 Sample format

Each sample in a display attenuation map track carries a sequence of video NAL units corresponding to the encoded display attenuation map for a single access unit (AU) in the associated video track(s) and shall be encapsulated based on the sample formats defined in ISO/IEC 14496-15:2022.

5.3.3.3.4 Signalling alternative attenuation map tracks

Multiple display attenuation map tracks may be present in an ISOBMFF file. When more than one version of a display attenuation map is available for the same video track in the ISOBMFF container (e.g., different energy consumption levels or different video qualities), each version is carried in a separate display attenuation map track.

Display attenuation map tracks that are alternatives of each other shall be signalled as such, either by setting the alternate_group field in their respective TrackHeaderBoxes to the same value or by grouping the tracks together with an EntityToGroupBox with grouping_type equal to 'altr'. This indicates that the display attenuation map tracks mapped to this grouping are alternatives to each other, and only one of them should be processed.

6 Encapsulation and signalling in MPEG-DASH

6.1 General

This clause explains how the green metadata can be computed at the server for adaptive streaming scenarios and how such metadata can be used at the client.

In the context of DASH delivery, a specific adaptation set within the MPD can define the available green metadata representations and their association to the available media representations, using the signalling mechanisms specified in ISO/IEC 23009‑1:2022 and ISO/IEC 23009‑3[1].

6.2 Decoder-power indication

6.2.1 Metadata signalling in the MPD manifest file

In the DASH context, the metadata files created for one or multiple video Representations are considered as metadata Representations. The available metadata Representations are signalled in a specific Adaptation Set within the MPD. The association of a metadata Representation with a media Representation is signalled in the MPD through the @associationId and @associationType attributes. A metadata Segment and its associated media Segment(s) are time aligned on Segment boundaries.

The Decoder-Power Indication metadata Representation is associated with a single media Representation as shown in Figure 6.2.

Figure 6.2 — One metadata representation for one media representation.

6.3 Display-power indication

6.3.1 Metadata signalling in the MPD manifest file

The Display-Power Indication metadata Representation is associated with all the available media Representations as shown in Figure 6.3.

Figure 6.3 — One metadata representation for all media representations.

6.4 Display attenuation map information

6.4.1 Metadata signalling in the MPD manifest file

Display attenuation map Representations for a video are signalled in their own Adaptation Set within the MPD file, henceforth referred to as a Display Attenuation Map Adaptation Set. The @codecs attribute for a Display Attenuation Map Adaptation Set, or for the Representations of this Adaptation Set if @codecs is not signalled for the AdaptationSet element, is set based on the codec used for encoding the display attenuation map. The value of @codecs shall be set to 'resv.gmat.XXXX', where XXXX corresponds to the four-character code (4CC) of the video codec and shall be identical to the original_format field in the RestrictedSchemeInfoBox of the sample entry of the corresponding ISOBMFF track, as specified in subclause 5.3.3.
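For illustration, a client can recover the original_format 4CC from such a @codecs value with a small helper (non-normative sketch):

```python
def original_format_from_codecs(codecs: str) -> str:
    """Extract the video codec 4CC from a display attenuation map @codecs
    value of the form 'resv.gmat.XXXX[...]' (illustrative, not normative)."""
    prefix = "resv.gmat."
    if not codecs.startswith(prefix):
        raise ValueError("not a display attenuation map codecs string")
    # The first dotted component after the prefix is the 4CC of the video codec.
    return codecs[len(prefix):].split(".")[0]

# The AVC-based example value used in the MPD examples of this document.
assert original_format_from_codecs("resv.gmat.avc1.4D401F") == "avc1"
```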

The association of a display attenuation map Representation element with one or more video Representation(s) is signalled in the MPD through the @associationId and @associationType attributes. The @associationId attribute is set to a space-separated list that includes the @id attribute values of the associated video Representations. The @associationType attribute shall be set to 'amit', indicating that this association is for display attenuation map information.

Multiple display attenuation maps with different characteristics (e.g., different values of ami_energy_reduction_rate, different values of ami_display_model, different resolutions) can be generated for the same video Representation; each alternative display attenuation map is represented by one Representation in the Display Attenuation Map Adaptation Set. As specified in subclause 5.3.3.3.4, the tracks carrying these alternative display attenuation map Representations shall be indicated as alternatives of each other by the same value of alternate_group in their TrackHeaderBox(es).
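How a client chooses among such alternatives is outside the scope of this document; a hypothetical selection policy might pick the Representation whose energy reduction rate is closest to a client-side target:

```python
def pick_attenuation_map(reps, target_rate):
    """Choose one Representation among alternatives (hypothetical client policy).

    Representations are sketched as dicts carrying the value of the
    @energyReductionRate attribute; the one closest to target_rate wins.
    """
    return min(reps, key=lambda r: abs(r["energyReductionRate"] - target_rate))

# Two alternative maps, as in the Annex examples: 20 % and 40 % energy reduction.
reps = [{"id": "ami0", "energyReductionRate": 20},
        {"id": "ami1", "energyReductionRate": 40}]
assert pick_attenuation_map(reps, 35)["id"] == "ami1"
```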

A number of Display Attenuation Map Adaptation Sets may be associated with the same video; for example, each Display Attenuation Map Adaptation Set may relate to a particular region (window) within the video frames. When more than one Display Attenuation Map Adaptation Set is available for the same video content, the Display Attenuation Map Adaptation Sets may be grouped with the associated video Adaptation Set using a Green Video Preselection to provide different experiences. For example, each preselection may be tailored to a certain lighting condition (e.g., indoors vs. outdoors).

A Green Video Preselection is signalled using a Preselection element, as defined in ISO/IEC 23009-1, with an id list for the @preselectionComponents attribute that includes the identifier of the Main Adaptation Set for the media (i.e., the video Adaptation Set) followed by the identifiers of the associated Display Attenuation Map Adaptation Sets. The @codecs attribute for the Preselection is set based on the codec used by the Main Adaptation Set (i.e., the video Adaptation Set).
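The @preselectionComponents ordering described above can be sketched with a hypothetical helper:

```python
def preselection_components(main_as_id, map_as_ids):
    """Build the @preselectionComponents value for a Green Video Preselection:
    the Main (video) Adaptation Set id first, followed by the ids of the
    associated Display Attenuation Map Adaptation Sets (illustrative sketch)."""
    return " ".join([main_as_id] + list(map_as_ids))

# Matches the "video am1" grouping used in the Example 4 MPD of this document.
assert preselection_components("video", ["am1"]) == "video am1"
```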

Descriptors

To signal the characteristics of the display attenuation map(s) carried by a Display Attenuation Map Adaptation Set, an AttenuationMap descriptor is defined. The XML elements and attributes for this descriptor are defined in a separate namespace "urn:mpeg:mpegI:green:2025". The namespace designator "green:" is used to refer to this namespace.

This descriptor is an EssentialProperty descriptor with the @schemeIdUri attribute set to the unique URI "urn:mpeg:mpegI:green:2025:ami".

At most one AttenuationMap descriptor shall be present at the Adaptation Set level and/or at the Representation level for each Representation in a Display Attenuation Map Adaptation Set.

The @value attribute of the AttenuationMap descriptor shall not be present. Table 13 lists the elements and attributes of the AttenuationMap descriptor.

Table 13 — Elements and attributes of the AttenuationMap descriptor.

Elements and attributes | Use | Data type | Description

AMI | M | green:AMInfoType | An element whose attributes and sub-elements specify information for the display attenuation map present in the Representation(s) of the Adaptation Set.
AMI@displayModel | O | green:DisplayModelType | Indicates the display models on which the display attenuation map may be used. For ISO Base Media File Format Segments, the value of the @displayModel attribute shall be equal to ami_display_model in the AttenuationMapInformationBox in sample entries of the Initialization Segment.
AMI@attenuationUse | O | green:FourValueRangeType | Indicates which operation shall be used to apply the display attenuation map sample values to the corresponding frame in the associated video before rendering the frame on the display. For ISO Base Media File Format Segments, the value of the @attenuationUse attribute shall be equal to ami_attenuation_use_idc in the AttenuationMapInformationBox in sample entries of the Initialization Segment.
AMI@attenuationComponentIdc | O | green:AttenuationComponentIdcType | Indicates on which colour component(s) of the associated video the display attenuation map shall be applied using the operation defined by @attenuationUse. It also specifies how many components the display attenuation map has. For ISO Base Media File Format Segments, the value of the @attenuationComponentIdc attribute shall be equal to ami_attenuation_component_idc in the AttenuationMapInformationBox in sample entries of the Initialization Segment.
AMI@energyReductionRate | O | green:EnergyReductionRateType | Indicates the expected energy saving rate (percentage) when the video is displayed after applying the display attenuation map sample values on the sample values of the associated video. For ISO Base Media File Format Segments, the value of the @energyReductionRate attribute shall be equal to ami_energy_reduction_rate in the AttenuationMapInformationBox in sample entries of the Initialization Segment.
AMI.QualityInfo | O | green:QualityInfoType | An element whose attributes specify the quality reduction rate when the display attenuation map is applied to the associated video representation.
AMI.QualityInfo@metric | M | xs:string | Indicates the video quality metric used to assess the quality reduction.
AMI.QualityInfo@reduction | M | xs:unsignedByte | Indicates the percentage of quality reduction expected in the associated video as a result of applying the display attenuation map to it.
AMI.PreprocessingInfo | O | green:PreprocessingInfoType | An element whose attributes and sub-elements specify information for the pre-processing interpolation model that should be used.
AMI.PreprocessingInfo@type | M | green:FourValueRangeType | Indicates which type of pre-processing interpolation model should be used to re-sample the display attenuation map sample values at the same resolution as the associated video before applying it to the associated video frame.
AMI.PreprocessingInfo@max_value | M | xs:unsignedByte | Indicates the maximum value of the display attenuation map. This value may be optionally used to further adjust the dynamic range of the encoded display attenuation map in the scaling process.
AMI.PreprocessingInfo@scale | M | xs:unsignedByte | Indicates the scaling that shall be applied to obtain the display attenuation map sample values before applying them on the sample values of the associated video.
AMI.ApproxModel | O | green:ApproxModelType | An element whose attributes specify the model that should be used to extrapolate the display attenuation map with an individual energy reduction rate to another set of display attenuation maps with different energy reduction rates.
AMI.ApproxModel@type | M | green:FourValueRangeType | Indicates the type of the model that should be used to extrapolate the display attenuation map with a certain energy reduction rate to another set of display attenuation maps with different energy reduction rates.
AMI.WindowInfo | O | green:WindowInfoType | An element whose attributes specify a bounding window defining a region of the associated video to which the display attenuation map shall be applied.
AMI.WindowInfo@x | M | xs:unsignedByte | Indicates the x-coordinate of the top-left corner of the bounding window defining a region of the associated video to which the display attenuation map carried by the display attenuation map track shall be applied.
AMI.WindowInfo@y | M | xs:unsignedByte | Indicates the y-coordinate of the top-left corner of the bounding window defining a region of the associated video to which the display attenuation map carried by the display attenuation map track shall be applied.
AMI.WindowInfo@width | M | xs:unsignedByte | Indicates the width, in number of pixels, of the bounding window defining a region of the associated video to which the display attenuation map carried by the display attenuation map track shall be applied.
AMI.WindowInfo@height | M | xs:unsignedByte | Indicates the height, in number of pixels, of the bounding window defining a region of the associated video to which the display attenuation map carried by the display attenuation map track shall be applied.

Key:

       For attributes: M=Mandatory, O=Optional, OD=Optional with Default Value, CM=Conditionally Mandatory.

       For elements: <minOccurs>..<maxOccurs> (N=unbounded)

Elements are bold; attributes are non-bold and preceded with an @.


Annex A
(normative)

Green metadata MPEG-DASH schema

<?xml version="1.0" encoding="UTF-8"?>

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"

   targetNamespace="urn:mpeg:mpegI:green:2025"

   xmlns:green="urn:mpeg:mpegI:green:2025"

   elementFormDefault="qualified">

 

   <xs:element name="AMI" type="green:AMInfoType" />

 

   <xs:complexType name="AMInfoType">

      <xs:sequence>

         <xs:element name="QualityInfo" type="green:QualityInfoType" minOccurs="0"

            maxOccurs="1"/>

         <xs:element name="PreprocessingInfo" type="green:PreprocessingInfoType"

            minOccurs="0" maxOccurs="1" />

         <xs:element name="WindowInfo" type="green:WindowInfoType" minOccurs="0"

            maxOccurs="1" />

         <xs:element name="ApproxModel" type="green:ApproxModelType" minOccurs="0"

            maxOccurs="1" />

         <xs:any namespace="##other" processContents="lax" minOccurs="0"

            maxOccurs="unbounded" />

      </xs:sequence>

 

      <xs:attribute name="displayModel" type="green:DisplayModelType" />

      <xs:attribute name="attenuationUse" type="green:FourValueRangeType" />

      <xs:attribute name="attenuationComponentIdc"

         type="green:AttenuationComponentIdcType" />

      <xs:attribute name="energyReductionRate"

         type="green:EnergyReductionRateType" />

      <xs:anyAttribute namespace="##other" processContents="lax" />

   </xs:complexType>

 

   <xs:complexType name="QualityInfoType">

      <xs:attribute name="metric" type="xs:string" use="required" />

      <xs:attribute name="reduction" type="xs:unsignedByte" use="required" />

      <xs:anyAttribute namespace="##other" processContents="lax" />

   </xs:complexType>

 

   <xs:complexType name="PreprocessingInfoType">

      <xs:attribute name="type" type="green:FourValueRangeType" use="required" />

      <xs:attribute name="max_value" type="xs:unsignedByte" use="required" />

      <xs:attribute name="scale" type="xs:unsignedByte" use="required" />

      <xs:anyAttribute namespace="##other" processContents="lax" />

   </xs:complexType>

 

   <xs:complexType name="ApproxModelType">

      <xs:attribute name="type" type="green:FourValueRangeType" use="required" />

      <xs:anyAttribute namespace="##other" processContents="lax" />

   </xs:complexType>

 

   <xs:complexType name="WindowInfoType">

      <xs:attribute name="x" type="xs:unsignedByte" use="required" />

      <xs:attribute name="y" type="xs:unsignedByte" use="required" />

      <xs:attribute name="width" type="xs:unsignedByte" use="required" />

      <xs:attribute name="height" type="xs:unsignedByte" use="required" />

      <xs:anyAttribute namespace="##other" processContents="lax" />

   </xs:complexType>

 

   <xs:simpleType name="DisplayModelType">

      <xs:restriction base="xs:unsignedByte">

         <xs:minInclusive value="0"/>

         <xs:maxInclusive value="3"/>

      </xs:restriction>

   </xs:simpleType>

 

   <xs:simpleType name="AttenuationComponentIdcType">

      <xs:restriction base="xs:unsignedByte">

         <xs:minInclusive value="0"/>

         <xs:maxInclusive value="15"/>

      </xs:restriction>

   </xs:simpleType>

 

   <xs:simpleType name="EnergyReductionRateType">

      <xs:restriction base="xs:unsignedByte">

         <xs:minInclusive value="0"/>

         <xs:maxInclusive value="99"/>

      </xs:restriction>

   </xs:simpleType>

 

   <xs:simpleType name="FourValueRangeType">

      <xs:restriction base="xs:unsignedByte">

         <xs:minInclusive value="0"/>

         <xs:maxInclusive value="3"/>

      </xs:restriction>

   </xs:simpleType>

 

</xs:schema>
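As a non-normative illustration of the value ranges defined by the simple types above, a validator might check AttenuationMap descriptor attributes like this (attribute names follow Table 13; attributes not listed are assumed to be plain xs:unsignedByte):

```python
# Ranges mirroring the schema's simple types: DisplayModelType 0..3,
# AttenuationComponentIdcType 0..15, EnergyReductionRateType 0..99,
# FourValueRangeType 0..3.
RANGES = {
    "displayModel": (0, 3),
    "attenuationComponentIdc": (0, 15),
    "energyReductionRate": (0, 99),
    "attenuationUse": (0, 3),
}

def validate_ami_attributes(attrs: dict) -> bool:
    """Return True if every attribute value lies in its schema-defined range."""
    for name, value in attrs.items():
        lo, hi = RANGES.get(name, (0, 255))  # default: xs:unsignedByte
        if not (lo <= int(value) <= hi):
            return False
    return True

assert validate_ami_attributes({"displayModel": "3", "energyReductionRate": "20"})
assert not validate_ami_attributes({"energyReductionRate": "100"})
```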


Annex B
(informative)

MPEG-DASH MPD examples

B.1 Example MPD with decoder-power indication metadata

The following XML file provides an example of an MPD for Decoder-Power Indication metadata.

<?xml version="1.0" encoding="UTF-8"?>

<MPD

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

xmlns="urn:mpeg:dash:schema:mpd:2011"

xsi:schemaLocation="urn:mpeg:dash:schema:mpd:2011 DASH-MPD.xsd"

type="dynamic"

minimumUpdatePeriod="PT2S"

timeShiftBufferDepth="PT30M"

availabilityStartTime="2011-12-25T12:30:00"

minBufferTime="PT4S"

profiles="urn:mpeg:dash:profile:isoff-live:2011">

<BaseURL>http://cdn1.example.com/</BaseURL>

<BaseURL>http://cdn2.example.com/</BaseURL>

<Period>

<!-- Video -->

<AdaptationSet

id="video"

mimeType="video/mp4"

codecs="avc1.4D401F"

frameRate="30000/1001"

segmentAlignment="true"

startWithSAP="1">

<BaseURL>video/</BaseURL>

<SegmentTemplate timescale="90000" media="$Bandwidth$/$Time$.mp4v">

<SegmentTimeline>

<S t="0" d="180180" r="432"/>

</SegmentTimeline>

</SegmentTemplate>

<Representation id="v0" width="320" height="240" bandwidth="250000"/>

<Representation id="v1" width="640" height="480" bandwidth="500000"/>

<Representation id="v2" width="960" height="720" bandwidth="1000000"/>

</AdaptationSet>

<!-- English Audio -->

<AdaptationSet mimeType="audio/mp4" codecs="mp4a.40.2" lang="en" segmentAlignment="0">

<SegmentTemplate timescale="48000" media="audio/en/$Time$.mp4a">

<SegmentTimeline>

<S t="0" d="96000" r="432"/>

</SegmentTimeline>

</SegmentTemplate>

<Representation id="a0" bandwidth="64000" />

</AdaptationSet>

<!-- French Audio -->

<AdaptationSet mimeType="audio/mp4" codecs="mp4a.40.2" lang="fr" segmentAlignment="0">

<SegmentTemplate timescale="48000" media="audio/fr/$Time$.mp4a">

<SegmentTimeline>

<S t="0" d="96000" r="432"/>

</SegmentTimeline>

</SegmentTemplate>

<Representation id="a1" bandwidth="64000" />

</AdaptationSet>

<!-- AdaptationSet carrying Green Video Information for Video -->

<AdaptationSet id="green_video" mimeType="application/mp4" codecs="depi">

<BaseURL>video_green_depi/</BaseURL>

<SegmentTemplate timescale="90000" media="$RepresentationID$/$Time$.mp4m">

<SegmentTimeline>

<S t="0" d="180180" r="432"/>

</SegmentTimeline>

</SegmentTemplate>

<Representation id="gv0" bandwidth="1000" associationId="v0" associationType="cdsc"/>

<Representation id="gv1" bandwidth="1000" associationId="v1" associationType="cdsc"/>

<Representation id="gv2" bandwidth="1000" associationId="v2" associationType="cdsc"/>

</AdaptationSet>

</Period>

 

</MPD>

B.2 Example MPD with display-power indication metadata

The following XML file provides an example of an MPD for Display-Power Indication metadata.

<?xml version="1.0" encoding="UTF-8"?>

<MPD

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

xmlns="urn:mpeg:dash:schema:mpd:2011"

xsi:schemaLocation="urn:mpeg:dash:schema:mpd:2011 DASH-MPD.xsd"

type="dynamic"

minimumUpdatePeriod="PT2S"

timeShiftBufferDepth="PT30M"

availabilityStartTime="2011-12-25T12:30:00"

minBufferTime="PT4S"

profiles="urn:mpeg:dash:profile:isoff-live:2011">

<BaseURL>http://cdn1.example.com/</BaseURL>

<BaseURL>http://cdn2.example.com/</BaseURL>

<Period>

<!-- Video -->

<AdaptationSet

id="video"

mimeType="video/mp4"

codecs="avc1.4D401F"

frameRate="30000/1001"

segmentAlignment="true"

startWithSAP="1">

<BaseURL>video/</BaseURL>

<SegmentTemplate timescale="90000" media="$Bandwidth$/$Time$.mp4v">

<SegmentTimeline>

<S t="0" d="180180" r="432"/>

</SegmentTimeline>

</SegmentTemplate>

<Representation id="v0" width="320" height="240" bandwidth="250000"/>

<Representation id="v1" width="640" height="480" bandwidth="500000"/>

<Representation id="v2" width="960" height="720" bandwidth="1000000"/>

</AdaptationSet>

<!-- English Audio -->

<AdaptationSet mimeType="audio/mp4" codecs="mp4a.40.2" lang="en" segmentAlignment="0">

<SegmentTemplate timescale="48000" media="audio/en/$Time$.mp4a">

<SegmentTimeline>

<S t="0" d="96000" r="432"/>

</SegmentTimeline>

</SegmentTemplate>

<Representation id="a0" bandwidth="64000" />

</AdaptationSet>

<!-- French Audio -->

<AdaptationSet mimeType="audio/mp4" codecs="mp4a.40.2" lang="fr" segmentAlignment="0">

<SegmentTemplate timescale="48000" media="audio/fr/$Time$.mp4a">

<SegmentTimeline>

<S t="0" d="96000" r="432"/>

</SegmentTimeline>

</SegmentTemplate>

<Representation id="a1" bandwidth="64000" />

</AdaptationSet>

<!-- AdaptationSet carrying Green Video Information for Video -->

<AdaptationSet id="green_video" mimeType="application/mp4" codecs="dipi">

<BaseURL>video_green_dipi/</BaseURL>

<SegmentTemplate timescale="90000" media="$RepresentationID$/$Time$.mp4m">

<SegmentTimeline>

<S t="0" d="180180" r="432"/>

</SegmentTimeline>

</SegmentTemplate>

<Representation id="gv0" bandwidth="1000" associationId="v0 v1 v2"

associationType="cdsc"/>

</AdaptationSet>

 

</Period>

 

</MPD>

B.3 Examples of MPD with display attenuation map metadata

B.3.1 Example 1

This example demonstrates a DASH MPD with a presentation that includes a video Adaptation Set with three Representations and a Display Attenuation Map Adaptation Set with one Representation that is associated with the first video Representation. The display attenuation map carried by the Display Attenuation Map Adaptation Set results in a 20 % reduction in energy consumption when applied to the associated video Representation. Applying the Display Attenuation Map Representation also results in a 5 % reduction of the video quality in terms of PSNR.

<?xml version="1.0" encoding="UTF-8"?>

<MPD

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

xmlns="urn:mpeg:dash:schema:mpd:2011"

xmlns:green="urn:mpeg:mpegI:green:2025"

xsi:schemaLocation="urn:mpeg:dash:schema:mpd:2011 DASH-MPD.xsd"

type="static"

minBufferTime="PT4S"

profiles="urn:mpeg:dash:profile:isoff-on-demand:2011">

 

<BaseURL>http://cdn1.example.com/</BaseURL>

<BaseURL>http://cdn2.example.com/</BaseURL>

 

<Period>

 

<!-- Video -->

<AdaptationSet id="video" mimeType="video/mp4" codecs="avc1.4D401F"

frameRate="30000/1001" segmentAlignment="true" startWithSAP="1">

<BaseURL>video/</BaseURL>

<SegmentTemplate timescale="90000" media="$Bandwidth$/$Time$.mp4v">

<SegmentTimeline>

<S t="0" d="180180" r="432"/>

</SegmentTimeline>

</SegmentTemplate>

<Representation id="v0" width="320" height="240" bandwidth="250000"/>

<Representation id="v1" width="640" height="480" bandwidth="500000"/>

<Representation id="v2" width="960" height="720" bandwidth="1000000"/>

</AdaptationSet>

 

<AdaptationSet id="ami" mimeType="video/mp4" codecs="resv.gmat.avc1.4D401F">

 

<BaseURL>attenuation_maps/</BaseURL>

<SegmentTemplate timescale="90000" media="$RepresentationID$/$Time$.mp4v">

<SegmentTimeline>

<S t="0" d="180180" r="432"/>

</SegmentTimeline>

</SegmentTemplate>

 

<Representation id="ami0" bandwidth="1000" associationId="v0"

associationType="amit">

<EssentialProperty schemeIdUri="urn:mpeg:mpegI:green:2025:ami">

<green:AMI displayModel="3" attenuationUse="1"

energyReductionRate="20">

<green:QualityInfo metric="PSNR" reduction="5" />

<green:ApproxModel type="0" />

</green:AMI>

</EssentialProperty>

</Representation>

 

</AdaptationSet>

 

</Period>

</MPD>

B.3.2 Example 2

This example demonstrates a DASH MPD with a presentation that includes a video Adaptation Set with three Representations and a Display Attenuation Map Adaptation Set with two Representations that are associated with the first video Representation. The display attenuation map carried by the first Representation of the Display Attenuation Map Adaptation Set results in a 20 % reduction in energy consumption when applied to the video Representation, while the second Representation of the Display Attenuation Map Adaptation Set results in a 40 % reduction in energy consumption when applied to the same video Representation.

<?xml version="1.0" encoding="UTF-8"?>

<MPD

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

xmlns="urn:mpeg:dash:schema:mpd:2011"

xmlns:green="urn:mpeg:mpegI:green:2025"

xsi:schemaLocation="urn:mpeg:dash:schema:mpd:2011 DASH-MPD.xsd"

type="dynamic"

minimumUpdatePeriod="PT2S"

timeShiftBufferDepth="PT30M"

availabilityStartTime="2011-12-25T12:30:00"

minBufferTime="PT4S"

profiles="urn:mpeg:dash:profile:isoff-live:2011">

 

<BaseURL>http://cdn1.example.com/</BaseURL>

<BaseURL>http://cdn2.example.com/</BaseURL>

 

<Period>

 

<!-- Video -->

<AdaptationSet id="video" mimeType="video/mp4" codecs="avc1.4D401F"

frameRate="30000/1001" segmentAlignment="true" startWithSAP="1">

<BaseURL>video/</BaseURL>

<SegmentTemplate timescale="90000" media="$Bandwidth$/$Time$.mp4v">

<SegmentTimeline>

<S t="0" d="180180" r="432"/>

</SegmentTimeline>

</SegmentTemplate>

<Representation id="v0" width="320" height="240" bandwidth="250000"/>

<Representation id="v1" width="640" height="480" bandwidth="500000"/>

<Representation id="v2" width="960" height="720" bandwidth="1000000"/>

</AdaptationSet>

 

<!-- Display Attenuation Map Adaptation Set for energy reduction for video -->

<AdaptationSet id="ami" mimeType="video/mp4" codecs="resv.gmat.avc1.4D401F">

 

<BaseURL>attenuation_maps/</BaseURL>

<SegmentTemplate timescale="90000" media="$RepresentationID$/$Time$.mp4v">

<SegmentTimeline>

<S t="0" d="180180" r="432"/>

</SegmentTimeline>

</SegmentTemplate>

 

<Representation id="ami0" bandwidth="1000" associationId="v0"

associationType="amit">

<EssentialProperty

schemeIdUri="urn:mpeg:mpegI:green:2025:ami">

<green:AMI displayModel="3" attenuationUse="1"

energyReductionRate="20">

<green:ApproxModel type="1" />

</green:AMI>

</EssentialProperty>

</Representation>

 

<Representation id="ami1" bandwidth="1000" associationId="v0"

associationType="amit">

<EssentialProperty

schemeIdUri="urn:mpeg:mpegI:green:2025:ami">

<green:AMI displayModel="3" attenuationUse="1"

energyReductionRate="40">

<green:ApproxModel type="1" />

</green:AMI>

</EssentialProperty>

</Representation>

 

</AdaptationSet>

 

</Period>

</MPD>

B.3.3 Example 3

This example is similar to Example 2 in subclause B.3.2, but it uses one instance of the descriptor at the Adaptation Set level containing common information for all Representations, and additional descriptor instances with representation-specific information for each Representation.

<?xml version="1.0" encoding="UTF-8"?>

<MPD

xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

xmlns="urn:mpeg:dash:schema:mpd:2011"

xmlns:green="urn:mpeg:mpegI:green:2025"

xsi:schemaLocation="urn:mpeg:dash:schema:mpd:2011 DASH-MPD.xsd"

type="dynamic"

minimumUpdatePeriod="PT2S"

timeShiftBufferDepth="PT30M"

availabilityStartTime="2011-12-25T12:30:00"

minBufferTime="PT4S"

profiles="urn:mpeg:dash:profile:isoff-live:2011">

 

<BaseURL>http://cdn1.example.com/</BaseURL>

<BaseURL>http://cdn2.example.com/</BaseURL>

 

<Period>

 

<!-- Video -->

<AdaptationSet id="video" mimeType="video/mp4" codecs="avc1.4D401F"

frameRate="30000/1001" segmentAlignment="true" startWithSAP="1">

<BaseURL>video/</BaseURL>

<SegmentTemplate timescale="90000" media="$Bandwidth$/$Time$.mp4v">

<SegmentTimeline>

<S t="0" d="180180" r="432"/>

</SegmentTimeline>

</SegmentTemplate>

<Representation id="v0" width="320" height="240" bandwidth="250000"/>

<Representation id="v1" width="640" height="480" bandwidth="500000"/>

<Representation id="v2" width="960" height="720" bandwidth="1000000"/>

</AdaptationSet>

 

<!-- Display Attenuation Map Adaptation Set for energy reduction for video -->

<AdaptationSet id="ami" mimeType="video/mp4" codecs="resv.gmat.avc1.4D401F">

 

<EssentialProperty schemeIdUri="urn:mpeg:mpegI:green:2025:ami">

<green:AMI displayModel="3" attenuationUse="1" />

</EssentialProperty>

 

<BaseURL>attenuation_maps/</BaseURL>

<SegmentTemplate timescale="90000" media="$RepresentationID$/$Time$.mp4v">

<SegmentTimeline>

<S t="0" d="180180" r="432"/>

</SegmentTimeline>

</SegmentTemplate>

 

<Representation id="ami0" bandwidth="1000" associationId="v0"

associationType="amit">

<EssentialProperty schemeIdUri="urn:mpeg:mpegI:green:2025:ami">

<green:AMI energyReductionRate="20">

<green:ApproxModel type="1" />

</green:AMI>

</EssentialProperty>

</Representation>

 

<Representation id="ami1" bandwidth="1000" associationId="v0"

associationType="amit">

<EssentialProperty schemeIdUri="urn:mpeg:mpegI:green:2025:ami">

<green:AMI energyReductionRate="40">

<green:ApproxModel type="1" />

</green:AMI>

</EssentialProperty>

</Representation>

</AdaptationSet>

 

</Period>

</MPD>

B.3.4 Example 4

This example demonstrates a DASH MPD with a presentation that includes a video Adaptation Set with three Representations and two Display Attenuation Map Adaptation Sets, each with two Representations that are associated with the first video Representation. In each Display Attenuation Map Adaptation Set, the display attenuation map carried by the first Representation results in a 20 % reduction in energy consumption when applied to the video Representation, while the second Representation results in a 40 % reduction in energy consumption when applied to the same video Representation. Two Preselections are defined to signal two experiences, each grouping the video Adaptation Set with one of the two Display Attenuation Map Adaptation Sets.

<?xml version="1.0" encoding="UTF-8"?>

<MPD

   xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"

   xmlns="urn:mpeg:dash:schema:mpd:2011"

   xmlns:green="urn:mpeg:mpegI:green:2025"

   xsi:schemaLocation="urn:mpeg:dash:schema:mpd:2011 DASH-MPD.xsd"

   type="static"

   minBufferTime="PT4S"

   profiles="urn:mpeg:dash:profile:isoff-on-demand:2011">

 

<BaseURL>http://cdn1.example.com/</BaseURL>

<BaseURL>http://cdn2.example.com/</BaseURL>

 

<Period>

 

   <!-- Video -->

   <AdaptationSet id="video" mimeType="video/mp4" codecs="avc1.4D401F"

      frameRate="30000/1001" segmentAlignment="true" startWithSAP="1">

      <BaseURL>video/</BaseURL>

      <SegmentTemplate timescale="90000" media="$Bandwidth$/$Time$.mp4v">

         <SegmentTimeline>

            <S t="0" d="180180" r="432"/>

         </SegmentTimeline>

      </SegmentTemplate>

      <Representation id="v0" width="320" height="240" bandwidth="250000"/>

      <Representation id="v1" width="640" height="480" bandwidth="500000"/>

      <Representation id="v2" width="960" height="720" bandwidth="1000000"/>

   </AdaptationSet>

 

   <AdaptationSet id="am1" mimeType="video/mp4" codecs="resv.gmat.avc1.4D401F">

 

      <EssentialProperty schemeIdUri="urn:mpeg:mpegI:green:2025:ami">

         <green:AMI displayModel="3" attenuationUse="1">

            <green:ApproxModel type="0" />

         </green:AMI>

      </EssentialProperty>

 

      <BaseURL>attenuation_maps/</BaseURL>

      <SegmentTemplate timescale="90000" media="$RepresentationID$/$Time$.mp4v">

         <SegmentTimeline>

            <S t="0" d="180180" r="432"/>

         </SegmentTimeline>

      </SegmentTemplate>

 

      <Representation id="ami1" bandwidth="1000" associationId="v0"

         associationType="amit">

         <EssentialProperty schemeIdUri="urn:mpeg:mpegI:green:2025:ami">

            <green:AMI energyReductionRate="20">

               <green:QualityInfo metric="PSNR" reduction="5" />

            </green:AMI>

         </EssentialProperty>

      </Representation>

 

      <Representation id="ami2" bandwidth="1000" associationId="v0"

         associationType="amit">

         <EssentialProperty schemeIdUri="urn:mpeg:mpegI:green:2025:ami">

            <green:AMI energyReductionRate="40">

               <green:QualityInfo metric="PSNR" reduction="10" />

            </green:AMI>

         </EssentialProperty>

      </Representation>

 

   </AdaptationSet>

 

   <AdaptationSet id="am2" mimeType="video/mp4" codecs="resv.gmat.avc1.4D401F">

 

      <EssentialProperty schemeIdUri="urn:mpeg:mpegI:green:2025:ami">

         <green:AMI displayModel="3" attenuationUse="1">

            <green:ApproxModel type="1" />

         </green:AMI>

      </EssentialProperty>

 

      <BaseURL>attenuation_maps/</BaseURL>

      <SegmentTemplate timescale="90000" media="$RepresentationID$/$Time$.mp4v">

         <SegmentTimeline>

            <S t="0" d="180180" r="432"/>

         </SegmentTimeline>

      </SegmentTemplate>

 

      <Representation id="ami3" bandwidth="1000" associationId="v0"

         associationType="amit">

         <EssentialProperty schemeIdUri="urn:mpeg:mpegI:green:2025:ami">

            <green:AMI energyReductionRate="20">

               <green:QualityInfo metric="PSNR" reduction="5" />

            </green:AMI>

         </EssentialProperty>

      </Representation>

 

      <Representation id="ami4" bandwidth="1000" associationId="v0"

         associationType="amit">

         <EssentialProperty schemeIdUri="urn:mpeg:mpegI:green:2025:ami">

            <green:AMI energyReductionRate="40">

               <green:QualityInfo metric="PSNR" reduction="10" />

            </green:AMI>

         </EssentialProperty>

      </Representation>

 

   </AdaptationSet>

 

   <Preselection id="amip1" tag="1" preselectionComponents="video am1"

      codecs="avc1.4D401F">

      <Role schemeIdUri="urn:mpeg:mpegI:green:role:2025" value="ami" />

   </Preselection>

 

   <Preselection id="amip2" tag="2" preselectionComponents="video am2"

      codecs="avc1.4D401F">

      <Role schemeIdUri="urn:mpeg:mpegI:green:role:2025" value="ami" />

   </Preselection>

 

</Period>

</MPD>


Annex C
(informative)

Conformance and reference software

C.1 Display power reduction using display adaptation

C.1.1 Conformance test vectors

One conformance ISO BMFF file, BasketballDrill_28_gamma.mp4m, which contains green metadata samples of the 'dfce' sample entry type, as specified in subclause 5.3.2, is available at https://standards.iso.org/iso-iec/23001/-11/ed-3/en.

It is composed of a sample entry which contains static metadata and samples which contain dynamic metadata.

To verify conformance of a software implementation that parses 'dfce' green metadata samples in an ISO BMFF file, the conformance file shall be used to check that the extracted values match the expected values given in the side text file provided with the conformance file.

C.1.2 Reference software

Reference software for parsing and displaying 'dfce' green metadata samples in an ISO BMFF file is available at https://standards.iso.org/iso-iec/23001/-11/ed-3/en.

It is linked with ISO BMFF reference software libraries (IsoLib), which are available in ISO/IEC 14496-5.

A readme.txt is provided to explain how to produce the executable in a Windows or Linux environment.

The reference software takes the ISO BMFF metadata file (*.mp4m) as input and produces a text file as output, which gives a full description of the metadata stored in the samples of the input file.

To verify conformance of test metadata files, the reference software shall be used to parse the test metadata files and to check them for syntactic correctness and valid ranges.

C.2 Energy-efficient media selection

C.2.1 Conformance test vectors

A conformance test vector for Decoder-Power Indication metadata is available at https://standards.iso.org/iso-iec/23001/-11/ed-3/en.

It consists of a set of:

— ten ISO BMFF video files, which provide ten AVC video Representations, with a (sub)segment duration of 2 s, at the following resolutions and bitrates:

— 1920x1080p50 @ 10Mbps,

— 1920x1080p50 @ 8Mbps,

— 1600x900p50 @ 8Mbps,

— 1600x900p50 @ 6Mbps,

— 1280x720p50 @ 6Mbps,

— 1280x720p50 @ 5Mbps,

— 960x540p50 @ 5Mbps,

— 960x540p50 @ 3.5Mbps,

— 768x432p50 @ 3.5Mbps,

— 768x432p25 @ 2.5Mbps.

— ten ISO BMFF metadata files, which provide associated decoder-power indication (‘depi’) metadata Representation of each video Representation,

— a manifest file, conformant to ISO/IEC 23009-1.

The ISO BMFF metadata files contain green metadata samples of the ‘depi’ sample entry type, as specified in subclause 5.2.

To verify conformance of a software implementation that parses ‘depi’ green metadata samples in an ISO BMFF file, the conformance metadata files shall be used to check that the extracted values match the expected values given in the side text files provided with the conformance files.
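The value check described above can be automated by comparing parser output against the side file. The sketch below assumes, purely hypothetically, that both files hold one name=value pair per line; the actual side-file layout shipped with the conformance data may differ.

```python
def verify_extracted(extracted_path, expected_path):
    """Return True when every name=value pair produced by the parser
    matches the expected values from the conformance side file."""
    def load(path):
        with open(path) as f:
            # keep only lines of the assumed 'name=value' form
            return dict(line.strip().split("=", 1)
                        for line in f if "=" in line)
    return load(extracted_path) == load(expected_path)
```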

C.2.2 Reference software

Reference software for parsing and display of Decoder-Power (‘depi’) or Display-Power (‘dipi’) Indication metadata in an ISO BMFF file is available at https://standards.iso.org/iso-iec/23001/-11/ed-3/en.

It is linked with ISO BMFF reference software libraries (IsoLib), which are available in ISO/IEC 14496-5.

A readme.txt is provided to explain how to produce the executable in a Windows or Linux environment.

The reference software takes the ISO BMFF metadata file (*.mp4m) as input and produces a text file as output, which gives a full description of the metadata stored in the samples of the input file.

To verify conformance of test metadata files, the reference software shall be used to parse the test metadata files and to check them for syntactic correctness and valid ranges.


Annex D
(informative)

Generation and use of green metadata

D.1 Decoder and display power indication

D.1.1 Metadata generation at the server side

Given N video Representations, the Decoder-Power Indication metadata dec_ops_reduction_ratio_from_max(i) (DOR-Ratio-Max(i)) and dec_ops_reduction_ratio_from_prev(i) (DOR-Ratio-Prev(i)) are computed by the encoding system and provided by the server for i = 0 to N – 1, as shown in Figure D.1. The Display-Power Indication metadata is computed from one Representation.

Figure D.1 — Green metadata computation and insertion.

The DOR-Ratio-Max(i) associated with each video Representation i of a Segment is computed as the power-saving ratio from the most demanding video Representation produced for the Segment, as defined in subclause 8.4.1 of ISO/IEC 23001-11.

The DOR-Ratio-Prev(i) associated with each video Representation i of a Segment is computed as the power-saving ratio from the previous Segment of the same Representation, as defined in subclause 8.4.1 of ISO/IEC 23001-11.

To produce the normative green metadata DOR-Ratio-Max(i) and DOR-Ratio-Prev(i) for a given Segment, the encoding system needs to estimate the decoding complexity of each video Representation, as a number of processing cycles. Each sample which contains the DOR-Ratio values is then stored in a specific metadata file “$id$/$Time$.mp4m” (one for each Segment) using the format specified in subclause 5.2.
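As a non-normative illustration, the two ratios can be derived from the per-Representation complexity estimates. The sketch below assumes the ratios are expressed as fractional reductions in decoder operations (the normative definition is in subclause 8.4.1 of ISO/IEC 23001-11); the function name and the list-based interface are illustrative only.

```python
def dor_ratios(cycles, prev_cycles):
    """Sketch of DOR-Ratio computation for one Segment.

    cycles[i]      -- estimated decoding complexity (processing cycles)
                      of video Representation i for the current Segment
    prev_cycles[i] -- the same estimate for the previous Segment

    Returns (DOR-Ratio-Max(i), DOR-Ratio-Prev(i)) as two lists of
    fractional reductions.
    """
    c_max = max(cycles)  # most demanding Representation of this Segment
    ratio_max = [(c_max - c) / c_max for c in cycles]
    ratio_prev = [(p - c) / p for c, p in zip(cycles, prev_cycles)]
    return ratio_max, ratio_prev
```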

The Display Power Indication metadata is a list of (ms_num_quality_levels + 1) pairs of the form (ms_max_rgb_component[ i ], ms_scaled_psnr_rgb[ i ]), as defined in subclause 8.4.1 of ISO/IEC 23001-11. This metadata is produced without considering any constraint on max_variation, the maximal backlight variation between two successive frames. It is also assumed that the backlight can be updated on each frame, so that constant_backlight_voltage_time_interval is the inter-frame interval. Therefore, the Display Power Indication metadata provides the maximum power saving for a given quality level.

The Display Power Indication metadata is stored in a specific metadata file “$id$/$Time$.mp4m” (one for each Segment) using the format specified in subclause 5.3.

D.1.2 Use of metadata at the client

The client (player/decoder) can determine its remaining battery life based on the energy consumption of the Representation it is currently using. If it detects that its battery life is insufficient for the total duration of the video content to be consumed (given as a parameter by the server or as a duration requirement expressed by the user), the terminal can compute the required power-consumption saving ratio relative to the current Representation.

Using the following information, the terminal can determine (for the next Segment) the best power-saving allocation strategy for the decoder and for the display:

— the decoder-power saving ratio of all available video Representations in the next Segment from the current (selected) Representation in the previous Segment,

— the impact of RGB component scaling on video quality for the next Segment,

— for the last Segments, the decoder and display consumption as a fraction of the total consumption.

From this information, the terminal determines which Representation it needs to download and the appropriate scaling of RGB components for that Representation.

It is observed that the decoder-power saving ratio of all available video Representations from the current Representation in the previous Segment is not directly given by the power-indication metadata. The server provides a list of two decoder-operations reduction ratios per video Representation:

— the first is the ratio of each Representation from the most energy-consuming one at a given period of time T (dashed arrows in Figure D.2),

— the second is the ratio of each Representation at a given period of time T from the previous period of time T − 1 (continuous arrows in Figure D.3).

The terminal can convert this list of ratios into a list of ratios from the current Representation it was using in the previous Segment.

Let us define the following terms:

— IrefRep: the index, in the current Segment, of the Representation RefRep which was used by the client terminal in the previous Segment,

— dec_ops_reduction_ratio_from_max(i): the reduction ratio from the most energy-consuming Representation, received from the server,

— RdecOpsReducFromRepRef(i): the reduction ratio from Representation RefRep in the current Segment,

— RdecOpsReducFromPrevRepRef(i): the reduction ratio from Representation RefRep in the previous Segment.

It is possible to express RdecOpsReducFromRepRef(i) from dec_ops_reduction_ratio_from_max(i), using the following formula:

RdecOpsReducFromRepRef(i) = 1 − (1 − dec_ops_reduction_ratio_from_max(i)) / (1 − dec_ops_reduction_ratio_from_max(IrefRep)) (D.1)

RdecOpsReducFromRepRef(i) are represented by dotted arrows in Figure D.2. It is then possible to express RdecOpsReducFromPrevRepRef(i) from RdecOpsReducFromRepRef(i), using the following formula:

RdecOpsReducFromPrevRepRef(i) = 1 − (1 − dec_ops_reduction_ratio_from_prev(IrefRep)) × (1 − RdecOpsReducFromRepRef(i)) (D.2)

RdecOpsReducFromPrevRepRef(i) are represented by dashed arrows in Figure D.3.

NOTE 1 Floating-point numbers are used for these computations.
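The conversion in formulae (D.1) and (D.2) can be sketched as follows, assuming both server-provided ratios are fractional reductions in decoder operations; the function and variable names mirror the terms defined above and are illustrative only.

```python
def ratios_from_ref(ratio_max, ratio_prev, i_ref):
    """Convert server-provided reduction ratios into reduction ratios
    relative to the Representation used in the previous Segment.

    ratio_max[i]  -- dec_ops_reduction_ratio_from_max(i)
    ratio_prev[i] -- dec_ops_reduction_ratio_from_prev(i)
    i_ref         -- IrefRep, index of the previously used Representation
    """
    # (D.1): reduction from RefRep within the current Segment
    from_ref = [1 - (1 - r) / (1 - ratio_max[i_ref]) for r in ratio_max]
    # (D.2): reduction from RefRep as decoded in the previous Segment
    from_prev_ref = [1 - (1 - ratio_prev[i_ref]) * (1 - r)
                     for r in from_ref]
    return from_ref, from_prev_ref
```

Note that from_ref[i] can be negative when Representation i is more demanding than RefRep, i.e. switching up to it increases decoder operations.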

Figure D.2 — Derivation of DecOpsReductionRatios within the current Segment.

Figure D.3 — Derivation of DecOpsReductionRatios within the current Segment from the previous Segment.

Using the mapping between processing frequency and power-supply voltage, and the mapping between power-supply voltage and power consumption, the terminal can translate this list into a list of decoder-power saving ratios from the Representation which was used in the previous Segment.

In the case where the total duration of the video content to be consumed is not known (for example, live content), the terminal can display the expected remaining usage duration based on the current battery level and the energy consumption of the Representation it is currently using. The user can then act on the terminal to increase this usage duration, and the request is translated into a power-saving ratio as in the previous case.

NOTE 2 Complexity metrics, as defined in subclause 6.2 of ISO/IEC 23001-11, can be sent with each Representation to allow the client to save energy by proactively invoking C-DVFS, so that the selection of a Representation performs at its best for energy saving.

NOTE 3 The dec_ops_reduction_ratio is known to be stable across software-based platforms.

D.2 Display attenuation maps

D.2.1 Metadata generation at the server side

Given a video Representation, a set of N display attenuation maps is generated by the encoding system and provided by the server to locally reduce the brightness of the video Representation’s frames. Each display attenuation map i optimizes the trade-off between the resulting video quality and energy reduction, for i = 0 to N − 1, as shown in Figure D.4.

Figure D.4 — Display attenuation map computation and insertion.

The display attenuation map associated with a video Representation is computed to reduce the energy consumption of displaying this Representation by an energy-reduction rate indicated as a reduction percentage.

A display attenuation map Media Segment and its associated video Media Segment(s) are time aligned on Segment boundaries. A display attenuation map Media Segment is an ISOBMFF file which contains samples for a restricted video track, as shown in Figure D.5, where each sample is a coded display attenuation map frame. The carriage of display attenuation map data in ISOBMFF files is specified in subclause 5.3.3.

Figure D.5 — Alternate group of multiple display attenuation map representations.

D.2.2 Use of metadata at the client

A DASH client is guided by the information provided in the MPD. The following is an example client behaviour for streaming videos with associated display attenuation maps using the signalling presented in subclause 6.4.

The client first issues an HTTP request and downloads the MPD file from the content server. It then parses the MPD file to build a corresponding in-memory representation of the XML elements in the MPD file.

To identify available display attenuation maps in a Period, the streaming client scans the AdaptationSet elements to find Adaptation Sets with an AttenuationMap descriptor whose @schemeIdUri is set to the unique URI "urn:mpeg:mpegI:green:2025:ami". For each Representation in the Display Attenuation Map Adaptation Set, the client also identifies the associated Representation in the video Adaptation Set using the @associationId.
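The scan described above might look as follows. This is a minimal sketch using Python's standard XML parser; it matches elements by local name to sidestep full MPD namespace handling, and it assumes the descriptor is carried on an EssentialProperty or SupplementalProperty element, as in the example MPD above.

```python
import xml.etree.ElementTree as ET

AMI_SCHEME = "urn:mpeg:mpegI:green:2025:ami"

def find_attenuation_maps(mpd_xml):
    """Return (map_representation_id, associated_video_id) pairs for
    every Representation of an Adaptation Set that carries the
    AttenuationMap descriptor."""
    def local(tag):  # strip any '{namespace}' prefix from the tag
        return tag.rsplit("}", 1)[-1]

    def has_ami(elem):
        return any(local(d.tag) in ("EssentialProperty",
                                    "SupplementalProperty")
                   and d.get("schemeIdUri") == AMI_SCHEME
                   for d in elem.iter())

    pairs = []
    for aset in ET.fromstring(mpd_xml).iter():
        if local(aset.tag) == "AdaptationSet" and has_ami(aset):
            for rep in aset:
                if local(rep.tag) == "Representation":
                    pairs.append((rep.get("id"),
                                  rep.get("associationId")))
    return pairs
```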

The streaming client selects one of the Representations of the video Adaptation Set based on its capabilities and the network conditions, and downloads the Initialization Segment for that Representation. It then downloads the Initialization Segments of all Representations from the Display Attenuation Map Adaptation Set that are associated with the selected video Representation. The client then starts sequentially downloading Media Segments from the video Representation. The client regularly monitors the remaining power in the device’s battery and, based on the battery level, the remaining playback time, and the information signalled in the AttenuationMap descriptor, selects one of the display attenuation map Representations.

The client subsequently downloads, with each Media Segment of the video Representation, a corresponding Media Segment from the selected display attenuation map Representation. The downloaded display attenuation map Media Segments are decoded, and the decoded frames are applied to the corresponding decoded video frames before rendering.
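The final application step can be sketched per pixel. Interpreting a decoded attenuation sample as a scale factor in [0, max_val] is an assumption made here for illustration; subclause 5.3.3 governs the actual carriage, and the renderer defines the exact blending operation.

```python
def apply_attenuation(frame, amap, max_val=255):
    """Attenuate one decoded video frame with the time-aligned decoded
    attenuation-map frame.

    frame -- 2-D list of pixel component values in [0, max_val]
    amap  -- 2-D list of attenuation samples in [0, max_val], read here
             (an assumption) as per-pixel scale factors amap / max_val
    """
    return [[round(p * a / max_val) for p, a in zip(frow, arow)]
            for frow, arow in zip(frame, amap)]
```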

Bibliography

[1] ISO/IEC 23009‑3, Information technology — Dynamic adaptive streaming over HTTP (DASH) — Part 3: Implementation guidelines
