WebCodecs

Draft Community Group Report

This version:
https://wicg.github.io/web-codecs/
Issue Tracking:
GitHub
Inline In Spec
Editors:
Chris Cunningham (Google Inc.)
Paul Adenot (Mozilla)
Participate:
Git Repository.
File an issue.
Version History:
https://github.com/wicg/web-codecs/commits

Abstract

WebCodecs is a flexible web API for encoding and decoding audio and video.

Status of this document

This specification was published by the Web Platform Incubator Community Group. It is not a W3C Standard nor is it on the W3C Standards Track. Please note that under the W3C Community Contributor License Agreement (CLA) there is a limited opt-out and other conditions apply. Learn more about W3C Community and Business Groups.

1. Introduction

The API defined in this document provides efficient access to built-in (software and hardware) media encoders and decoders for encoding and decoding media. Many Web APIs, including HTMLMediaElement, WebAudio, MediaRecorder, and WebRTC, use media codecs internally. Despite this widespread usage, there has been no way to configure the media codecs directly. As a consequence, many web applications have resorted to implementing media codecs in JavaScript or WebAssembly, which reduces power efficiency, reduces performance, and increases bandwidth by downloading a codec that is already present in the browser. A comprehensive list of applicable use cases and code examples can be found in the [WEBCODECS-USECASES] and [WEBCODECS-EXAMPLES] explainer documents.

2. Scope

The current scope of this specification is the platform and software codecs commonly present in modern browsers. Media applications might need to work with particular file formats or media containers, such as MP4 or WebM, by using a muxer or demuxer; such usage is currently out of scope for WebCodecs. Writing codecs in JavaScript or WebAssembly is likewise out of scope. In fact, with support for WebCodecs, the need to write codecs in JavaScript or WebAssembly should ideally be restricted to supporting legacy codecs or emulating support for new and experimental codecs. Images are mostly decoded using the same codecs as video, even though there might be tight coupling between the container and the encoded data; an ImageDecoder might be considered as part of WebCodecs in the future. Image encoding is presently out of scope for this document.

3. Background

This section is non-normative.

4. Use Cases

This section is non-normative.

This section provides a collection of use cases and usage scenarios for web pages and applications using WebCodecs.

5. Security and Privacy Considerations

6. Model

EncodedAudioChunks and EncodedVideoChunks contain codec-specific encoded media bytes. An EncodedVideoChunk contains a single encoded video frame along with metadata related to the frame, such as its timestamp.

An AudioPacket contains decoded audio data. It will provide an AudioBuffer for rendering via an AudioWorklet.

A VideoFrame contains decoded video data. It can be drawn into a Canvas with drawImage or rendered into a WebGL texture with texImage2D.
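For example, a decoded frame can be painted through the createImageBitmap() method defined below (a non-normative sketch; the canvas element and its id are assumptions):

// Non-normative sketch: paint a decoded VideoFrame into a 2D canvas.
// Assumes the document contains <canvas id="preview">.
const paintContext = document.getElementById('preview').getContext('2d');

async function paintFrame(videoFrame) {
  // createImageBitmap() is the conversion this document defines.
  const bitmap = await videoFrame.createImageBitmap();
  paintContext.drawImage(bitmap, 0, 0);
  videoFrame.release();  // free the frame's resources once painted
}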

Support VideoFrame creation from yuv data? See WICG#45

An AudioEncoder encodes AudioPackets to produce EncodedAudioChunks.

A VideoEncoder encodes VideoFrames to produce EncodedVideoChunks.

An AudioDecoder decodes EncodedAudioChunks to produce AudioPackets.

A VideoDecoder decodes EncodedVideoChunks to produce VideoFrames.

The WebCodecs API also provides mechanisms to import content referenced by a valid MediaStreamTrack, for example from getUserMedia.

The term platform decoder refers to the platform interfaces with which the user agent interacts to obtain a decoded VideoFrame. The platform decoder can be defined by the underlying platform (e.g., a native media framework).

The term platform encoder refers to the platform interfaces with which the user agent interacts to encode a VideoFrame. The platform encoder can be defined by the underlying platform (e.g., a native media framework).

7. WebCodecs API

7.1. VideoFrame interface

dictionary VideoFrameInit {
  unsigned long long timestamp;  // microseconds
  unsigned long long? duration;  // microseconds
};
[Exposed=(Window)]
interface VideoFrame {
    constructor(VideoFrameInit init, ImageBitmap source);
    void release();
    [NewObject] Promise<ImageBitmap> createImageBitmap(optional ImageBitmapOptions options = {});
    readonly attribute unsigned long long timestamp;  // microseconds
    readonly attribute unsigned long long? duration;  // microseconds
    readonly attribute unsigned long codedWidth;
    readonly attribute unsigned long codedHeight;

    readonly attribute unsigned long visibleWidth;
    readonly attribute unsigned long visibleHeight;
};

Instances of VideoFrame are created with the internal slots described in the following table:

Internal Slot Description (non-normative)
[[frame]] Contains image data for the VideoFrame

The timestamp attribute of a VideoFrame represents the sampling instant of the first data in the VideoFrame, in microseconds from a fixed origin. Its initial value is given by timestamp during VideoFrame construction.

The duration attribute of a VideoFrame represents the time interval, in microseconds, for which the video composition should render the composed VideoFrame.

The codedWidth attribute of a VideoFrame denotes the number of pixel samples stored horizontally for each frame.

The codedHeight attribute of a VideoFrame denotes the number of pixel samples stored vertically for each frame.

The visibleWidth attribute of a VideoFrame denotes the number of horizontal pixel samples that should be visible to the user.

The visibleHeight attribute of a VideoFrame denotes the number of vertical pixel samples that should be visible to the user.

Note: An image’s clean aperture is a region of video free from transition artifacts caused by the encoding of the signal. This is the region of video that should be displayed. visibleWidth and visibleHeight denote the frame’s clean aperture region. The clean aperture is usually in the center of the production aperture, which might contain some details along the edges of the image. The codedWidth and codedHeight constitute the production aperture of the image.

How to express encoded size vs. visible size vs. natural size WICG#26

7.1.1. Create VideoFrame

input

init, a dictionary object of type VideoFrameInit

source, an ImageBitmap object.

output

frame_instance, a VideoFrame object.

These steps are run in the constructor of a new VideoFrame object:

  1. If source is null:

    1. Throw a "NotFoundError" DOMException and abort these steps.

  2. Set codedWidth equal to source.width.

  3. Set codedHeight equal to source.height.

  4. Set visibleWidth equal to source.width.

  5. Set visibleHeight equal to source.height.

  6. Set timestamp equal to init.timestamp.

  7. If init.duration is set, set duration equal to init.duration.

  8. Allocate sufficient memory for [[frame]] and copy the image data from source into it.
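For example, a VideoFrame might be constructed as follows (a non-normative sketch; the image source, timestamp, and duration values are assumptions):

// Non-normative sketch: wrap an ImageBitmap in a VideoFrame.
// 'frame.png' is a hypothetical image; any ImageBitmap source works.
const response = await fetch('frame.png');
const bitmap = await createImageBitmap(await response.blob());

// Timestamp and duration are in microseconds.
const frame = new VideoFrame({ timestamp: 0, duration: 33_333 }, bitmap);
console.log(frame.codedWidth, frame.codedHeight);  // mirrors the bitmap size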

7.1.2. VideoFrame.createImageBitmap() method

output

p, a Promise object.

The createImageBitmap method must run these steps:
  1. Let p be a new Promise object.

  2. If the VideoFrame does not contain a valid [[frame]]:

    1. Reject p with an "InvalidAccessError" DOMException.

    2. Return p and abort these steps.

  3. Let imageBitmap be a new ImageBitmap object.

  4. Set imageBitmap’s bitmap data from [[frame]].

  5. Run this step in parallel:

    1. Resolve p with imageBitmap.

  6. Return p.

7.1.3. VideoFrame.release() method

The release() method must run these steps:
  1. Release all resources allocated to [[frame]].

  2. Set all attributes equal to zero.

[ttoivone] From the JavaScript perspective, release is not strictly needed: an application could simply clear all references to a VideoFrame object and let the garbage collector release the memory. Should guidelines be given here on when the function needs to be called explicitly?

7.2. EncodedVideoChunk interface

enum EncodedVideoChunkType {
  "key",
  "delta",
};
interface EncodedVideoChunk {
  constructor(EncodedVideoChunkType chunk_type, unsigned long long chunk_timestamp, BufferSource chunk_data);
  constructor(EncodedVideoChunkType chunk_type, unsigned long long chunk_timestamp, unsigned long long chunk_duration, BufferSource chunk_data);
  readonly attribute EncodedVideoChunkType type;
  readonly attribute unsigned long long timestamp;  // microseconds
  readonly attribute unsigned long long? duration;  // microseconds
  readonly attribute ArrayBuffer data;
};

The type attribute of an EncodedVideoChunk is set to key if the encoded video frame stored in the chunk is a key frame (i.e., a frame that can be decoded independently, without referring to other frames); otherwise it is set to delta.

The timestamp attribute of an EncodedVideoChunk represents the sampling instant of the data in the EncodedVideoChunk, in microseconds from a fixed origin.

The duration attribute of an EncodedVideoChunk represents the time interval, in microseconds, for which the video composition should render the chunk’s decoded output.

The data attribute of an EncodedVideoChunk stores the video frame in encoded form.

7.2.1. Create EncodedVideoChunk

input

chunk_type, an EncodedVideoChunkType value.

chunk_timestamp, an unsigned long long value.

chunk_duration, an unsigned long long value (optional).

chunk_data, a BufferSource object.

output

chunk_instance, an EncodedVideoChunk object.

These steps are run in the constructor of a new EncodedVideoChunk object:
  1. If chunk_data is not a valid BufferSource:

    1. Throw a "NotFoundError" DOMException and abort these steps.

  2. Set type equal to chunk_type.

  3. Set timestamp equal to chunk_timestamp.

  4. Set duration equal to chunk_duration if chunk_duration argument is given to the constructor.

  5. Create a new ArrayBuffer, data, and copy the bytes from chunk_data into it.
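For example, chunks might be constructed from demuxed bytes as follows (a non-normative sketch; demuxedKeyBytes and demuxedDeltaBytes are hypothetical Uint8Arrays produced by an app-provided demuxer):

// Non-normative sketch: wrap demuxed bytes in EncodedVideoChunks.
const keyChunk = new EncodedVideoChunk('key', 0, demuxedKeyBytes);

// The four-argument form also supplies a duration (microseconds).
const deltaChunk = new EncodedVideoChunk('delta', 33_333, 33_333, demuxedDeltaBytes);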

7.3. AudioPacket interface

7.4. EncodedAudioChunk interface

7.5. WebCodecs callbacks

callback WebCodecsErrorCallback = void (DOMException error);
callback VideoFrameOutputCallback = void (VideoFrame output);
callback VideoEncoderOutputCallback = void (EncodedVideoChunk chunk);

7.6. VideoDecoder interface

[Exposed=(Window)]
interface VideoDecoder {
    constructor(VideoDecoderInit init);

    Promise<void> configure(EncodedVideoConfig config);
    Promise<void> decode(EncodedVideoChunk chunk);
    Promise<void> flush();
    Promise<void> reset();

    readonly attribute long decodeQueueSize;
    readonly attribute long decodeProcessingCount;
};
dictionary VideoDecoderInit {
  VideoFrameOutputCallback output;
  WebCodecsErrorCallback error;
};
dictionary EncodedVideoConfig {
  required DOMString codec;
  BufferSource description;
  double sampleAspect;
};

A VideoDecoder object processes a queue of configure, decode, and flush requests. Requests are taken from the queue sequentially but may be processed concurrently. A VideoDecoder object has an associated platform decoder.

7.6.1. VideoDecoder.decodeQueueSize

The decodeQueueSize attribute of a VideoDecoder denotes the number of queued decode requests, excluding those that are already being processed or have finished processing. Applications can minimize underflow by enqueueing decode requests until decodeQueueSize is sufficiently large.

7.6.2. VideoDecoder.decodeProcessingCount

The decodeProcessingCount attribute of a VideoDecoder denotes the number of decode requests currently being processed. Applications can minimize resource consumption and decode latency by enqueueing decode requests only when decodeQueueSize and decodeProcessingCount are small.
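A non-normative sketch of such backpressure, with app-chosen thresholds as assumptions:

// Non-normative sketch: only enqueue a decode when the decoder is not
// already saturated. The threshold values are assumptions.
const kMaxQueuedDecodes = 4;
const kMaxInFlightDecodes = 2;

function maybeDecode(decoder, chunk) {
  if (decoder.decodeQueueSize < kMaxQueuedDecodes &&
      decoder.decodeProcessingCount < kMaxInFlightDecodes) {
    decoder.decode(chunk);
    return true;
  }
  return false;  // caller retries later, e.g. from the output callback
}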

7.6.3. VideoDecoder Callbacks

The VideoFrameOutputCallback, denoted by output, is for emitting VideoFrames. The WebCodecsErrorCallback, denoted by error, is for emitting decode errors.

7.6.4. VideoDecoder internal slots

Instances of VideoDecoder are created with the internal slots described in the following table:

Internal Slot Description (non-normative)
[[request]] A first-in first-out (FIFO) queue for storing requests, each of which is of type "configure", "decode", "flush", or "reset". The queue is initially empty.
[[requested_decodes]] An integer representing the number of decode requests currently being processed by the associated platform decoder. It is initially set to 0.
[[requested_resets]] An integer representing the number of reset requests currently being processed by the associated platform decoder. It is initially set to 0.
[[pending_decodes]] A list of pending decode requests currently being processed by the associated platform decoder. It is initially empty.
[[platform_decoder]] A reference to the platform interfaces with which the user agent interacts to obtain a decoded VideoFrame. The platform decoder can be defined by the underlying platform (e.g., a native media framework). It is initially unset.
[[output_callback]] A callback which is called when the VideoDecoder finishes decoding and has a decoded VideoFrame as output.
[[error_callback]] A callback which is called when the VideoDecoder encounters an error while decoding.
[[configured]] A boolean flag indicating whether the VideoDecoder has been configured.

7.6.5. Create VideoDecoder

input

init, a VideoDecoderInit object.

output

decoder_instance, a VideoDecoder object.

These steps are run in the constructor of a new VideoDecoder object:

  1. Set [[requested_decodes]] to 0.

  2. Set [[requested_resets]] to 0.

  3. Set [[pending_decodes]] to an empty list.

  4. Set [[request]] to an empty list.

  5. Set [[platform_decoder]] to null.

  6. Set [[output_callback]] to init.output.

  7. Set [[error_callback]] to init.error.

  8. Set [[configured]] to false.

7.6.6. VideoDecoder.configure() method

input

config, an EncodedVideoConfig object.

output

p, a Promise object.

The configure() method must run these steps:

  1. Let p be a new Promise object.

  2. Perform codec validation:

    1. If config.codec is null:

      1. Reject p with an "NotAllowedError" DOMException.

      2. Return p and abort these steps.

    2. If config.codec is not among the set of allowed codecs:

      1. Reject p with an "NotSupportedError" DOMException.

      2. Return p and abort these steps.

  3. If there does not exist a platform decoder that can fulfill the requirements set in config:

    1. Reject p with an "NotSupportedError" DOMException.

    2. Return p and abort these steps.

  4. Run these steps in parallel:

    1. Flush decode requests:

      1. Let pending be the union of the sets of items in [[pending_decodes]] and [[request]].

      2. Wait until none of the items in pending remain in [[pending_decodes]] or [[request]].

    2. Set [[platform_decoder]] to point to the matching platform decoder.

    3. Set [[configured]] to true.

    4. Resolve p.

  5. Return p.

Note: After the configure() call, the decoder is in the newly initialized state so the next chunk to be decoded must be a keyframe.
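For example (a non-normative sketch; avcDescription stands for codec-specific out-of-band data, such as bytes extracted from the container, and is an assumption):

// Non-normative sketch: configure a decoder before the first decode() call.
await videoDecoder.configure({
  codec: 'avc1.42001e',
  description: avcDescription  // optional BufferSource; codec-specific
});
// The next chunk passed to decode() must now be of type 'key'.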

7.6.7. VideoDecoder.decode() method

input

chunk, a EncodedVideoChunk object.

output

p, a Promise object.

The decode() method must run these steps:

  1. Let p be a new Promise object.

  2. If [[platform_decoder]] is null:

    1. Reject p with an "InvalidStateError" DOMException.

    2. Return p and abort these steps.

  3. If chunk.type is not key and [[platform_decoder]] is newly initialized:

    1. Reject p with an "InvalidStateError" DOMException.

    2. Return p and abort these steps.

  4. If [[platform_decoder]] can accept more work:

    1. Add chunk into [[pending_decodes]].

    2. Let video_frame be a new instance of VideoFrame and associate it with chunk.

    3. [[platform_decoder]] should start decoding chunk into video_frame.

    4. Increment decodeProcessingCount by 1.

  5. Otherwise:

    1. Add chunk at the end of [[request]] queue.

    2. Increment decodeQueueSize by 1.

  6. Run this step in parallel:

    1. Resolve p.

  7. Return p.

Note: Applications can use decodeQueueSize together with decodeProcessingCount to apply backpressure and determine whether [[platform_decoder]] can accept new work right away.

7.6.8. VideoDecoder.flush() method

output

p, a Promise object.

The flush() method must run these steps:

  1. Let p be a new Promise object.

  2. Flush decode requests:

    1. Let pending be the union of the sets of items in [[pending_decodes]] and [[request]].

    2. Wait until none of the items in pending remain in [[pending_decodes]] or [[request]].

  3. Run this step in parallel:

    1. Resolve p.

  4. Return p.

7.6.9. VideoDecoder.reset() method

output

p, a Promise object.

The reset() method must run these steps:

  1. Let p be a new Promise object.

  2. Abort all work performed by [[platform_decoder]]. The VideoFrameOutputCallback will not be called for the aborted requests.

  3. Remove all items from [[pending_decodes]].

  4. Remove all items from [[request]].

  5. Set [[requested_decodes]] to 0.

  6. Set decodeProcessingCount to 0.

  7. Set decodeQueueSize to 0.

  8. Set [[configured]] to false.

  9. Run this step in parallel:

    1. Resolve p.

  10. Return p.

Note: After the reset() call, the decoder is in the newly initialized state so the next chunk to be decoded must be a keyframe.
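For example, an application implementing seeking might combine reset() with the keyframe requirement as follows (a non-normative sketch; findKeyChunkAt and chunksFollowing are hypothetical demuxer helpers):

// Non-normative sketch: reset, reconfigure, and resume from a key chunk.
async function seek(decoder, timestampUs) {
  await decoder.reset();
  // reset() sets [[configured]] back to false, so configure again.
  await decoder.configure({ codec: 'vp8' });
  const keyChunk = findKeyChunkAt(timestampUs);  // must be of type 'key'
  decoder.decode(keyChunk);
  for (const chunk of chunksFollowing(keyChunk))
    decoder.decode(chunk);
}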

7.7. VideoEncoder interface

dictionary VideoEncoderTuneOptions {
  unsigned long long bitrate;
  double framerate;
  required unsigned long width;
  required unsigned long height;
};
dictionary VideoEncoderInit {
  required DOMString codec;
  DOMString profile;
  required VideoEncoderTuneOptions tuneOptions;
  required VideoEncoderOutputCallback output;
  WebCodecsErrorCallback error;
};
dictionary VideoEncoderEncodeOptions {
  boolean? keyFrame;
};
[Exposed=(Window)]
interface VideoEncoder {
  constructor();
  Promise<void> configure(VideoEncoderInit init);
  Promise<void> encode(VideoFrame frame, optional VideoEncoderEncodeOptions options);
  Promise<void> tune(VideoEncoderTuneOptions options);
  Promise<void> flush();
  Promise<void> close();
};

A VideoEncoder object processes a queue of configure, encode, tune, and flush requests. Requests are taken from the queue sequentially but may be processed concurrently. A VideoEncoder object has an associated platform encoder.

7.7.1. VideoEncoder internal slots

Instances of VideoEncoder are created with the internal slots described in the following table:

Internal Slot Description (non-normative)
[[request]] A double-ended queue for storing requests, each of which is of type "configure", "encode", "tune", "flush", or "close". It is initially empty.
[[requested_encodes]] An integer representing the number of encode requests currently being processed by the associated platform encoder. It is initially set to 0.
[[requested_resets]] An integer representing the number of reset requests currently being processed by the associated platform encoder. It is initially set to 0.
[[pending_encodes]] A set of pending encode requests currently being processed by the associated platform encoder. It is initially empty.
[[platform_encoder]] A reference to the platform interfaces with which the user agent interacts to encode a VideoFrame. The platform encoder can be defined by the underlying platform (e.g., a native media framework). It is initially unset.
[[tune_options]] The most recently set VideoEncoderTuneOptions.

7.7.2. Create a VideoEncoder

output

encoder_instance, a VideoEncoder object.

These steps are run in the constructor of a new VideoEncoder object:
  1. Set [[requested_encodes]] to 0.

  2. Set [[requested_resets]] to 0.

  3. Set [[pending_encodes]] to an empty set.

  4. Set [[request]] to an empty queue.

  5. Set [[platform_encoder]] to null.

  6. Set [[output_callback]] to null.

  7. Set [[error_callback]] to null.

7.7.3. VideoEncoder.configure() method

input

init, a VideoEncoderInit object.

output

p, a Promise object.

The configure() method must run these steps:
  1. Let p be a new Promise object.

  2. If init is not a valid VideoEncoderInit:

    1. Reject p with a newly created TypeError.

    2. Return p and abort these steps.

  3. Set [[tune_options]] to init.tuneOptions.

  4. Set [[output_callback]] to init.output.

  5. Set [[error_callback]] to init.error.

  6. Run this step in parallel:

    1. Resolve p.

  7. Return p.

7.7.4. VideoEncoder.encode() method

input

frame, a VideoFrame object.

options, a VideoEncoderEncodeOptions object (optional).

output

p, a Promise object.

The encode() method must run these steps:
  1. Let p be a new Promise object.

  2. If options is null, set options to {}.

  3. Let request be a triplet {frame, options, [[tune_options]]}.

  4. If [[platform_encoder]] is null:

    1. Reject p with an "InvalidStateError" DOMException.

    2. Abort these steps and return p.

  5. If [[platform_encoder]] can accept more work:

    1. Add request into [[pending_encodes]].

    2. Increment [[requested_encodes]] by 1.

    3. Start encoding request.frame with [[platform_encoder]], using request.tune_options and request.options.

  6. Otherwise:

    1. Add request at the back of the [[request]] queue.

  7. Run this step in parallel:

    1. Resolve p.

  8. Return p.
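For example, an application might force periodic key frames through VideoEncoderEncodeOptions (a non-normative sketch; the interval is an assumption):

// Non-normative sketch: request a key frame every 30th frame.
let frameCount = 0;
function encodeFrame(encoder, frame) {
  const keyFrame = (frameCount++ % 30) === 0;
  encoder.encode(frame, { keyFrame });
}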

7.7.5. VideoEncoder.tune() method

input

options, a VideoEncoderTuneOptions object.

output

p, a Promise object.

The tune() method must run these steps:
  1. Let p be a new Promise object.

  2. Let tune_options be the options argument, an instance of VideoEncoderTuneOptions.

  3. If the parameters in tune_options are not valid:

    1. Reject p with a "NotSupportedError" DOMException.

    2. Return p and abort these steps.

  4. Set [[tune_options]] to tune_options.

  5. Run this step in parallel:

    1. Resolve p.

  6. Return p.
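For example, an application might respond to network congestion by tuning down the target bitrate and resolution (a non-normative sketch; the values and the congestion signal are assumptions):

// Non-normative sketch: adapt encoder output when bandwidth drops.
async function onCongestion(encoder) {
  await encoder.tune({
    bitrate: 300_000,  // down from e.g. 1 Mbps
    framerate: 15,
    width: 640,        // width and height are required members
    height: 480
  });
}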

7.7.6. VideoEncoder.flush() method

output

p, a Promise object.

The flush() method must run these steps:
  1. Let p be a new Promise object.

  2. Flush encode requests:

    1. Let pending be the union of the sets of items in [[pending_encodes]] and [[request]].

    2. Wait until none of the items in pending remain in [[pending_encodes]] or [[request]].

  3. Run this step in parallel:

    1. Resolve p.

  4. Return p.

7.7.7. VideoEncoder.close() method

output

p, a Promise object.

The close() method must run these steps:
  1. Let p be a new Promise object.

  2. Flush encode requests:

    1. Let pending be the union of the sets of items in [[pending_encodes]] and [[request]].

    2. Wait until none of the items in pending remain in [[pending_encodes]] or [[request]].

  3. Set the [[platform_encoder]] to null.

  4. Run this step in parallel:

    1. Resolve p.

  5. Return p.
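For example, at end of stream an application might drain and shut down the encoder (a non-normative sketch):

// Non-normative sketch: wait for all pending encodes, then release
// the platform encoder.
async function finishEncoding(encoder) {
  await encoder.flush();
  await encoder.close();
}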

7.8. AudioDecoder interface

7.9. AudioEncoder interface

8. Examples

This code sample illustrates video rendering to a Canvas for extremely low-latency streaming (e.g., cloud gaming).
// App provides stream of encoded chunks to decoder.
function streamEncodedChunks(decodeCallback) { ... }

// The document contains a canvas for displaying VideoFrames.
const canvasElement = document.getElementById("canvas");
const canvasContext = canvasElement.getContext('bitmaprenderer');

async function paintFrameToCanvas(videoFrame) {
  // Paint every video frame ASAP for lowest latency.
  canvasContext.transferFromImageBitmap(await videoFrame.createImageBitmap());
}

const videoDecoder = new VideoDecoder({
  output: paintFrameToCanvas,
  error: (e) => console.error('Decode Error', e)
});

videoDecoder.configure({codec: 'vp8'}).then(() => {
  // The app fetches VP8 chunks, feeding each chunk to the decode
  // callback as fast as possible. Real apps must also monitor
  // decoder backpressure to ensure the decoder is keeping up.
  streamEncodedChunks(videoDecoder.decode.bind(videoDecoder));
}).catch(() => {
  // App provides fallback logic when config not supported.
  ...
});
This code sample illustrates transcoding or offline encode/decode.
// App demuxes (decontainerizes) input and makes repeated calls to the provided
// callbacks to feed the decoders.
function streamEncodedChunks(decodeAudioCallback, decodeVideoCallback) { ... }

// App provides a way to mux (containerize) media.
function muxAudio(encodedChunk) { ... }
function muxVideo(encodedChunk) { ... }

// The app provides error handling (e.g. shutdown w/ UI message)
function onCodecError(error) { ... }

// Returns an object { audioEncoder, videoEncoder }.
// Encoded outputs sent immediately to app provided muxer.
async function buildAndConfigureEncoders() {
  // Build encoders.
  let audioEncoder = new AudioEncoder({ output: muxAudio, error: onCodecError });
  let videoEncoder = new VideoEncoder({ output: muxVideo, error: onCodecError });

  // Configure and reset if not supported. More sophisticated fallback recommended.
  try {
    await audioEncoder.configure({ codec: 'opus', ... });
  } catch (error) {
    audioEncoder = null;
  }
  try {
    await videoEncoder.configure({ codec : 'vp8', ... });
  } catch (error) {
    videoEncoder = null;
  }

  return {audioEncoder, videoEncoder };
}

// Returns an object { audioDecoder, videoDecoder }.
// Decoded outputs sent immediately to the corresponding encoder for re-encoding.
async function buildAndConfigureDecoders(audioEncoder, videoEncoder) {
  // Bind encode callbacks.
  const reEncodeAudio = audioEncoder.encode.bind(audioEncoder);
  const reEncodeVideo = videoEncoder.encode.bind(videoEncoder);

  // Build decoders.
  let audioDecoder = new AudioDecoder({ output: reEncodeAudio, error: onCodecError });
  let videoDecoder = new VideoDecoder({ output: reEncodeVideo, error: onCodecError });

  // Configure and reset if not supported. More sophisticated fallback recommended.
  try {
    await audioDecoder.configure({ codec: 'aac', ... });
  } catch (error) {
    audioDecoder = null;
  }
  try {
    await videoDecoder.configure({ codec : 'avc1.42001e', ... });
  } catch (error) {
    videoDecoder = null;
  }

  return { audioDecoder, videoDecoder};
}

// Setup encoders.
let { audioEncoder, videoEncoder } = await buildAndConfigureEncoders();

// App handles unsupported configuration.
if (audioEncoder == null || videoEncoder == null)
  return;

// Setup decoders. Provide encoders to receive decoded output.
let { audioDecoder, videoDecoder } = await buildAndConfigureDecoders(audioEncoder, videoEncoder);

// App handles unsupported configuration.
if (audioDecoder == null || videoDecoder == null)
  return;

// Start streaming encoded chunks to the decoders, repeatedly calling
// the provided callbacks for each chunk.
// Decoded output will be fed to encoders for re-encoding.
// Encoded output will be fed to muxer.
streamEncodedChunks(audioDecoder.decode.bind(audioDecoder),
          videoDecoder.decode.bind(videoDecoder));
This code sample illustrates real-time communication using SVC.
videoEncoder.configure({
  codec: 'vp9',
  tuneOptions: {
    bitrate: 1_000_000,
    framerate: 24,
    width: 1024,
    height: 768
  },
  // Two spatial layers with two temporal layers each.
  // Note: a layered (SVC) configuration is not yet defined by the IDL in
  // this document; the shape below is illustrative only.
  layers: [{
      // Quarter size base layer.
      id: 'p0',
      temporalSlots: [0],
      scaleDownBy: 2,
      dependsOn: ['p0'],
    }, {
      id: 'p1',
      temporalSlots: [1],
      scaleDownBy: 2,
      dependsOn: ['p0'],
    }, {
      id: 's0',
      temporalSlots: [0],
      dependsOn: ['p0', 's0'],
    }, {
      id: 's1',
      temporalSlots: [1],
      dependsOn: ['p1', 's0', 's1'],
    }],
});

9. Acknowledgements

The following people have greatly contributed to this specification through extensive discussions on GitHub:

Conformance

Conformance requirements are expressed with a combination of descriptive assertions and RFC 2119 terminology. The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in the normative parts of this document are to be interpreted as described in RFC 2119. However, for readability, these words do not appear in all uppercase letters in this specification.

All of the text of this specification is normative except sections explicitly marked as non-normative, examples, and notes. [RFC2119]

Examples in this specification are introduced with the words “for example” or are set apart from the normative text with class="example", like this:

This is an example of an informative example.

Informative notes begin with the word “Note” and are set apart from the normative text with class="note", like this:

Note, this is an informative note.

Index

Terms defined by this specification

Terms defined by reference

References

Normative References

[HTML]
Anne van Kesteren; et al. HTML Standard. Living Standard. URL: https://html.spec.whatwg.org/multipage/
[RFC2119]
S. Bradner. Key words for use in RFCs to Indicate Requirement Levels. March 1997. Best Current Practice. URL: https://tools.ietf.org/html/rfc2119
[WEBAUDIO]
Paul Adenot; Raymond Toy. Web Audio API. URL: https://webaudio.github.io/web-audio-api/
[WebIDL]
Boris Zbarsky. Web IDL. URL: https://heycam.github.io/webidl/

Informative References

[CLOUD-GAMING-WEBCODECS]
Peter Thatcher. Cloud gaming with WebCodecs. 2018. Note. URL: https://www.w3.org/2018/12/games-workshop/slides/21-webtransport-webcodecs.pdf
[WEBCODECS-EXAMPLES]
Chris Cunningham. WebCodecs Examples. 2020. Note. URL: https://github.com/WICG/web-codecs/blob/master/explainer.md#examples
[WEBCODECS-USECASES]
Chris Cunningham. WebCodecs Use Cases. 2020. Note. URL: https://github.com/WICG/web-codecs/blob/master/explainer.md#key-use-cases

IDL Index

dictionary VideoFrameInit {
  unsigned long long timestamp;  // microseconds
  unsigned long long? duration;  // microseconds
};

[Exposed=(Window)]
interface VideoFrame {
    constructor(VideoFrameInit init, ImageBitmap source);
    void release();
    [NewObject] Promise<ImageBitmap> createImageBitmap(optional ImageBitmapOptions options = {});
    readonly attribute unsigned long long timestamp;  // microseconds
    readonly attribute unsigned long long? duration;  // microseconds
    readonly attribute unsigned long codedWidth;
    readonly attribute unsigned long codedHeight;

    readonly attribute unsigned long visibleWidth;
    readonly attribute unsigned long visibleHeight;
};

enum EncodedVideoChunkType {
  "key",
  "delta",
};
interface EncodedVideoChunk {
  constructor(EncodedVideoChunkType chunk_type, unsigned long long chunk_timestamp, BufferSource chunk_data);
  constructor(EncodedVideoChunkType chunk_type, unsigned long long chunk_timestamp, unsigned long long chunk_duration, BufferSource chunk_data);
  readonly attribute EncodedVideoChunkType type;
  readonly attribute unsigned long long timestamp;  // microseconds
  readonly attribute unsigned long long? duration;  // microseconds
  readonly attribute ArrayBuffer data;
};

callback WebCodecsErrorCallback = void (DOMException error);
callback VideoFrameOutputCallback = void (VideoFrame output);
callback VideoEncoderOutputCallback = void (EncodedVideoChunk chunk);

[Exposed=(Window)]
interface VideoDecoder {
    constructor(VideoDecoderInit init);

    Promise<void> configure(EncodedVideoConfig config);
    Promise<void> decode(EncodedVideoChunk chunk);
    Promise<void> flush();
    Promise<void> reset();

    readonly attribute long decodeQueueSize;
    readonly attribute long decodeProcessingCount;
};

dictionary VideoDecoderInit {
  VideoFrameOutputCallback output;
  WebCodecsErrorCallback error;
};

dictionary EncodedVideoConfig {
  required DOMString codec;
  BufferSource description;
  double sampleAspect;
};

dictionary VideoEncoderTuneOptions {
  unsigned long long bitrate;
  double framerate;
  required unsigned long width;
  required unsigned long height;
};

dictionary VideoEncoderInit {
  required DOMString codec;
  DOMString profile;
  required VideoEncoderTuneOptions tuneOptions;
  required VideoEncoderOutputCallback output;
  WebCodecsErrorCallback error;
};

dictionary VideoEncoderEncodeOptions {
  boolean? keyFrame;
};

[Exposed=(Window)]
interface VideoEncoder {
  constructor();
  Promise<void> configure(VideoEncoderInit init);
  Promise<void> encode(VideoFrame frame, optional VideoEncoderEncodeOptions options);
  Promise<void> tune(VideoEncoderTuneOptions options);
  Promise<void> flush();
  Promise<void> close();
};

Issues Index

Support VideoFrame creation from yuv data? See WICG#45
How to express encoded size vs. visible size vs. natural size WICG#26
[ttoivone] From the JavaScript perspective, release is not strictly needed: an application could simply clear all references to a VideoFrame object and let the garbage collector release the memory. Should guidelines be given here on when the function needs to be called explicitly?