
AV1: Difference between revisions

From Wikipedia, the free encyclopedia
→‎Technology: PVQ is dropped, Daala EC is in, CDEF looking good
→‎Technology: More on CDEF & PVQ
Line 50: Line 50:
Prediction can happen for bigger units (≤128×128), and they can be subpartitioned in more ways. Predictions can be combined in more advanced ways (than a uniform average) in a block, including smooth and sharp gradients in different directions. This allows either inter–inter or inter–intra predictions to be combined in the same block.<ref name="VP10" /><ref name="VP10_1_year_in_presentation" />


Two different non-binary [[arithmetic coding]] entropy coders were considered for replacing VP9's binary entropy coder: Daala's entropy coder (''Daala EC'') and [[Asymmetric Numeral Systems]]. The use of non-binary coding helps evade patents, but also adds bit-level parallelism to an otherwise serial process, reducing clock rate demands on hardware implementations.<ref name="tb_lca"/> Of the two contenders, ANS is the fastest to decode in software, but ''Daala EC'' is more hardware friendly.<ref name="tb_lca"/> As of late 2017, ''Daala EC'' has replaced VP9's entropy coder, with ANS still retained.
Two different non-binary [[arithmetic coding]] entropy coders were considered for replacing VP9's binary entropy coder: Daala's entropy coder (''Daala EC'') and [[Asymmetric Numeral Systems]]. The use of non-binary coding helps evade patents, but also adds bit-level parallelism to an otherwise serial process, reducing clock rate demands on hardware implementations.<ref name="tb_lca_slides"/> Of the two contenders, ANS is the fastest to decode in software, but ''Daala EC'' is more hardware friendly.<ref name="tb_lca"/> As of late 2017, ''Daala EC'' has replaced VP9's entropy coder, with ANS still retained.


The integration of Daala's [[Perceptual Vector Quantization]] proved too complex within the framework of AV1 and was dropped.
The integration of Daala's [[Perceptual Vector Quantization]] proved too complex within the framework of AV1, encoding-wise.<ref name="tb_lca_slides"/> The ''Rate Distortion'' heuristic framework aims to speed up the encoder by a sizable factor, PVQ or not,<ref name="tb_lca_slides"/> but PVQ was ultimately dropped.


For the in-loop filtering step, the integration of Thor's constrained low-pass filter and Daala's directional deringing filter has been fruitful: The combined ''Constrained Directional Enhancement Filter'' (CDEF) exceeds the results of using the original filters separately or together.<ref name="cdef"/><ref name="netvc98"/>
For the in-loop filtering step, the integration of Thor's constrained low-pass filter and Daala's directional deringing filter has been fruitful: The combined ''Constrained Directional Enhancement Filter'' (CDEF) exceeds the results of using the original filters separately or together.<ref name="cdef"/><ref name="netvc99"/>


[[File:VP9_tiles_720p.svg|thumb|upright=0.9|[[Parallel computing|Parallelism]] within a frame is possible in tiles (vertical) and tile rows (horizontal).]]

Revision as of 22:49, 29 October 2017

AV1
Internet media type: video/AV1, video/webm; codecs="av01.*"
Developed by: Alliance for Open Media
Type of format: Compressed video
Contained by: WebM
Extended from: VP9
Free format? Yes

AOMedia Video 1 (AV1) is an open, royalty-free video coding format designed for video transmissions over the Internet. It is being developed by the Alliance for Open Media (AOMedia), a consortium of leading firms from the semiconductor industry, video on demand providers, and web browser developers, founded in 2015. It is the primary contender for standardization by the video standard working group NetVC of the Internet Engineering Task Force (IETF).[1] The group has put together a list of criteria to be met by the new video standard.[2] It is meant to succeed its predecessor VP9 and compete with HEVC/H.265 from the Moving Picture Experts Group.[3]

AV1 can be used together with the audio format Opus in a future version of the WebM format for HTML5 web video and WebRTC.[4]

History

The first official announcement of the project came with the press release on the formation of the Alliance. The growing usage of its predecessor VP9 is attributed to confidence in the Alliance and in the development of AV1, as well as to the costly and complicated licensing situation of HEVC, MPEG's competing format.[5][6]

The roots of the project precede the Alliance, however. Individual contributors started experimental technology platforms years before: Xiph's/Mozilla's Daala already published code in 2010, VP10 was announced on 12 September 2014, and Cisco's Thor was published on 11 August 2015. The first version 0.1.0 of the AV1 reference codec was published on 7 April 2016.

The bitstream format is projected to be frozen in Q4[7][8] of 2017[needs update]. According to Mukund Srinivasan, chief business officer of AOM member Ittiam, early hardware support will be dominated by software running on non-CPU hardware (such as GPGPU, DSP or shader programs, as is the case with some VP9 hardware implementations), as fixed-function hardware will take 12–18 months after bitstream freeze until chips are available, plus 6 months for products based on those chips to hit the market.[8]

Purpose

The purpose of AV1 is to be as good as possible under royalty-free patent licensing. It is therefore crucial to ensure, during development, that it does not infringe on patents of competing companies. This contrasts with its main competitor HEVC, for which IPR review was not part of the standardization process.[5] The latter practice is stipulated in ITU-T's definition of an open standard. The case of HEVC's independent patent pools has been characterized by critical observers as a failure of price management.[9][10]

Under patent rules adopted from the World Wide Web Consortium (W3C), technology contributors license their AV1-connected patents to anyone, anywhere, anytime based on reciprocity, i.e. as long as the user does not engage in patent litigation.[11] As a defensive condition, anyone engaging in patent litigation loses the right to the patents of all patent holders.[5]

It aims for state-of-the-art performance, with a noticeable compression efficiency advantage at only slightly increased coding complexity. The efficiency goal is a 25% improvement over HEVC.[2] AV1 is primarily intended for lossy encoding, although lossless compression is supported as well.[12]

It is specifically designed for real-time applications (especially WebRTC) and for more demanding usage scenarios (higher resolutions such as UHD, wider color gamuts, higher frame rates) than are typical for the current generation of video formats (H.264); this is where it is expected to achieve its biggest efficiency gains. It is therefore planned to support the color space from ITU-R Recommendation BT.2020 and 10 and 12 bits of precision per color component.[13]

Cisco is a manufacturer of videoconferencing equipment, and their Thor contributions aim at "reasonable compression at only moderate complexity".[10]

Technology

AV1 introduces "T-shaped" partitioning schemes for coding units, a feature from VP10

AV1 is a traditional block-based frequency transform format featuring new techniques taken from several experimental formats that have been testing technology for a next-generation format after HEVC and VP9.[14] Based on Google's experimental VP9 evolution project VP10,[15] AV1 incorporates additional techniques developed in Xiph's/Mozilla's Daala and Cisco's Thor.

AV1 performs internal processing in higher precision (10 or 12 bits per sample), which leads to compression improvement due to smaller rounding errors in reference imagery. For intra prediction, there are more (than 8) angles for directional prediction and weighted filters for per-pixel extrapolation. Temporal prediction can use more references. Prediction can happen for bigger units (≤128×128), and they can be subpartitioned in more ways. Predictions can be combined in more advanced ways (than a uniform average) in a block, including smooth and sharp gradients in different directions. This allows either inter–inter or inter–intra predictions to be combined in the same block.[16][17]
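The difference between a uniform average and a gradient-weighted combination of two predictions can be illustrated with a minimal sketch. It is not taken from the AV1 codebase: the function names, the 0 to 64 weight range, and the mask shapes are assumptions chosen for illustration only.

```python
def blend(pred_a, pred_b, mask, max_w=64):
    """Blend two predictions per pixel; mask holds pred_a's weight (0..max_w)."""
    out = []
    for row_a, row_b, row_m in zip(pred_a, pred_b, mask):
        out.append([
            (m * a + (max_w - m) * b + max_w // 2) // max_w  # rounded weighted sum
            for a, b, m in zip(row_a, row_b, row_m)
        ])
    return out


def uniform_mask(width, height, max_w=64):
    """Equal weights everywhere: the plain average of both predictions."""
    return [[max_w // 2] * width for _ in range(height)]


def horizontal_ramp(width, height, max_w=64):
    """Smooth gradient: pred_a dominates on the left, pred_b on the right."""
    return [[max_w - (max_w * x) // (width - 1) for x in range(width)]
            for _ in range(height)]


def vertical_edge(width, height, split, max_w=64):
    """Sharp transition: pred_a left of column `split`, pred_b from there on."""
    return [[max_w if x < split else 0 for x in range(width)]
            for _ in range(height)]


# Two flat 4x2 predictions, e.g. one inter and one intra prediction.
pred_a = [[100] * 4 for _ in range(2)]
pred_b = [[200] * 4 for _ in range(2)]

averaged = blend(pred_a, pred_b, uniform_mask(4, 2))
smooth = blend(pred_a, pred_b, horizontal_ramp(4, 2))
sharp = blend(pred_a, pred_b, vertical_edge(4, 2, split=2))
```

With the ramp mask each row moves gradually from the first prediction to the second, while the edge mask switches abruptly at the chosen column.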

Two different non-binary arithmetic coding entropy coders were considered for replacing VP9's binary entropy coder: Daala's entropy coder (Daala EC) and Asymmetric Numeral Systems (ANS). The use of non-binary coding helps evade patents, but also adds bit-level parallelism to an otherwise serial process, reducing clock rate demands on hardware implementations.[6] Of the two contenders, ANS is the faster to decode in software, but Daala EC is more hardware friendly.[5] As of late 2017, Daala EC has replaced VP9's entropy coder, with ANS still retained.
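Why a non-binary coder shortens the serial chain can be shown with a toy count of dependent decoding steps. This is a simplified sketch, not either of the actual coders: it assumes a balanced binarization tree for the binary case, and the cumulative table and function names are hypothetical.

```python
import math


def serial_steps_binary(alphabet_size):
    """Dependent binary decisions per symbol, assuming a balanced binary tree."""
    return math.ceil(math.log2(alphabet_size))


def serial_steps_multisymbol(alphabet_size):
    """A multi-symbol coder resolves the whole symbol in one decoding step."""
    return 1


def locate_symbol(cdf, value):
    """Find the symbol whose cumulative interval contains `value`.

    `cdf` holds the upper bound of each symbol's interval. The comparisons
    are independent of each other, so hardware could evaluate them in parallel.
    """
    for index, upper in enumerate(cdf):
        if value < upper:
            return index
    raise ValueError("value lies outside the coded range")


# Hypothetical 4-symbol distribution with frequencies 8, 4, 2, 2 (total 16).
cdf = [8, 12, 14, 16]
decoded = [locate_symbol(cdf, v) for v in (0, 9, 13, 15)]

# Serial steps for 1000 symbols from a 16-symbol alphabet.
binary_total = 1000 * serial_steps_binary(16)
multisymbol_total = 1000 * serial_steps_multisymbol(16)
```

Under these assumptions the binary coder needs four dependent steps per 16-ary symbol where the multi-symbol coder needs one, which is the sense in which fewer serial steps allow a lower clock rate.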

The integration of Daala's Perceptual Vector Quantization proved too complex within the framework of AV1, encoding-wise.[6] The Rate Distortion heuristic framework aims to speed up the encoder by a sizable factor, PVQ or not,[6] but PVQ was ultimately dropped.

For the in-loop filtering step, the integration of Thor's constrained low-pass filter and Daala's directional deringing filter has been fruitful: The combined Constrained Directional Enhancement Filter (CDEF) exceeds the results of using the original filters separately or together.[18][19]
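The "constrained" part of such a filter can be sketched as a function that limits how much a neighbouring pixel may pull on the pixel being filtered. The sketch below is modeled on the general form of the constraint function described for CDEF; the parameter names and the example values are assumptions, and details may differ from the 2017 codebase.

```python
def constrain(diff, strength, damping):
    """Limit the influence of a neighbour that differs from the centre pixel by `diff`.

    Small differences pass through unchanged, while large differences
    (likely real edges rather than ringing) are attenuated down to zero.
    """
    if strength == 0:
        return 0
    msb = strength.bit_length() - 1          # position of the highest set bit
    shift = max(0, damping - msb)            # larger damping gives a slower roll-off
    magnitude = min(abs(diff), max(0, strength - (abs(diff) >> shift)))
    return magnitude if diff >= 0 else -magnitude


# Example values only: strength 4 and damping 6 are hypothetical settings.
responses = [constrain(d, 4, 6) for d in (0, 2, -3, 40, 100)]
```

A neighbour that differs by 2 or 3 contributes its full difference, one that differs by 40 contributes only 2, and one that differs by 100 contributes nothing, so smoothing acts on small ripples without blurring strong edges.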

Parallelism within a frame is possible in tiles (vertical) and tile rows (horizontal).

More encoder parallelism is possible thanks to configurable prediction dependency between tile rows.[20]
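Tile-level parallelism can be sketched as splitting a frame into a grid of independent rectangles and handing each one to a worker. This is an illustration only: real tile boundaries are aligned to the codec's block structure, and `process_tile` is a hypothetical stand-in for actual encoding or decoding work.

```python
from concurrent.futures import ThreadPoolExecutor


def split(length, parts):
    """Split the range [0, length) into `parts` nearly equal contiguous ranges."""
    edges = [length * i // parts for i in range(parts + 1)]
    return list(zip(edges[:-1], edges[1:]))


def tile_grid(width, height, cols, rows):
    """Return (x0, y0, x1, y1) rectangles covering the frame, row by row."""
    return [(x0, y0, x1, y1)
            for (y0, y1) in split(height, rows)
            for (x0, x1) in split(width, cols)]


def process_tile(tile):
    """Stand-in for per-tile work; here it only reports the tile's pixel count."""
    x0, y0, x1, y1 = tile
    return (x1 - x0) * (y1 - y0)


def process_frame(width, height, cols, rows):
    """Process all tiles concurrently and return the results in tile order."""
    tiles = tile_grid(width, height, cols, rows)
    with ThreadPoolExecutor() as pool:
        return list(pool.map(process_tile, tiles))


# A 720p frame split into 4 tile columns and 2 tile rows.
results = process_frame(1280, 720, cols=4, rows=2)
```

Because no tile in this sketch depends on another, the workers can run in any order; a configurable prediction dependency between tile rows would restrict that ordering in exchange for better compression.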

The Alliance publishes a reference implementation written in C and assembly language (aomenc, aomdec) as free software under the terms of the BSD 2-Clause License.[21]

Quality and efficiency

A first comparison from the beginning of June 2016[22] found AV1 on par with HEVC, as did one using code from late January 2017.[23]

As of April 2017, using the 8 currently enabled experimental features (of 77 total), Bitmovin was able to demonstrate favorable objective metrics, as well as visual results, compared to HEVC on the Sintel and Tears of Steel animated films.[24] A follow-up comparison by Jan Ozer of Streaming Media Magazine confirmed this, and concluded that "AV1 is at least as good as HEVC now."[25]

Ozer noted that his and Bitmovin's results contradicted a comparison by the Fraunhofer Institute for Telecommunications from late 2016[26] that had found AV1 38.4% less efficient than HEVC, underperforming even AVC. He attributed the discrepancy to his use of the encoding parameters endorsed by each encoder vendor and to the newer AV1 encoder having more features.

Adoption

Alliance members are expected to be interested in adopting the format, each in its own way, once the bitstream is frozen.[13][24] The member companies represent several industries, including browser vendors (Google, Mozilla, Microsoft), content providers (Google, Netflix, Amazon, Hulu) and hardware manufacturers (Intel, AMD, ARM, Nvidia).[5][6]

Video streaming service YouTube declared intent to transition to the new format as fast as possible, starting with highest resolutions within six months after the finalization of the bitstream format.[13]

Like its predecessor VP9, AV1 will be used together with the WebM and Opus formats. These are well supported among web browsers, with the exception of Safari (desktop and mobile versions) and the discontinued Internet Explorer (prior to Edge) (see VP9 in HTML5 video § browser support).

Coding tools

As of late October 2017, 32 of 101 experimental coding tools are enabled by default in the developmental software codebase.[27] In addition to current experiments, some have also been fully integrated by having their build-time flags removed.

The development process is such that coding tools are added as experiments in the codebase, controlled by build-time flags, for review by hardware and legal teams. Once reviews are passed, the experiment can be enabled by default.[8]

Experiment names are lowercased in the configure script and uppercased in conditional compilation flags.[28][29]
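The naming convention can be sketched as a pair of small conversion helpers. The `CONFIG_` prefix is an assumption drawn from the commit titles cited in this article (for example `CONFIG_EXT_INTER` and `CONFIG_TILE_GROUPS`); the helper names are hypothetical and are not part of the build system.

```python
def compile_flag(experiment):
    """Map a configure-script experiment name to its assumed compile-time macro."""
    return "CONFIG_" + experiment.replace("-", "_").upper()


def experiment_name(flag):
    """Map an assumed compile-time macro back to the lowercase experiment name."""
    prefix = "CONFIG_"
    name = flag[len(prefix):] if flag.startswith(prefix) else flag
    return name.lower()


flags = [compile_flag(name) for name in ("ext_inter", "tile_groups", "daala_tx")]
names = [experiment_name(flag) for flag in flags]
```

Converting a name to a flag and back returns the original lowercase name, which is convenient when matching entries in the tables below against conditional compilation guards in the source.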

Former experiments that have been fully integrated

This list may or may not be complete

Historic build-time flag Explanation
alt_intra[30] A new prediction mode suitable for smooth regions[31]
cb4x4[32]
chroma_sub8x8[33]
delta_q[34]
daala_ec[35] The Daala entropy coder (a non-binary arithmetic coder)
ec_adapt[36] Adapts symbol probabilities on the fly,[31] as opposed to per frame as in VP9.[5]
ec_smallmul[37] A hardware optimization of daala_ec[38]
ext_inter[39] Extended inter[20]
ext_refs[40] Adds more reference frames, as described in Adaptive multi-reference prediction using a symmetric framework[41]
filter_7bit[42] 7-bit interpolation filters[43]
palette[44]
rect_intra_pred[45]
ref_mv[46] Better methods for coding the motion vector predictors through implicit list of spatial and temporal neighbor MVs[31]
tile_groups[47]
var_tx[48]

Current experiments

Warped motion, as seen from the front of a train: From a pioneering slow-TV program featuring 7 hours of warped motion. The warped_motion and global_motion tools in AV1 aim to reduce redundant information in motion vectors by recognizing patterns arising from camera motion.
Enabled by default Build-time flag[27] Explanation
No adapt_scan
No amvr
No ans Asymmetric numeral systems: The other non-binary arithmetic entropy coder (that is faster in software but less hardware friendly)[5]
Yes aom_qm Quantization Matrices[49]
No bgsprite
Yes cdef Constrained Directional Enhancement Filter: The merge of Daala's directional deringing filter + Thor's constrained low pass filter[18][38]
No cdef_singlepass An optimization of cdef[19]
No cfl Chroma from Luma[31]
No coef_interleave
No colorspace_headers
No compound_round
Yes compound_segment
No compound_singleref
Yes convolve_round
No ctx1d
No daala_tx4 Daala Transforms[50][51]
No daala_tx8
No daala_tx16
No daala_tx32
No daala_tx64
No daala_tx Shorthand for daala_tx{4,8,16,32,64}[28]
No dct_only
No deblock_13tap
No dependent_horztiles
Yes dist_8x8 A merge of former experiments cdef_dist and daala_dist.[29] Daala_dist is Daala's distortion function.[6]
Yes dual_filter
No entropy_stats
No eob_first
Yes ext_comp_refs
Yes ext_delta_q
Yes ext_intra Extended intra[20]
Yes ext_partition
Yes ext_partition_types
No ext_partition_types_ab
No ext_skip
No ext_tile
Yes ext_tx
No ext_warped_motion
No filter_intra Interpolate the reference samples before prediction to reduce the impact of quantization noise[31]
No fp_mb_stats
No frame_marker
No frame_sign_bias
No frame_size
No frame_superres
Yes global_motion Global Motion[20][31]
No hash_me
No horzonly_frame_superres
Yes interintra Inter-intra prediction, part of wedge partitioned prediction[17]
No inter_stats_only
No intrabc
Yes intra_edge
No jnt_comp
No kf_ctx
No lgt
No lgt_from_pred
Yes loopfiltering_across_tiles
Yes loopfilter_level
Yes loop_restoration
No lpf_direct
No lpf_sb
No lv_map
No masked_tx
No max_tile
No mfmv
Yes motion_var Renamed from obmc.[52] Overlapped Block Motion Compensation: Reduce discontinuities at block edges using different motion vectors[31]
No mrc_tx
Yes mv_compress
No ncobmc
No ncobmc_adapt_weight
Yes new_multisymbol
No new_quant
No no_frame_context_signaling
No obu
Yes one_sided_compound
No opt_ref_mv
No palette_delta_encoding
Yes palette_throughput
Yes parallel_deblocking
No q_adapt_probs
No rawbits
No rd_debug
Yes rect_tx Rectangular transforms[53]
No rect_tx_ext
Yes reference_buffer
No ref_adapt
No segment_zeromv
Yes simple_bwd_adapt
Yes smooth_hv
No striped_loop_restoration
Yes tempmv_signaling
No tmv
No tpl_mv
No tx64x64
No txk_sel
Yes txmg
No unpoison_partition_ctx
No var_refs
No var_tx_no_tx_mode
Yes warped_motion Warped Motion[31]
Yes wedge Wedge partitioned prediction[17]
No xiphrc Xiph Rate Controller[54]

References

  1. ^ Rick Merritt (EE Times), 30 June 2016: Video Compression Feels a Pinch
  2. ^ a b Sebastian Grüner (19 July 2016). "Der nächste Videocodec soll 25 Prozent besser sein als H.265" (in German). golem.de. Retrieved 1 March 2017.
  3. ^ Zimmerman, Steven (15 May 2017). "Google's Royalty-Free Answer to HEVC: A Look at AV1 and the Future of Video Codecs". XDA Developers. Archived from the original on 14 June 2017. Retrieved 10 June 2017.
  4. ^ Tsahi Levent-Levi (3 September 2015). "WebRTC Codec Wars: Rebooted". BlogGeek.me. Retrieved 1 March 2017. The beginning of the end of HEVC/H.265 video codec
  5. ^ a b c d e f g Timothy B. Terriberry (18 January 2017). "Progress in the Alliance for Open Media" (video). linux.conf.au. Retrieved 1 March 2017.
  6. ^ a b c d e f Timothy B. Terriberry (18 January 2017). "Progress in the Alliance for Open Media (slides)" (PDF). Retrieved 22 June 2017.
  7. ^ https://fosdem.org/2017/schedule/event/om_av1/
  8. ^ a b c Ozer, Jan (30 August 2017). "AV1: A status update". Retrieved 14 September 2017.
  9. ^ "Standards are Failing the Streaming Industry". 4 May 2017. Retrieved 20 May 2017.
  10. ^ a b "Integrating Thor tools into the emerging AV1 codec" (PDF). 13 September 2017. Retrieved 2 October 2017. Royalty-free video codecs: The deployment of recent compression technologies such as HEVC/H.265 may have been delayed or restricted due to their licensing terms. (…) What can Thor add to VP9/AV1? Since Thor aims for reasonable compression at only moderate complexity, we considered features of Thor that could increase the compression efficiency of VP9 and/or reduce the computational complexity.
  11. ^ Neil McAllister, 1 September 2015: Web giants gang up to take on MPEG LA, HEVC Advance with royalty-free streaming codec – Joining forces for cheap, fast 4K video
  12. ^ "examples/lossless_encoder.c - aom - Git at Google". aomedia.googlesource.com. Retrieved 29 October 2017.
  13. ^ a b c Ozer, Jan (3 June 2016). "What is AV1?". Streaming Media. Information Today, Inc. Archived from the original on 26 November 2016. Retrieved 26 November 2016. ... Once available, YouTube expects to transition to AV1 as quickly as possible, particularly for video configurations such as UHD, HDR, and high frame rate videos ... Based upon its experience with implementing VP9, YouTube estimates that they could start shipping AV1 streams within six months after the bitstream is finalized. ...
  14. ^ Romain Bouqueau (12 June 2016). "A view on VP9 and AV1 part 1: specifications". GPAC Project on Advanced Content. Retrieved 1 March 2017.
  15. ^ Jan Ozer, 26 May 2016: What Is VP9?
  16. ^ Debargha Mukherjee, Hui Su, Jim Bankoski, Alex Converse, Jingning Han, Zoe Liu, Yaowu Xu (Google Inc.), International Society for Optics and Photonics, ed., "An overview of new video coding tools under consideration for VP10 – the successor to VP9", SPIE Optical Engineering+ Applications 9599, doi:10.1117/12.2191104 
  17. ^ a b c Converse, Alex (16 November 2015). "New video coding techniques under consideration for VP10 – the successor to VP9". YouTube. Retrieved 3 December 2016.
  18. ^ a b "Constrained Directional Enhancement Filter". 28 March 2017. Retrieved 15 September 2017.
  19. ^ a b "Thor update". July 2017. Retrieved 2 October 2017.
  20. ^ a b c d "Decoding the Buzz over AV1 Codec". 9 June 2017. Retrieved 22 June 2017.
  21. ^ https://aomedia.googlesource.com/aom/+/master/LICENSE
  22. ^ Sebastian Grüner (9 June 2016). "Freie Videocodecs teilweise besser als H.265" (in German). golem.de. Retrieved 1 March 2017.
  23. ^ "Results of Elecard's latest benchmarks of AV1 compared to HEVC". 24 April 2017. Retrieved 14 June 2017. The most intriguing result obtained after analysis of the data lies in the fact that the developed codec AV1 is currently equal in its performance with HEVC. The given streams are encoded with AV1 update of 2017.01.31
  24. ^ a b "Bitmovin Supports AV1 Encoding for VoD and Live and Joins the Alliance for Open Media". 18 April 2017. Retrieved 20 May 2017.
  25. ^ Ozer, Jan. "HEVC: Rating the contenders" (PDF). Streaming Learning Center. Retrieved 22 May 2017.
  26. ^ D. Grois, T. Nguyen, and D. Marpe, "Coding efficiency comparison of AV1/VP9, H.265/MPEG-HEVC, and H.264/MPEG-AVC encoders", IEEE Picture Coding Symposium (PCS) 2016
  27. ^ a b "AV1 experiment flags". 29 September 2017. Retrieved 2 October 2017.
  28. ^ a b Egge, Nathan (13 September 2017). "Add the DAALA_TX experiment". Retrieved 2 October 2017.
  29. ^ a b Cho, Yushin (30 August 2017). "Delete daala_dist and cdef-dist experiments in configure". Retrieved 2 October 2017. Since those two experiments have been merged into the dist-8x8 experiment
  30. ^ Joshi, Urvang (1 June 2017). "Remove ALT_INTRA flag". Retrieved 19 September 2017.
  31. ^ a b c d e f g h "Analysis of the emerging AOMedia AV1 video coding format for OTT use-cases" (PDF). Retrieved 19 September 2017.
  32. ^ Mukherjee, Debargha (21 October 2017). "Remove CONFIG_CB4X4 config options". Retrieved 29 October 2017.
  33. ^ Su, Hui (23 October 2017). "Remove experimental flag of chroma_sub8x8". Retrieved 29 October 2017.
  34. ^ Davies, Thomas (19 September 2017). "Remove delta_q experimental flag". Retrieved 2 October 2017.
  35. ^ Egge, Nathan (25 May 2017). "This patch forces DAALA_EC on by default and removes the dkbool coder". Retrieved 14 September 2017.
  36. ^ Egge, Nathan (18 June 2017). "Remove the EC_ADAPT experimental flags". Retrieved 23 September 2017.
  37. ^ Terriberry, Timothy (25 August 2017). "Remove the EC_SMALLMUL experimental flag". Retrieved 15 September 2017.
  38. ^ a b "NETVC Hackathon Results IETF 98 (Chicago)". Retrieved 15 September 2017.
  39. ^ Alaiwan, Sebastien (2 October 2017). "Remove compile guards for CONFIG_EXT_INTER". Retrieved 29 October 2017. This experiment has been adopted
  40. ^ Alaiwan, Sebastien (16 October 2017). "Remove compile guards for CONFIG_EXT_REFS". Retrieved 29 October 2017. This experiment has been adopted
  41. ^ Zoe Liu; Debargha Mukherjee; Wei-Ting Lin; Paul Wilkins; Jingning Han; Yaowu Xu (4 July 2017). "Adaptive Multi-Reference Prediction Using A Symmetric Framework". Retrieved 29 October 2017.
  42. ^ Davies, Thomas (19 September 2017). "Remove filter_7bit experimental flag". Retrieved 29 October 2017.
  43. ^ Fuldseth, Arild (26 August 2017). "7-bit interpolation filters". Retrieved 29 October 2017. Purpose: Reduce dynamic range of interpolation filter coefficients from 8 bits to 7 bits. Inner product for 8-bit input data can be stored in a 16-bit signed integer.
  44. ^ Joshi, Urvang (1 June 2017). "Remove PALETTE flag". Retrieved 19 September 2017.
  45. ^ Joshi, Urvang (26 September 2017). "Remove rect_intra_pred experimental flag". Retrieved 2 October 2017.
  46. ^ Alaiwan, Sebastien (27 April 2017). "Merge ref-mv into codebase". Retrieved 23 September 2017.
  47. ^ Davies, Thomas (18 July 2017). "Remove the CONFIG_TILE_GROUPS experimental flag". Retrieved 19 September 2017.
  48. ^ Alaiwan, Sebastien (24 October 2017). "Remove compile guards for VAR_TX experiment". Retrieved 29 October 2017. This experiment has been adopted
  49. ^ Davies, Thomas (9 August 2017). "AOM_QM: enable by default". Retrieved 19 September 2017.
  50. ^ "Daala-TX" (PDF). 22 August 2017. Retrieved 26 September 2017. Replaces the existing AV1 TX with the lifting implementation from Daala. Daala TX is better in every way: ● Fewer multiplies ● Same shifts, quantizers for all transform sizes and depths ● Smaller intermediaries ● Low-bitdepth transforms wide enough for high-bitdepth ● Less hardware area ● Inherently lossless
  51. ^ Egge, Nathan (27 October 2017). "Daala Transforms in AV1".
  52. ^ Chen, Yue (13 October 2017). "Renamings for OBMC experiment". Retrieved 19 September 2017.
  53. ^ Mukherjee, Debargha (1 July 2016). "Rectangular transforms 4x8 & 8x4". Retrieved 14 September 2017.
  54. ^ Pehlivanov, Rostislav (15 February 2017). "Implement a new rate control system". Retrieved 19 September 2017. This commit implements a new rate control system which was ported from Daala's rate control system (which was based off of Theora's rate control system) (…) Bitrate targeting works much better than the current rate control system's targeting and will actually closely match the rate specified by the user without the current rate control system's bursty behaviour.