JCTVC-G Notes d9
Joint Collaborative Team on Video Coding (JCT-VC)
of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11
Document: JCTVC-G_Notes_d97
7th Meeting: Geneva, CH, 21-30 Nov 2011
Title: Meeting report of the seventh meeting of the Joint Collaborative Team on Video
Coding (JCT-VC), Geneva, CH, 21-30 Nov. 2011
Status: Report Document from Chairs of JCT-VC
Purpose: Report
Author(s) or Contact(s):
Gary Sullivan
Microsoft Corp.
1 Microsoft Way
Redmond, WA 98052 USA
Tel: +1 425 703 5308
Email: garysull@microsoft.com
Jens-Rainer Ohm
Institute of Communications Engineering
RWTH Aachen University
Melatener Straße 23
D-52074 Aachen
Tel: +49 241 80 27671
Email: ohm@ient.rwth-aachen.de
Source: Chairs
_____________________________
Summary
[qq J. Boyce to coordinate BoG on AHG21]
The Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T WP3/16 and ISO/IEC
JTC 1/SC 29/WG 11 held its seventh meeting during 21-30 Nov 2011 at the ITU-T premises in Geneva,
CH. During the first two days of the meeting, rooms at the WMO headquarters were also used. The JCT-
VC meeting was held under the chairmanship of Dr. Gary Sullivan (Microsoft/USA) and Dr. Jens-Rainer
Ohm (RWTH Aachen/Germany). For rapid access to particular topics in this report, a subject
categorization is found in section 1.13 of this document.
The JCT-VC meeting sessions began at approximately 1100 hours on Monday 21 Nov 2011. Meeting
sessions were held on all days (including weekend days) until the meeting was closed at approximately
XXXX hours on Wednesday 30 Nov. Approximately 284 people attended the JCT-VC meeting,
and approximately 1000 input documents were discussed. The meeting took place in a co-located
fashion with a meeting of ITU-T SG16 – one of the two parent bodies of the JCT-VC. The subject matter
of the JCT-VC meeting activities consisted of work on the new next-generation video coding
standardization project now referred to as High Efficiency Video Coding (HEVC).
The primary goals of the meeting were to review the work that was performed in the interim period since
the sixth JCT-VC meeting in implementing the 4th HEVC Test Model (HM4) and editing the 4th HEVC
specification Working Draft (WD4), review the results from interim Core Experiments (CE), review
technical input documents, further develop Working Draft and HEVC Test Model (HM), and plan a new
set of Core Experiments (CEs) for further investigation of proposed technology.
The JCT-VC produced three particularly important output documents from the meeting: the HEVC Test
Model 5 (HM5), the HEVC specification Working Draft 5 (WD5), and a document specifying common
conditions and software reference configurations for HEVC coding experiments. Moreover, XX
documents describing the planning of future CEs were drafted.
For the organization and planning of its future work, the JCT-VC established XX "Ad Hoc Groups"
(AHGs) to progress the work on particular subject areas. The next four JCT-VC meetings are planned for
1–10 February 2012 under WG 11 auspices in San José, USA, 30 April 1–89 May 2012 under ITU-T
1 Administrative topics
1.1 Organization
The ITU-T/ISO/IEC Joint Collaborative Team on Video Coding (JCT-VC) is a group of video coding
experts from the ITU-T Study Group 16 Visual Coding Experts Group (VCEG) and the ISO/IEC JTC 1/
SC 29/ WG 11 Moving Picture Experts Group (MPEG). The parent bodies of the JCT-VC are ITU-T
WP3/16 and ISO/IEC JTC 1/SC 29/WG 11.
The Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T WP3/16 and ISO/IEC JTC 1/ SC 29/
WG 11 held its seventh meeting during 21-30 Nov 2011 at the ITU-T premises in Geneva, CH. The JCT-
VC meeting was held under the chairmanship of Dr. Gary Sullivan (Microsoft/USA) and Dr. Jens-Rainer
Ohm (RWTH Aachen/Germany).
Information regarding logistics arrangements for the meeting had been provided at
http://wftp3.itu.int/av-arch/jctvc-site/2011_11_G_Geneva/JCTVC-G_Logistics.doc.
1.4.1 General
The documents of the JCT-VC meeting are listed in Annex A of this report. The documents can be found
at http://phenix.it-sudparis.eu/jct/.
Registration timestamps, initial upload timestamps, and final upload timestamps are listed in Annex A of
this report.
Document registration and upload times and dates listed in Annex A and in headings for documents in
this report are in Paris/Geneva time. Dates mentioned for purposes of describing events at the meeting
(rather than as contribution registration and upload times) follow the local time at the meeting facility.
Decisions made by the group that affect the normative content of the draft standard are identified in this
report by prefixing the description of the decision with the string "Decision:". Decisions that affect the
reference software but have no normative effect on the text are marked by the string "Decision (SW):".
This meeting report is based primarily on notes taken by the chairs and projected for real-time review by
the participants during the meeting discussions. The preliminary notes were also circulated publicly by ftp
during the meeting on a daily basis. Considering the high workload of this meeting and the large number
of contributions, it should be understood by the reader that 1) some notes may appear in abbreviated form,
2) summaries of the content of contributions are often based on abstracts provided by contributing
proponents without an intent to imply endorsement of the views expressed therein, and 3) the depth of
discussion of the content of the various contributions in this report is not uniform. Generally, the report is
written to include as much discussion of the contributions and discussions as is feasible in the interest of
aiding study, although this approach may not result in the most polished output report.
1.5 Attendance
The list of participants in the JCT-VC meeting can be found in Annex B of this report.
The meeting was open to those qualified to participate either in ITU-T WP3/16 or ISO/IEC JTC 1/ SC 29/
WG 11 (including experts who had been personally invited by the Chairs as permitted by ITU-T or
ISO/IEC policies).
1.6 Agenda
The agenda for the meeting was as follows:
IPR policy reminder and declarations
Contribution document allocation
Reports of ad hoc group activities
Reports of Core Experiment activities
Review of results of previous meeting
Consideration of contributions and communications on HEVC project guidance
Consideration of HEVC technology proposal contributions
Consideration of information contributions
Coordination activities
Future planning: Determination of next steps, discussion of working methods, communication
practices, establishment of coordinated experiments, establishment of AHGs, meeting planning,
refinement of expected standardization timeline, other planning issues
Other business as appropriate for consideration
1.10 Terminology
Some terminology used in this report is explained below:
AHG: Ad hoc group.
AI: All-intra.
AIF: Adaptive interpolation filtering.
AIS: Adaptive intra smoothing.
ALF: Adaptive loop filter.
AMP: Asymmetric motion partitioning.
APS: Adaptation parameter set.
AMVR: Adaptive motion vector resolution.
AVC: Advanced video coding – the video coding standard formally published as ITU-T
Recommendation H.264 and ISO/IEC 14496-10.
BA: Block adaptive.
BD: Bjøntegaard-delta – a method for measuring percentage bit rate savings at equal PSNR or
decibels of PSNR benefit at equal bit rate (e.g., as described in document VCEG-M33 of April
2001).
BoG: Break-out group.
BR: Bit rate.
BUDI: Bidirectional UDI.
CABAC: Context-adaptive binary arithmetic coding.
CBF: Coded block flag(s).
CE: Core experiment – a coordinated experiment conducted after the 3rd or 4th meeting.
DCT: Discrete cosine transform (sometimes used loosely to refer to other transforms with
conceptually similar characteristics).
DCTIF: DCT-derived interpolation filter.
DIF: Directional interpolation filter.
DF: Deblocking filter.
DT: Decoding time.
EPB: Emulation prevention byte (as in the emulation_prevention_byte syntax element).
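As an illustration of the BD entry in the list above, the following is a minimal sketch of a BD-rate calculation, assuming four rate/PSNR points per curve and a cubic fit of log10(bit rate) as a function of PSNR. The function names and the sample data in the usage note are hypothetical, and the sketch is not taken from VCEG-M33 or from any JCT-VC software.

```python
import math

def _lagrange(xs, ys, x):
    """Evaluate the polynomial through the points (xs, ys) at x (cubic for 4 points)."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def _simpson(f, a, b):
    """Simpson's rule on [a, b]; exact for polynomials up to degree 3."""
    return (b - a) / 6.0 * (f(a) + 4.0 * f((a + b) / 2.0) + f(b))

def bd_rate(ref, test):
    """ref, test: lists of four (bitrate, psnr) points. Returns the BD-rate in percent."""
    psnr_r = [p for _, p in ref]
    psnr_t = [p for _, p in test]
    log_r = [math.log10(r) for r, _ in ref]
    log_t = [math.log10(r) for r, _ in test]
    # Integrate only over the PSNR interval covered by both curves.
    lo = max(min(psnr_r), min(psnr_t))
    hi = min(max(psnr_r), max(psnr_t))
    int_r = _simpson(lambda x: _lagrange(psnr_r, log_r, x), lo, hi)
    int_t = _simpson(lambda x: _lagrange(psnr_t, log_t, x), lo, hi)
    avg_diff = (int_t - int_r) / (hi - lo)   # mean difference in log10(rate)
    return (10.0 ** avg_diff - 1.0) * 100.0  # negative value = bit rate saving
```

For example, a test curve that needs 10% fewer bits than the reference at every PSNR point yields a BD-rate of about -10%.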
2 AHG reports
The activities of ad hoc groups that had been established at the prior meeting are discussed in this section.
JCTVC-G001 JCT-VC AHG Report: Project Management (AHG 1) [G. J. Sullivan, J.-R.
Ohm (AHG chairs)]
This document reports on the work of the JCT-VC ad hoc group on Project Management.
The work of the JCT-VC overall has proceeded well in the interim period. A large amount of discussion
was carried out on the group email reflector. All report documents from the preceding meeting have been
made available at the ITU-based JCT-VC site (http://ftp3.itu.int/av-arch/jctvc-site/2011_07_F_Torino) or
the new "Phenix" site (http://phenix.it-sudparis.eu/jct/), particularly including the following:
The meeting report (JCTVC-F800)
The HM 4 encoder description (JCTVC-F802)
The HEVC Working Draft (JCTVC-F803)
Common HM test conditions and software reference configurations (JCTVC-F900)
Page: 12 Date Saved: 2011-12-04
Finalized core experiment descriptions (JCTVC-F901 through JCTVC-F913)
Additional important current JCT-VC documents are noted as follows:
HEVC software guidelines (JCTVC-F688)
HEVC Reference Software Manual (JCTVC-F634)
The various ad hoc groups and tool experiments have made progress, and various reports from those
activities have been submitted.
Since the approval of software copyright header language at the March 2011 parent-body meetings, this
topic seems to be resolved.
No major news has been received regarding future meeting plans, etc.
No particular problems were noted with the produced outputs in this discussion.
JCTVC-G002 JCT-VC AHG report: HEVC Draft and Test Model editing (AHG 2) [B.
Bross, K. McCann, W.-J. Han, J.-R. Ohm, S. Sekiguchi, G. J. Sullivan, T.
Wiegand]
One draft of JCTVC-F802 and six drafts of JCTVC-F803 were published by the Editing AHG between
the 6th JCT-VC meeting in Torino (14-22 July, 2011) and the 7th Meeting in Geneva (21-30 November,
2011). JCTVC-F802 still needs significant further improvement, whilst the final draft of JCTVC-F803 is
reasonably complete.
The main changes in JCTVC-F803, relative to the previous JCTVC-E603, were listed. Some specific
open issues remaining for JCTVC-F803 were also noted.
NSQT integration was particularly difficult. A mismatch was reported between the software and the text
submitted by the proponents. It was noted that there are relevant input contributions to address this.
In the discussion, the importance of confirming the correctness of text when performing cross-checking of
proposals was emphasized.
CAVLC proposals not yet integrated into WD.
Tiles, wavefronts, and weighted prediction had not yet been integrated, not necessarily due to problems
with those aspects, but rather due to the scheduling of other activities that preceded it in integration order.
The work was prioritized to first integrate aspects likely to affect coding efficiency behaviour.
The general list of key HEVC issues that need to be addressed was identified to be:
Entropy coding architecture (see AHG9)
Transform and dynamic range (see AHG5 and AHG7)
Picture buffering and high-level syntax (see AHG21)
Picture resolution adaptation (see AHG18)
Non-4:2:0 colour formats (see AHG20)
10 bit vs. 8 bit decoding capability
Simplification of MV coding
In-loop filtering clean-up
Profiles and Levels
Parallel processing clean-up
NSQT: Deviation between text and software, more difficult to integrate than anticipated
CAVLC proposals not integrated yet
Tools from HM4.1 are not yet included (not due to technical problems):
- Tiles and wavefront
- Weighted prediction
Encoder description should become mandatory
Cross-checkers of software should confirm that the WD text matches software (NSQT case). One issue
related to AMP (see CE9), where encoder behaviour was different than expected.
4.0 developed as planned, but 4.1 (particularly tiles and wavefront) more difficult to implement than
expected. (also loses approx. 0.5% due to overhead)
Concerns about quality of some delivered submissions
Any bug reports should be filed by ticket, not just verbally. Experts are encouraged/urged not only to
report bugs, but also to contribute to fixing them.
JCTVC-G004 JCT-VC AHG report: Picture Partitioning and LCU scan order (AHG4) [R.
Sjöberg (AHG chair), Y. Chen, F. Henry, M. Horowitz, K. Kazui, A. Segall
(vice chairs)]
Main focus: Combination wavefront and tiles. No conclusion in reflector discussions. Could be a
profiling issue which combinations are allowed
Investigation on slice overhead (in HM 4.1) unveiled that nothing changed.
JCTVC-G005 JCT-VC AHG Report: Spatial Transforms (AHG 5) [P. Topiwala (AHG
Chair), M. Budagavi, R. Cohen, R. Joshi (vice chairs)]
The report should not list a “membership” of the AHG
Related CE7 / CE10
Problem with precision at low QP? To be further clarified
JCTVC-G006 JCT-VC AHG report: In-loop and post-processing filtering (AHG 6) [T.
Yamakage, K. Chono, Y. J. Chiu, I. S. Chong, M. Narroschke]
Recommends to study line buffer reduction jointly for all loop filters in a BoG
Recommends to discuss some CE related contributions (G211, G212, G656 and G691) in the context of
CE8 (also 499?)
Recommends to work on clean up of software and text
JCTVC-G007 JCT-VC AHG report: Transform dynamic range (AHG 7) [A. Segall
(Sharp), E. Alshina (Samsung)]
Results of email discussion:
Dynamic range restriction should be defined for the dequantized coefficients.
Dynamic range restriction should be defined after first inverse transform
Dynamic range restriction should not be defined after second inverse transform, if input to second
inverse transform is restricted to 16-bits
Dynamic range following the first transform can exceed 16-bit in the worst case
Clipping is preferred to restrict dynamic range for the dequantized coefficients and after first
inverse transform.
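The preference for clipping can be sketched as follows. This is only an illustration of saturating a value to a signed range; the 16-bit default follows the figure mentioned in the list above, while the function name is hypothetical and not taken from the WD text or the HM software.

```python
def clip_to_bits(value, bits=16):
    """Saturate an integer to the signed range representable in `bits` bits."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))
```

Under this sketch, a dequantized coefficient of 40000 would be limited to 32767 before entering the first inverse transform stage, and the same operation would be applied again after that stage.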
Recommendations:
JCTVC-G009 JCT-VC AHG report: Entropy Coding Architecture (AHG 9) [K. McCann
(chair), A. Fuldseth, D. Marpe, A. Segall, K. Sugimoto, V. Sze, W. Wan, X.
Wang (vice chairs)]
Switchable (2 operating points) vs. scalable (multiple operating points).
Recommendations:
HEVC should include only a single entropy coding technology with a single operating point
unless adding a second option provides a significantly different performance/complexity trade-
off which substantially facilitates the use of HEVC in a class of applications for which it would
not otherwise provide an appropriate solution
JCT-VC should analyse input contributions relating to entropy coding with the aim of making a
decision on the HEVC entropy coding architecture during the 7th JCT-VC meeting
Complexity (both hardware and software) difficult to quantify.
JCTVC-G011 JCT-VC AHG Report: Video test material selection (AHG 11) [T. Suzuki]
Under-represented: High bit depth, 4:4:4.
Sufficient variety of noise conditions?
Compressed material in class E
No new test material
JCTVC-G013 JCT-VC AHG report: Screen Content Coding (AHG 13) [O. Au, J. Xu, H.
Yu (AHG chairs)]
Results with new screen content sequences with transform skipping (various input docs)
Is this AHG still needed?
JCTVC-G015 JCT-VC AHG report: High-level syntax (AHG 15) [Y. -K. Wang (chair), J.
Boyce, Y. Chen, M. M. Hannuksela, K. Kazui, T. Schierl, R. Sjöberg, T. K.
Tan, W. Wan (vice chairs)]
Include abstract & recommendations (no discussion)
JCTVC-G016 JCT-VC AHG report: Padding process (AHG 16) [V. Wahadaniah, K.
Chono, Y. Lin]
4 input docs (related to various aspects of padding in context of intra prediction)
JCTVC-G017 JCT-VC AHG Report: Scalable coding investigation (AHG 17) [J. Boyce, J.
Kang, K. Minoo, W. Wan, Y.-K. Wang]
Include abstract & recommendations (no discussion)
JCTVC-G018 JCT-VC AHG report: Resolution adaption (AHG 18) [T. Davies (AHG
chair), P. Topiwala, P. Wu (Vice-chairs)]
qq Potential synergy with scalability.
qq Usage for computational load management
Relation with scalability? Possibly
How to measure? PSNR is not useful across resolutions
Resolution adaptation is not only about better subjective quality, but also complexity adjustment.
JCTVC-G019 JCT-VC AHG Report: Transform Skipping (AHG19) [M. Mrak (AHG
chair), J. Sole, I.-K. Kim, J. Xu, H. Yu (vice chairs)]
The recommendations of the AHG are
To study feasibility and effectiveness of integrated transform skipping - related proposals
JCTVC-G020 JCT-VC AHG report: Chroma format support (AHG 20) [David Flynn,
Dzung Hoang, Ken McCann]
Not much activity. Input doc (G967, G862) on the topic - discuss in breakout (D. Flynn).
JCTVC-G021 JCT-VC AHG report: Reference picture buffering and list construction
(AHG21) [D. Flynn, R. Sjöberg (AHG chairs), Y. Chen, T.K. Tan, W. Wan,
Y.-K. Wang (vice chairs)]
Open issues: long-term pictures, filling of reference picture lists, CRA issue.
qq Question re PPS versus APS usage – intent is to avoid sending the RPS in the slice header – wasn't
sure of what APS would ultimately be.
qq Question re detection of lost pictures
BoG [YKW & RS] Later, J. Boyce.
WD text and software were developed and agreed (via reflector). Only loss of complete pictures is
supported.
Addressing of long-term pictures, construction of list and pictures following a CRA (can they reference
pictures before the CRA) are open issues.
Discussion: should the RPL information be in PPS? Or rather APS?
Lambda was adjusted – this could be of concern when used in CEs.
Source code not fully aligned with WD text
15 related input contributions (most build on top of the AHG WD text)
AHG recommendations:
JCT-VC to review the candidate WD text on picture buffer management and consider it for adoption
To use the HM-4.0-dev-ahg21-picbuffer source code for comparisons in picture buffer management
proposals
JCT-VC to review all picture buffer management and list construction related input document.
BO: R. Sjoberg, YK Wang Later, J. Boyce
JCTVC-G022 JCT-VC AHG report: Lossless Coding (AHG22) [W. Gao (chair), K. Chono,
J. Xu, M. Zhou (vice chairs)]
4 contributions on lossless coding (092, 093, 268, 664) – CE? Harmonization?
Locally lossless mode should also be considered (LCU level)
4.1.1 Summary
JCTVC-G031 CE1: Summary report of core experiment on entropy coding [R. Joshi, E.
Alshina, H. Sasai, H. Kirchhoffer, J. Lainema (CE coordinators)]
Subtest A
a) Delayed probability update (576, 349)
Proponent | Description | BD-rate (Y) | BD-rate (U) | BD-rate (V)
Qualcomm JCTVC-G576 | Delay 1 bin (all syntax elements) | 0.1% | 0.1% | 0.2%
Qualcomm JCTVC-G576 | Delay 2 bins (all syntax elements) | 0.6% | 0.4% | 0.5%
Qualcomm JCTVC-G576 | Delay 3 bins (all syntax elements) | 1.0% | 0.4% | 0.7%
Panasonic JCTVC-G349 | Delay all coefficient coding parameters until end of block | 0.2% | 0.1% | 0.1%
Panasonic JCTVC-G349 | Delay all coefficient coding parameters except for "significant_coeff_flag" parameters until end of block | 0.1% | -0.1% | -0.1%
Panasonic JCTVC-G349 | Delay "last_significant_coeff_x" and "last_significant_coeff_y" parameters until end of block | 0.0% | -0.1% | -0.1%
Panasonic JCTVC-G349 | Delay "coeff_abs_level_greater1_flag" and "coeff_abs_level_greater2_flag" parameters until end of block | 0.1% | 0.0% | 0.0%
Panasonic JCTVC-G349 | Delay "significant_coeff_flag" parameters until end of block | 0.2% | 0.1% | 0.2%
Comment by one expert: It should be observed if the delayed update affects the probability
model and estimation (may be implementation specific and not be critical for the 1 bin delay
case).
Another comment: In hardware, delaying the update may not help to increase the throughput.
Particularly, delaying more than one bin produces unacceptable losses.
Revisited after presentation of other contributions that target increase of CABAC throughput.
For the one bin delay case, some doubt is expressed by other experts that it would help
increase the throughput. G349 (updating probabilities of some syntax elements at the end of the
TC block) is an even less systematic approach and decreases the regularity. No action.
b) Line buffer reduction (200, 769)
Proponent | Description | BD-rate (Y) | BD-rate (U) | BD-rate (V)
MediaTek JCTVC-G200 | Split LCU | 0.0% | 0.0% | -0.1%
MediaTek JCTVC-G200 | Skip LCU | 0.0% | 0.0% | 0.0%
MediaTek JCTVC-G200 | Split&Skip LCU | 0.0% | 0.0% | -0.1%
Samsung JCTVC-G769 | Split CU | 0.1% | 0.0% | 0.0%
Note: G200 uses 3 additional context models (6 instead of 3). In original contribution F060
Side activity to suggest common solution (proponents of G200, G769 and V. Sze and T.Nguyen)
Subtest B (764)
Summary of tests results (single parameter probability update model):
Class | All Intra HE (Y, U, V) | Random Access HE (Y, U, V) | Low delay B HE (Y, U, V)
Class A | -0.8%, 0.0%, 0.0% | -0.8%, 0.8%, 1.5% |
Class B | -0.6%, -0.7%, -0.6% | -0.5%, -0.2%, 0.0% | -0.4%, 0.2%, 0.0%
Class C | -0.6%, -0.9%, -0.8% | -0.4%, -0.6%, -0.3% | -0.4%, -0.3%, -0.5%
Class D | -0.7%, -1.5%, -1.5% | -0.4%, -0.6%, -1.1% | -0.4%, -0.4%, -0.6%
Overall | -0.7%, -0.8%, -0.8% | -0.6%, -0.2%, 0.0% | -0.3%, -0.4%, -0.4%
Class B | -0.9%, -1.1%, -1.2% | -1.0%, -0.7%, -0.7% | -0.8%, -0.8%, -1.1%
Class C | -0.9%, -1.2%, -1.2% | -0.9%, -0.9%, -0.9% | -0.8%, -0.7%, -1.2%
Class D | -0.9%, -1.7%, -1.7% | -0.9%, -1.1%, -1.3% | -0.8%, -1.0%, -1.0%
Note: G326, G413, G547 propose similar approaches but most likely in better implementation.
Subset C
Test results summary (BD-rate difference is averaged across AI, RA, and LD, class F is not included):
Test | Proponent | Description | HE BD-rate (Y) | HE BD-rate (U+V)/2 | LC BD-rate (Y) | LC BD-rate (U+V)/2
 | | BAC, LC configuration | | | -5.99% | +4.47%
1 | HHI (JCTVC-G633) | BAC, LCmod, 8-bit init | | | -0.45% | +8.78%
2 | HHI (JCTVC-G633) | V2V, LCmod, Multi-bin, 8-bit init, TBC | | | -0.45% | +7.80%
3 | HHI (JCTVC-G633) | V2V, LCmod, Multi-bin, Low delay, 8-bit init, TBC | | | -0.14% | +8.13%
4 | HHI (JCTVC-G633) | V2V, 8-bit init, TBC | +0.14% | -0.74% | -5.85% | +3.78%
5 | HHI (JCTVC-G633) | V2V, LowDelay, 8-bit init, TBC | 0.32% | -0.58% | |
6 | HHI (JCTVC-G633) | BAC, 8-bit init | -0.24% | -0.18% | -6.20% | 4.45%
7 | HHI (JCTVC-G633) | BAC, 8-bit init, Alt. PMU, TBC | -0.79% | -0.58% | |
8 | Mitsubishi (JCTVC-G458) | V2F, 8-bit init | | | -3.27% | 7.44%
9 | Mitsubishi (JCTVC-G458) | V2F, LowDelay, 8-bit init | | | -3.27% | 7.44%
10 | Mitsubishi (JCTVC-G458) | V2F, MC mod., LowDelay, 8-bit init | | | -3.04% | 9.48%
11 | Samsung/HHI (JCTVC-G771) | V2V, 8-bit init, Alt. PMU, TBC | | | |
12 | Cisco (JCTVC-G210) | BAC, RDOQ off, RDO PMU off | | | -5.20% | -0.16%
13 | Cisco | BAC, RDOQ off | | | 5.99% | 4.63%
Test V2F V2V BAC LC MC PMU Multi Low 8-bit Alt. RDOQ RDO TBC Config.
mod. mod. off bin delay Init PMU off PMU off tested
1 x x n/a x LC
2 x x x x x LC
3 x x x x x x LC
4 x x x HE, LC
5 x x x x HE
6 x n/a x HE, LC
7 x n/a x x x HE
8 x x LC
9 x x x LC
10 x x x x LC
11 x x x x HE
12 x x x LC
13 x x x LC
14 x x n/a x x LC
15 x x x x x x LC
16 x x x x x LC
17 x x HE, LC
Conclusions of subtest C:
- No further consideration on V2V and V2F (see below)
- One entropy coder? (looking at tests 12 and 13): Whereas 12 is encoder only and loses only
slightly compared to the test 1 case, 13 also changes the decoder.
Two experts mention that runtime is not the only issue; throughput should also be considered, which may be a
problem with CABAC (G569 addresses this issue by having “not two but 1.2” entropy coders).
4.1.2 Contributions
Subtest A
JCTVC-G200 CE1.A.3: Reducing line buffers for CABAC [T.-D. Chuang, C.-Y. Chen, Y.-
W. Huang, S. Lei (MediaTek)]
This contribution reports results of CE1.A.3. In HM-4.0, split_coding_unit_flag and skip_flag are the
only two CABAC syntax elements that still have dependency on upper LCUs and need line buffers. In
this contribution, it is proposed to use the depth information of the current block and the left block for the
context formation of split_coding_unit_flag and to use the data of the left block for the context formation
of skip_flag when the upper block belongs to the upper LCU. In this way, all CABAC line buffers can be
removed. It is reported that the proposed context modeling causes less than 0.08% bit rate increase.
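The dependency on the upper LCU can be illustrated with a minimal sketch. It assumes the conventional derivation in which the context increment counts the left and above neighbours that are deeper (for split_coding_unit_flag) or skipped (for skip_flag), and it simply drops the above neighbour when that neighbour lies in the upper LCU. This is a hypothetical simplification for illustration only: the exact rule of JCTVC-G200, which uses additional context models, is not reproduced here, and all function and parameter names are assumptions.

```python
def split_flag_ctx(cur_depth, left_depth, above_depth, above_in_upper_lcu):
    """Context increment (0..2) for split_coding_unit_flag.

    A depth of None means the neighbour is unavailable. The above neighbour
    is ignored when it belongs to the upper LCU, so no line buffer is needed.
    """
    ctx = 0
    if left_depth is not None and left_depth > cur_depth:
        ctx += 1
    if (above_depth is not None and not above_in_upper_lcu
            and above_depth > cur_depth):
        ctx += 1
    return ctx

def skip_flag_ctx(left_skipped, above_skipped, above_in_upper_lcu):
    """Context increment (0..2) for skip_flag, using only the left block
    when the above block belongs to the upper LCU."""
    ctx = int(bool(left_skipped))
    if not above_in_upper_lcu:
        ctx += int(bool(above_skipped))
    return ctx
```

With this rule, only blocks on the top row of an LCU change their context selection; blocks elsewhere in the LCU read the above neighbour from data already held for the current LCU.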
JCTVC-G349 CE1: SubsetA: Parallel context processing for coefficient coding using block-
based context updates [H. Sasai, T. Nishi (Panasonic)]
This contribution is a test report for JCTVC-E226, listed in CE1 Subset A. The proposed technique aims to
improve the throughput of the CABAC entropy coder. Context updates make it difficult to increase
throughput because of their serial dependencies. The proposed modifications have been implemented in
HMv4, and their coding efficiency was evaluated separately for the coefficient coding parameters
"last_significant_coeff_x", "last_significant_coeff_y", "significant_coeff_flag",
"coeff_abs_level_greater1_flag" and "coeff_abs_level_greater2_flag". The parallelization capability of the
proposal comes at a cost of less than 0.1% performance loss.
JCTVC-G576 CE1: Delayed state update for CABAC [R. Joshi, J. Sole, M. Karczewicz
(Qualcomm)]
Context update is a known bottleneck in hardware implementation of a CABAC decoder. This is because
the state of the context is updated based on the decoded bin. Previous efforts to mitigate this problem
have included delaying the context update by one during transform coefficient coding and delaying the
update till the end of a TU. In this proposal we extend the state update delay to bins from all syntax
elements. Results are presented for update delays of 1, 2 and 3 bins. For update delays of 1, 2, and 3 bins,
average BD-rates of 0.1%, 0.6% and 0.9%, respectively, are reported for HE configurations.
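The effect of a delayed state update can be sketched with a toy model. The sketch below replaces the CABAC state machine with a floating-point exponential estimator, which is an assumption made for readability; the function names, the adaptation rate and the bin sequence in the usage note are all hypothetical and are not taken from JCTVC-G576.

```python
import math
from collections import deque

def run_estimator(bins, delay=0, alpha=1.0 / 32, p_init=0.5):
    """Return the probability of a '1' that is used to code each bin.

    The update caused by a bin only becomes visible `delay` bins later
    than it would with an immediate update.
    """
    p = p_init
    pending = deque()
    used = []
    for b in bins:
        used.append(p)           # probability seen by the coding engine
        pending.append(b)
        if len(pending) > delay:
            old = pending.popleft()
            p += alpha * (old - p)
    return used

def ideal_bits(bins, delay=0):
    """Ideal code length in bits for the bin string under the delayed model."""
    used = run_estimator(bins, delay)
    return sum(-math.log2(p if b else 1.0 - p) for b, p in zip(bins, used))
```

For a skewed bin string such as fifty consecutive ones, ideal_bits with delay=1 is slightly larger than with delay=0, which mirrors the small losses reported above for short delays.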
JCTVC-G472 CE1.A.4: Crosscheck for Samsung's line buffer removal for CU split flag
context model in JCTVC-G769 [T.-D. Chuang, Y.-W. Huang (MediaTek)]
JCTVC-G763 CE1: Table-based bit estimation for CABAC [F. Bossen (DOCOMO
Innovations)]
In the RDO process of the HM, a block may be encoded multiple times using different modes before a
best mode is selected based on a rate-distortion criterion. When using CABAC, each of these encodings
use the CABAC engine itself to count a number of bits. In this experiment the bit counting procedure is
simplified wherein bit counts are estimated using tables. This simplification does not impact rate-
distortion performance (all recorded luma BD-rate averages are 0.0%) while reducing the encoding time
by 1 to 5%.
Non-normative improvement – Decision (SW): Adopt.
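The idea of estimating bits from tables instead of running the arithmetic coding engine can be sketched as follows. The sketch assumes the 64 CABAC probability states with geometrically spaced LPS probabilities from 0.5 down to 0.01875, and a 15-bit fixed-point precision chosen here for illustration. The table contents and names are hypothetical and are not the tables of JCTVC-G763 or of the HM.

```python
import math

NUM_STATES = 64
FRAC_BITS = 15  # fixed-point precision of the table entries (assumption)

# LPS probability per state: geometric spacing from 0.5 down to 0.01875.
ALPHA = (0.01875 / 0.5) ** (1.0 / 63)
P_LPS = [0.5 * ALPHA ** s for s in range(NUM_STATES)]

# Estimated cost of coding an LPS or an MPS in each state, in 1/2**15 bit units.
BITS_LPS = [round(-math.log2(p) * (1 << FRAC_BITS)) for p in P_LPS]
BITS_MPS = [round(-math.log2(1.0 - p) * (1 << FRAC_BITS)) for p in P_LPS]

def estimate_bits(bins):
    """bins: iterable of (state, is_mps) pairs. Returns the estimated bit count."""
    total = 0
    for state, is_mps in bins:
        total += BITS_MPS[state] if is_mps else BITS_LPS[state]
    return total / (1 << FRAC_BITS)
```

During mode decision an encoder following this approach would sum table entries for the candidate's bins and leave the actual arithmetic coder untouched until the final mode is written.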
Subtest B
Subtest C
JCTVC-G210 CE1: Subtest 12, Entropy coding comparisons with simplified RDO [T.
Davies (Cisco)]
Various coding conditions are simulated with both CAVLC and CABAC. Under low-complexity (LC)
common conditions it is reported that CABAC provides gain between 5.4% and 6.2% (6.8% including
class F). The performance of the entropy coders is also investigated with two encoder restrictions: no
RDOQ, and no adaption during RDO mode search. These assumptions together reduce the gap between
CABAC and CAVLC by 0.775% (0.875% including class F) averaged across LC settings. When RDOQ
is off and RDO adaption is off the gap is reported to be between 5.0% and 5.5% for LC settings (4.9%
and 5.9% including class F).
JCTVC-G633 CE1: Report of test results related to PIPE-based Unified Entropy Coding
[Heiner Kirchhoffer, Benjamin Bross, Anastasia Henkel, Detlev Marpe,
Tung Nguyen, Matthias Preiß, Mischa Siekmann, Jan Stegemann, Thomas
Wiegand (Fraunhofer HHI)]
This contribution reports results for tests related to the unified PIPE-based entropy coding using v2v
codes in CE1. Various combinations of the tools PIPE/v2v, BAC, LC modeling, low delay, 8-bit init, and
table-based bit counting were analyzed to identify the influence of the tools on BD rate and codec
runtime. Furthermore, hardware implementation aspects are analyzed for PIPE/v2v with low delay
constraint.
New concept of “chunk interleaving” is presented (not in any proposal before) which solves some of the
multiplexing and low delay issues (for pre-defined number of parallel encoders/decoders).
No analysis about concrete statistics of bins that can be processed in parallel. Main limitation may come
from the parser.
JCTVC-G753 CE1: Crosscheck of CE1 subtest C.6 (JCTVC-G633) proposed by HHI [X.
Zheng (HiSilicon)] [late]
JCTVC-G932 CE1: Cross-check for bug-fix version of test 12 in subset C from HHI
(JCTVC-G633) by Samsung [E. Alshina, J.H. Park (Samsung)] [late]
JCTVC-G300 CE1: Cross-check report for CE1 Subset C Test 7 [H. Sasai, T. Nishi
(Panasonic)]
JCTVC-G641 CE1: Cross-check of Subtest C (test 1 and test 11) and Subtest B - multi-
parameter probability update for CABAC (test 11) [Jinwen Zan, Dake He]
[late]
JCTVC-G771 CE1 (subset C, test 11): Multi-parameter probability up-date for PIPE [A.
Alshin, E. Alshina, J.H. Park (Samsung), H. Kirchhoffer (HHI)]
In this contribution, a multi-parameter probability update technique is proposed for the relatively new
entropy coding scheme PIPE. It should be noted that the probability update for PIPE coincides with the CABAC
probability update. The proposed method allows more precise probability estimation for the current bin, which
means that a more accurate distribution of bins among the different bin encoders becomes possible. In terms of
coding efficiency, the presented probability update technique on top of PIPE shows an average BD-rate
gain of about 0.4% in the HE configuration.
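One common form of a multi-parameter probability update averages two estimators with different adaptation speeds. The sketch below assumes that form with integer arithmetic; the shift values, the 15-bit scale and the function name are hypothetical, and the parameters actually used in JCTVC-G771 are not given in this report.

```python
def two_rate_estimate(bins, fast_shift=4, slow_shift=7, bits=15):
    """Return the estimate of P(bin == 1), scaled to 2**bits, used for each bin.

    Two exponential estimators with different window sizes are updated in
    parallel, and their rounded average is the probability used for coding.
    """
    one = 1 << bits
    p_fast = p_slow = one >> 1            # both start at probability 0.5
    out = []
    for b in bins:
        out.append((p_fast + p_slow + 1) >> 1)
        target = one if b else 0
        p_fast += (target - p_fast) >> fast_shift   # short window, adapts quickly
        p_slow += (target - p_slow) >> slow_shift   # long window, more stable
    return out
```

The fast estimator follows local changes in the bin statistics while the slow one keeps the estimate stable for stationary sources, which is the usual motivation for combining the two.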
4.2.1 Summary
4.2.2 Contributions
JCTVC-G517 CE2 subtest C.1: Harmonization of unified scan and NSQT [X. Zheng
(HiSilicon), Y. Yuan, Y. He (Tsinghua)]
This document provides a harmonization solution for unified scan and non-square quadtree transform
(NSQT). In the proposed solution, the non-square to square reordering process for transform coefficient
coding is removed. The experimental results show no coding loss under common test conditions.
Encoding and decoding times for the proposed solution are almost the same as for the HM4.0 anchor.
Consists of two parts: Harmonization of scans and solution for the divergence between SW and WD.
See also transform coefficient coding section.
JCTVC-G518 CE2 subtest C.1: Non-square quadtree (NSQT) with 2x8 and 8x2 transform
[Y. Yuan (Tsinghua), X. Zheng (HiSilicon), Y. He (Tsinghua)]
This document provides the results of non-square quadtree (NSQT) with 2x8 and 8x2 transforms. In the
proposed method, a 2x2 Hadamard-like transform is added to the HEVC framework. The tests under
common test conditions show that average gains of 0.0% for RA, 0.0% for RA_LC, 0.1% for LD_B,
0.1% for LD_B_LC, 0.1% for LD_P and 0.2% for LD_P_LC can be achieved. Combined with the non-square
Hadamard transform, average gains of 0.1% for RA, 0.2% for RA_LC, 0.3% for LD_B, 0.4% for
LD_B_LC, 0.3% for LD_P and 0.4% for LD_P_LC can be achieved. Compared to the HM4.0 anchor,
encoding and decoding times are almost unchanged.
2x8 and 8x2 were rejected last time, could be problematic in terms of memory access. Gain is relatively
small (0.1% without class F). G521 is an improvement by the proponents.
JCTVC-G749 CE2: Overlapped Block Motion Compensation [L. Guo, I.S. Chong, X.
Wang, M. Karczewicz (Qualcomm)]
In this contribution, overlapped block motion compensation (OBMC) has been implemented and tested
on HM 4.0. OBMC is applied to 2NxN, Nx2N and AMP motion partitions. To limit the worst case
memory bandwidth of OBMC, the fetching of extra pixels is disabled for bi-prediction PUs in 8x8 CU.
The method achieved a BD-rate reduction of 0.6%, 0.9% and 2.0% on average for RA-HE, LD-HE and
LDP-HE respectively. For the LC tests, the average BD-rate reduction is 0.6%, 0.8% and 2.0% for RA-LC,
LD-LC and LDP-LC respectively.
Additional memory access may not be an issue (also confirmed by the cross-checkers), but averaging
operation reduces computational throughput and adds complexity at decoder.
One expert mentions that a disabling flag should be implemented.
Three companies express negative opinions. No support by other companies. Consider discontinuation.
4.3.1 Summary
The goal of this Core Experiment (CE) is to further investigate the following aspects of motion compensation
in HM:
Simplification of MC interpolation and reduction of reference frame memory access bandwidth;
Improvement of the trade-off between coding performance and complexity by MC optimization;
Study of complexity in terms of the number of computations and memory bandwidth, with actual hit-ratio
measurement.
4.3.2 Contributions
JCTVC-G395 CE3: Cross check for eBrisk's proposal JCTVC-G057 (tool 6) [K. Kondo, T.
Suzuki (Sony)]
JCTVC-G058 CE3: Interpolation using different-length horizontal and vertical filters [F.
Kossentini, N. Mahdi, H. Guermazi, M. Horowitz (eBrisk Video Inc.)]
In this contribution, an interpolation filtering technique is proposed for the motion compensation
interpolation filtering of the luminance and chrominance samples of video sequences. This proposal
consists of using one set of fixed vertical filters (V_F) for the vertical stage of filtering and a second set of
fixed horizontal filters (H_F) for horizontal stage of filtering. Compared to HM4.0, the proposed
technique yields average BD-rate reductions of -0.7% for LDP/LC, -0.2% for LDP/HE, -0.1% for LD/LC,
0.0% for LD/HE, 0.2% for RA/LC and 0.0% for RA/HE, while decreasing the decoding complexity. In
fact, this technique reduces the required number of multiplications and additions by 5% and 6%,
respectively.
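The two-stage structure described above can be sketched as follows. This is only an illustration: the actual V_F and H_F coefficient sets of the proposal are not given in this report, so the sketch uses two well-known half-sample filters as stand-ins (the 8-tap HEVC luma filter for the horizontal stage and the 6-tap AVC filter for the vertical stage). The function names are hypothetical.

```python
# Stand-in taps (assumptions, not the proposal's H_F / V_F sets).
H_TAPS = (-1, 4, -11, 40, 40, -11, 4, -1)  # 8-tap half-sample filter, sum 64
V_TAPS = (1, -5, 20, 20, -5, 1)            # 6-tap half-sample filter, sum 32


def filter_1d(samples, taps):
    """Apply a FIR filter at every position where all taps fit."""
    n = len(taps)
    return [sum(t * samples[i + k] for k, t in enumerate(taps))
            for i in range(len(samples) - n + 1)]


def interpolate_half_half(block):
    """Half-sample interpolation in both directions, horizontal stage first.

    `block` is a list of rows of integer samples. Intermediate values keep
    full precision and are normalised once at the end.
    """
    rows = [filter_1d(row, H_TAPS) for row in block]              # horizontal stage
    cols = [filter_1d(list(col), V_TAPS) for col in zip(*rows)]   # vertical stage
    norm = sum(H_TAPS) * sum(V_TAPS)                              # 64 * 32
    out = [[(v + norm // 2) // norm for v in col] for col in cols]
    return [list(r) for r in zip(*out)]                           # back to row-major


if __name__ == "__main__":
    flat = [[50] * 10 for _ in range(8)]
    print(interpolate_half_half(flat))
```

With these stand-ins each output sample costs 8 multiplications in the horizontal stage and 6 in the vertical stage, which illustrates how a shorter filter in one direction lowers the operation count.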
JCTVC-G277 CE3: Progressive Motion Vector Resolution [J. An, X. Li, X. Guo, S. Lei
(MediaTek)]
In JCTVC-F125, a progressive MV resolution (PMVR) method was proposed, which uses higher MV
resolution near to MV predictor (MVP) and lower MV resolution far from MVP. Thresholds for 1/4- and
1/8-pixel resolution were used to indicate the range of corresponding MV resolutions. This contribution
presents the results of PMVR method on top of HM4.0. For PMVR without 1/8-pixel resolution, it is
reported that by using different thresholds, average BD-Rate reduction of 0.2% can be achieved with
around 9% encoding time decrease for RA and LB cases, and average BD-Rate reduction of 0.1% can be
achieved with around 4% encoding time decrease for LP case. For PMVR with 1/8-pixel resolution, it is
reported that by using different thresholds, average BD-Rate reduction of 0.5% with around 8% encoding
time decrease can be achieved for RA and LB cases, and average BD-Rate reduction of 2.6% with 5%
encoding time increase can be achieved for LP case.
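The idea of a distance-dependent MV resolution can be illustrated with a small sketch. The threshold values, the choice of 1/2-sample as the coarsest grid, and the rounding rule are all assumptions made for illustration; the report does not state the values used in JCTVC-F125 or in this contribution.

```python
def snap_component(mv, mvp, thr_eighth=8, thr_quarter=32):
    """Snap one MV component (in 1/8-sample units) to its allowed grid.

    Hypothetical rule: within thr_eighth of the predictor the full 1/8
    grid is allowed, within thr_quarter the 1/4 grid, and beyond that
    only the 1/2 grid. Distances are measured from the predictor mvp.
    """
    d = mv - mvp
    mag = abs(d)
    if mag <= thr_eighth:
        step = 1      # 1/8-sample resolution
    elif mag <= thr_quarter:
        step = 2      # 1/4-sample resolution
    else:
        step = 4      # 1/2-sample resolution
    snapped = (mag + step // 2) // step * step  # round to nearest grid point
    return mvp + (snapped if d >= 0 else -snapped)


if __name__ == "__main__":
    for mv in (5, 13, -13, 41):
        print(mv, "->", snap_component(mv, 0))
```

Vectors far from the predictor then need fewer distinct MVD values, which is where a bit-rate saving could come from.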
JCTVC-G391 CE3: Tap length reduction for small block (tool 8) [K. Kondo, T. Suzuki
(Sony), K. Ugur (Nokia)]
This contribution reports results of the MC boundary filter (MBF), which is studied in core experiment (CE)
3. To reduce complexity in both memory bandwidth (b/w) and computation, the MBF uses different filter
coefficients at the MC block boundary. With the proposed method, the memory b/w can be reduced by 44%
and 17% for the worst case and the actual case under the common test conditions. The number of multiplications
can be reduced by 28% and 9% for the worst and actual case. The impact on coding efficiency is 0.3%, 0.2%, 0.3%,
0.1%, 0.3%, and 0.2% for RA_HE, RA_LC, LD_HE, LD_LC, LDP_HE and LDP_LC. The combination of
MBF with a restriction of PU 8x4 and 4x8 bi-prediction is additionally tested. The worst-case memory b/w can
be reduced by 58%. The impact on coding efficiency is 0.4%, 0.4%, 0.6%, 0.4%, 0.1%, and 0.1% for
RA_HE, RA_LC, LD_HE, LD_LC, LDP_HE and LDP_LC.
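For orientation, the worst-case reference fetch behind such bandwidth figures can be estimated with a short calculation. The sketch assumes a separable filter with a given tap length (8 taps by default, as for HM4.0 luma) and both MV components fractional, so a WxH block needs (W+taps-1)x(H+taps-1) reference samples per prediction direction. The function name is illustrative, and the sketch does not reproduce the CE3 measurement method.

```python
def ref_samples_per_pixel(width, height, taps=8, bi_pred=False):
    """Reference samples fetched per predicted luma sample (worst case).

    Assumes both MV components are fractional, so the separable filter
    needs (taps - 1) extra columns and rows around the block.
    """
    fetched = (width + taps - 1) * (height + taps - 1)
    if bi_pred:
        fetched *= 2  # two reference blocks are fetched
    return fetched / (width * height)


if __name__ == "__main__":
    for w, h, bi in [(8, 4, True), (8, 4, False), (8, 8, True), (64, 64, True)]:
        label = "bi" if bi else "uni"
        print(f"{w}x{h} {label}-pred: {ref_samples_per_pixel(w, h, bi_pred=bi):.2f}")
```

Under these assumptions an 8x4 bi-predicted PU costs about 10.3 reference samples per pixel against about 5.2 for uni-prediction, which indicates why small-block tap reduction and a bi-prediction restriction both target the worst case.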
JCTVC-G393 CE3: Cross check for Toshiba's proposal JCTVC-G427 (tool 2) [K. Kondo,
T. Suzuki (Sony)]
JCTVC-G696 CE3: Fixed interpolation filter tests by Motorola Mobility [J. Lou, K. Minoo,
D. Baylon, L. Wang, A. Luthra (Motorola Mobility)]
This document reports the results of Motorola Mobility’s interpolation filters for HEVC. The simulations
were conducted using HM4.0 software with Motorola Mobility’s modifications. Four sets of fixed
interpolation filters are tested. Compared with the current interpolation filter in HM4.0, the proposed 6-
tap half-pel filter with 7-tap quarter-pel filter scheme with 13/64 offset achieves 0.0%, 0.3%, 0.1%, 0.2%,
-0.3% and -0.4% bitrate differences in RAHE, RALC, LBHE, LBLC, LPHE and LPLC settings; the
proposed 6-tap half-pel filter with 7-tap quarter-pel filter scheme with 3/16 offset achieves 0.4%, 0.9%,
0.3%, 0.5%, -0.4% and -1.2% bitrate differences in RAHE, RALC, LBHE, LBLC, LPHE and LPLC
settings; the proposed 6-tap half-pel filter with 7-tap quarter-pel filter scheme with 15/64 offset achieves -
0.1%, 0.0%, 0.1%, 0.2%, 0.3% and -0.3% bitrate differences in RAHE, RALC, LBHE, LBLC, LPHE and
LPLC settings; the proposed 8-tap half-pel filter with 8-tap quarter-pel filter scheme with 3/16 offset
achieves -0.1%, -0.3%, 0.1%, -0.4%, -0.8% and -0.3% bitrate differences in RAHE, RALC, LBHE,
LBLC, LPHE and LPLC settings. Cross-check will be provided by Samsung. The attached spreadsheet
contains detailed data of the results.
JCTVC-G697 CE3: Joint sub-pixel interpolation filter tests for bi-predicted motion
compensation by Motorola Mobility [J. Lou, K. Minoo (Motorola Mobility)]
This document reports the results of Motorola Mobility’s Joint Sub-Pixel Interpolation Filters (JSPIF) for
bi-predicted motion compensation for HEVC. The simulations were conducted using HM4.0 software
with Motorola Mobility’s modifications. Three sets of 6-tap fixed interpolation filters for bi-predicted
motion compensation are used. For uni-prediction, 8H+7Q fixed filters with 3/16 offset are used.
Compared with the current interpolation filter in HM4.0, set0 without bug-fix achieves -0.3%, -0.3%, -
0.8% and -0.8% bitrate differences in RAHE, RALC, LBHE and LBLC settings; set1 without bug-fix
achieves -0.2%, -0.2%, -0.7% and -0.7% bitrate differences in RAHE, RALC, LBHE and LBLC settings;
set2 without bug-fix achieves -0.4%, -0.4%, -0.8% and -0.7% bitrate differences in RAHE, RALC, LBHE
and LBLC settings; set2 with bug-fix achieves -0.4%, -0.3%, -0.8% and -0.7% bitrate differences in
RAHE, RALC, LBHE and LBLC settings; All the three sets achieve -0.8% and -1.3% bitrate differences
in LPHE and LPLC settings.
Was presented.
Some concern about additional complexity, particularly for hardware. Depending on implementation, cost
could be roughly 2x chip size for MC.
No support except by proponents.
JCTVC-G775 CE3: 7Q6H taps interpolation filters test by Samsung [E. Alshina, A. Alshin,
J.H. Park (Samsung)]
This is the CE response from Samsung. The following combination was tested: 7-tap quarter-pel and 6-tap half-pel
interpolation filters. On average across the 6 test cases RA-HE/LC, LD-HE/LC and LD(P)-HE/LC,
this combination provides 0.07% (Y), -0.11% (U), -0.19% (V) BD-rate change (drop in luma and gain for
chroma). The computational complexity of MC for this filter approaches that of the AVC interpolation filter, which is
13% and 16% less in terms of number of multiplications and additions compared to HM4.0.
JCTVC-G394 CE3: Cross check for Samsung's proposal (tool 1) [K. Kondo, T. Suzuki
(Sony)]
JCTVC-G778 CE3: 7 taps interpolation filters for quarter pel position MC from Samsung
and Motorola Mobility [E. Alshina, A. Alshin, J.-H. Park, (Samsung), J. Lou,
K. Minoo, (Motorola Mobility)]
Two variants of 7-tap interpolation filters for the quarter-pel position are tested here. Both resolve the visual
artifacts problem in the LD(P)-LC, SAO-off test. Both show performance improvement compared to HM4.0.
The average BD-rate for Y/U/V components across the 6 test cases required in CE3 is -0.1%/-0.1%/-0.2% for
variant A and -0.4%/0.0%/0.0% for variant B. The two variants of the proposed 7-tap filters use different phase
shifts (1/4 for variant A and 3/16 for variant B), which results in the same worst-case computational
complexity and memory access, while a different hit ratio for fractional positions in the MC process and therefore
a different statistical computational complexity was observed. For variant A the number of computations
according to the CE3 measure is 5-6% smaller compared to HM4.0. Variant B shows 1-2% higher
computational complexity compared to HM4.0.
Was presented. Gain for LD P only: 1/4 HE 0.1 LC 0.5; 3/16 HE 0.8 LC 1.3.
It needs to be clarified whether there is a visual advantage for ALF/SAO off and LD P, and no disadvantage
for other cases. It was suggested to organize a viewing session and discuss the subject further
after that.
4.4.1 Summary
Comment: QP scaling is being done at a finer level. Will this be required if quantization matrices are
used? Flat quantization matrices can achieve QP scaling at a finer level.
Comment: Modified version of JCTVC-G382 was cross-checked in JCTVC-G1045 (RDO-Q On BD-Rate
matches, code was studied). Modified version of JCTVC-G850 was cross-checked in JCTVC-G1040
(BD-Rate match was observed, code was studied).
Comments: The JCTVC-G403 cross-checker commented that the updated version of their document asserts that all of
the gain reported in JCTVC-G382 could be achieved by non-normative modifications only.
Comments: First test: Second bit-allocation in cross-check document JCTVC-G403 asserted to be not
constrained within 2%. CE submissions were reported to be within 2%. Second test: Encoder-only test
reports more gain with two-pass algorithm.
Comment: Concerns expressed with regard to complexity at slice level. In worst-case bitstreams, there
could be many slices.
Suggestions: Study RDOQ off case. What is the complexity impact? On decoder side and on encoder
side. RDO-On case:
For further study in CE.
JCTVC-G773:
Comments: Asserted that dQP rate could increase when QP starts changing.
Comments: Introduces functionality that allows for QP to change at a finer scale.
Comments: Asserted G773 could be implemented using G850.
Comments: G773 was tested with perceptual quantization and not bit-rate control.
Comments: Provides new functionality but functionality is not proven.
Comments: In some implementations, this could lead to doubling of quant matrix tables.
For further study in CE.
4.4.2 Contributions
JCTVC-G721 CE4 Subtest 1.1.a: QP adaptation at sub-CU level [Xue Fang, Jae Hoon
Kim, Krit Panusopone, Limin Wang (Motorola Mobility)]
JCTVC-G070 CE4 Subset 1.2.b: Ericsson's table-based delta QP coding method [R.
Sjoberg, J. Sun (Ericsson)]
JCTVC-G773 CE4 Subtest 1.2.c: Higher granularity of quantization parameter scaling [T.
Lee, J. Chen, J. H. Park (Samsung), K. Chono (NEC)]
JCTVC-G363 CE4: Improvement of delta-QP coding (1.3.a) [J. Xu, K. Kondo, K. Sato, A.
Tabatabai (Sony)]
JCTVC-G066 CE4 Subtest 1: Spatial QP prediction based on intra prediction (test 1.3.b)
[H. Aoki, K. Chono (NEC), M. Kobayashi, M. Shima (Canon)]
JCTVC-G460 CE4 subset 1.3.b: crosscheck of QP prediction based on intra prediction [K.
Sugimoto, A. Minezawa, S. Sekiguchi (Mitsubishi)]
JCTVC-G067 CE4 Subtest1: test 1.3.c [H. Aoki, K. Chono (NEC), M. Kobayashi, M.
Shima (Canon), K. Sato (Sony)]
JCTVC-G728 CE4 Subtest 1: Spatial QP prediction (test 1.3.e): combination of test 1.3.b
and test 1.3.d [M. Coban, M. Karczewicz (Qualcomm)]
JCTVC-G073 CE4: Cross-check of Subtest 1.3.e - Spatial QP prediction [C. Yeo (I2R)]
JCTVC-G068 CE4 Subtest1: QP prediction based on intra/inter prediction (test 1.4) [H.
Aoki, K. Chono (NEC), M. Coban, M. Karczewicz (Qualcomm)]
JCTVC-G074 CE4: Cross-check of Subtest 2 on de-quantization offset (2.1.a, 2.3.d) [C. Yeo
(I2R)]
JCTVC-G140 CE4: Subtest 2.1.c Cross Check Report for Mediatek's AQO CaseC by
HKUST [F. Zou, O.C. Au (HKUST)]
JCTVC-G382 CE4 Subtest-2 Adaptive Reconstruction Levels [X. Yu, J. Wang, D. He, G.
M.Cocher, E. Yang (RIM)]
JCTVC-G141 CE4: Subtest 2.2.d Cross Check Report for RIM's ARL Case D by HKUST
[F. Zou, O.C. Au (HKUST)]
JCTVC-G823 CE4: Cross-check for ARL from RIM by Samsung [E. Alshina, J.H. Park]
[late]
JCTVC-G434 CE4 subtest 3: Quantization matrix for HEVC based on JCTVC-F362 and
F475 [Y. Morigami, J. Tanaka, T. Suzuki (Sony)]
JCTVC-G502 CE4 Subtest 3.1: Crosscheck report of AVC based quantization matrix
support (JCTVC-G434) [M. Shima (Canon)]
JCTVC-G527 CE4: Cross-Check report for CE4 subset3 of Sony proposal on Quantization
matrix for HEVC (JCTVC-G434) [J. Zheng (HiSilicon)]
4.5.1 Summary
JCTVC-G035 CE5: Summary report on CAVLC entropy coding improvements [X. Wang,
P. Wu, C. Kim (CE Coordinators)]
4.5.2 Contributions
JCTVC-G310 CE5: CAVLC Adaptation using Difference Counter [T. Yamamoto (Sharp)]
JCTVC-G360 CE5: Redundancy removal for Run-mode in CAVLC (JCTVC-F286) [J. Xu,
A. Tabatabai (Sony)]
JCTVC-G389 CE5: CAVLC coding table modification [S. Kim, J. Lee, S. Lee (Yonsei
Univ.), C. Kim, Y. Park, J. Park (Samsung)]
JCTVC-G532 CE5 2.1 : Improvement of CAVLC run- coding by prediction mode [C. Kim,
Y. Park, K.P.Choi (Samsung)]
JCTVC-G367 CE5: cross-check for Samsung’s CAVLC (JCTVC-G532) [J. Xu, M. Haque
(Sony)]
JCTVC-G563 CE5 2.2 : Handling for exception cases longer than 32bit code-word in
CAVLC [C. Kim, Y.Park, K.P. Choi(Samsung), M. Karczewicz, X. Wang,
W.-J. Chien, L. Guo(Qualcomm)]
JCTVC-G677 CE5: Limitation on VLC codeword length [M. Karczewicz, X. Wang, W.J.
Chien, L. Guo (Qualcomm)]
4.6.1 Summary
CE6a.1 Modified down-sample filters: Best (T0L0I0 vs T0L2I2) has some gain (0.1% for luma and 0.3%
for chroma) but longer filters. Seems like not enough gain.
CE6a.2 Alpha and beta calculation complexity reduction: Almost no loss was observed for 2:1
subsampling 16x16. It was commented that this does not help since it makes a special case out of a case
that is not the worst case – whereas the worst case is 4x4.
It was noted that there is a relevant non-CE contribution by Canon (JCTVC-G244).
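As background for CE6a.2, the alpha and beta parameters of a linear chroma-from-luma model can be obtained by a least-squares fit over neighbouring reconstructed samples, and subsampling the neighbours reduces the number of accumulations. The sketch below is a floating-point illustration under that assumption; it is not the integer derivation used in HM, and the function name and the `step` parameter are hypothetical.

```python
def lm_parameters(luma, chroma, step=1):
    """Least-squares fit chroma ~ alpha * luma + beta over neighbouring samples.

    `step=2` keeps every second sample pair (2:1 subsampling).
    """
    x = luma[::step]
    y = chroma[::step]
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(v * v for v in x)
    sxy = sum(a * b for a, b in zip(x, y))
    denom = n * sxx - sx * sx
    if denom == 0:                 # flat luma neighbourhood: fall back to an offset
        return 0.0, sy / n
    alpha = (n * sxy - sx * sy) / denom
    beta = (sy - alpha * sx) / n
    return alpha, beta


if __name__ == "__main__":
    luma = list(range(0, 64, 2))               # 32 neighbouring luma samples
    chroma = [0.5 * v + 10 for v in luma]
    print(lm_parameters(luma, chroma))          # all neighbours
    print(lm_parameters(luma, chroma, step=2))  # 2:1 subsampled neighbours
```

With `step=2` the four sums run over half as many samples, which is the kind of saving a 2:1 subsampling would give for large blocks.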
CE6a.3 Reduction of Storage for Reconstructed Luma Pixels – noting that the focus is now on the 8 bit
case, which already is using 8 bit storage in this case – leaving it alone sounds reasonable.
Note also that the focus is now on the CABAC case – CAVLC is no longer particularly interesting.
CE6c. SDIP
Case 1: Performance of SDIP on HM AI HE: −1.35% (127% enc RT), −2.14% (with class F)
Case 2: Each rectangular sub-PU uses its own prediction mode: −0.96% (113% enc RT), −1.70%
(with class F). Case 2 is optimized mode selection vs. case 1
Case 3: All rectangular sub-PUs use the same prediction mode for the entire CU: −0.96% (114% enc RT),
−1.39% (with class F)
Q: How are NSQT concerns on 8x2/2x8 throughput and coefficient scanning addressed as they also
apply to SDIP? Expert comment that throughput for intra prediction may be more of an issue than inter
prediction.
A: 16 sample block unit read for both 4x4 and 2x8/8x2.
Several experts noted that the gains are desirable but require discussion with hardware experts on complexity;
multiple functions are affected.
SDIP vs NSQT – SDIP has additional TUs, prediction affected
Similarity in spirit of SDIP and NSQT. Suggested separate profile to contain SDIP, NSQT, ALF, & SAO
(?)
It was noted that, at the moment, NSQT does not have a syntax flag to disable its selection.
G556 was suggested as the primary document to review for study of SDIP.
Other SDIP related proposals that add about 0.4% (for luma AI) additional compression benefit (1.7% for
chroma). There is a survey of this in G558.
Non-CE related SDIP-related G135, G354, G598.
SDIP plenary discussion was held on Monday 28th (chaired by J.O.), with notes recorded as follows:
1.3% gain with the current version
SDIP should be harmonized / unified with NSQT, which is not achieved yet
Problem: So far NSQT and SDIP were discussed in different places
Set up AHG to harmonize, with intention to have the unified solution in the standard by the next
meeting.
Basis for AHG is the “SDIP reference” G558+G754. This is not to be further studied in CE. New
SDIP proposals that build on top of this are to be investigated in CE6 (not in AHG)
Experiment A
JCTVC-G168 CE6: Cross-check report for Subtest CE6a on Intra Chroma Prediction [S.
Cho, S. Lee (ETRI)] [late]
JCTVC-G169 CE6: Cross-check report for Subtest CE6a on Intra Chroma Prediction [S.
Cho, S. Lee (ETRI)] [late]
JCTVC-G512 CE6: combination of subtest 4.1.2.1 & 4.1.2.2 [K. Sato (Sony)] [late]
JCTVC-G847 CE6: Cross-check results for combination of subtest 4.1.2.1 & 4.1.2.2
(JCTVC-G512) [S. Cho, S. Lee, N. Eum] [late]
JCTVC-G192 CE6b: Intra remaining mode coding with mode ranking [J. Park, B. Jeon
(LG)]
JCTVC-G203 CE6b: Intra prediction mode coding [T.-D. Chuang, C.-Y. Chen, M. Guo, X.
Guo, Y.-W. Huang, S. Lei (MediaTek), W.-J. Chien, X. Wang, M.
Karczewicz (Qualcomm)]
JCTVC-G242 CE6b: Mode ranking for remaining mode coding with 2 or 3 MPMs [E.
François, S. Pautet, C. Gisquet (Canon)]
JCTVC-G243 CE6b: Intra mode coding with 4 MPMs and mode ranking [E. François, S.
Pautet (Canon), Joonyoung Park, Byeongmoon Jeon (LG), Tzu-Der Chuang,
Ching-Yeh Chen, Mei Guo, Xun Guo, Yu-Wen Huang, Shawmin Lei
(MediaTek), Wei-Jung Chien (Qualcomm), Ehsan Maani, Ali Tabatabai
(Sony)]
JCTVC-G869 CE6: Combinations of MPM derivation and remaining mode coding [Ehsan
Maani, Ali Tabatabai] [late]
JCTVC-G080 CE6: Cross-check report for Subtest CE6b on Intra Mode Coding [H. L.
Tan, C. Yeo, Y. H. Tan (I2R)]
JCTVC-G167 CE6: Cross-check report for Subtest CE6b on Intra Mode Coding [S. Cho,
S. Lee, N. Eum (ETRI)] [late]
JCTVC-G868 CE6: Test results of DCIM [Ehsan Maani, Ali Tabatabai, Tomoyuki
Yamamoto] [late]
Experiment C
JCTVC-G143 CE6.c: VLC improvement for intra partitioning on SDIP [J. Lim, B. Jeon
(LG)]
JCTVC-G478 CE6.c: Crosscheck for LG's VLC improvement for intra partitioning on
SDIP in JCTVC-G143 [T.-D. Chuang, Y.-W. Huang (MediaTek)]
JCTVC-G800 CE6.c Crosscheck report for LG's JCTVC-G142 and JCTVC-G143 [C. Lai]
[late]
JCTVC-G267 CE6.c Report on SDIP chroma extension scheme [J. Song, C. Lai, H. Yang,
H. Yu (Huawei)]
JCTVC-G558 CE6.c Report on Combination of SDIP and Its Improvements [X. Cao, Y. He
(Tsinghua), X. Peng (USTC), C. Lai, L. Liu, J. Zheng (HiSilicon), J. Xu
(Microsoft), H. Yang, J. Song, H. Yu (Huawei), J. Lim, B. Jeon(LGE), J.
Sole, R. Joshi, X. Wang, M. Karczewicz (Qualcomm), J. Xu, E. Maani, A.
Tabatabai (Sony)]
Contains multiple topics, some of it was in the CE and some is different.
JCTVC-G369 CE6.c: cross-check for SDIP (JCTVC-F532) [J. Xu, A. Tabatabai (Sony)]
Experiment D
JCTVC-G279 CE6 Subtest d: direction-based angular intra prediction [M. Guo, X. Zhao,
X. Guo, S. Lei (MediaTek)] [late]
JCTVC-G280 CE6 Subtest d: Intra Prediction with Secondary Boundary [M. Guo, X. Guo,
X. Zhao, S. Lei (MediaTek), J. Lainema, K. Ugur (Nokia), K. Sugimoto, S.
Sekiguchi, A. Minezawa (Mitsubishi), J. Lee, S.-C. Lim, H.Y. Kim, J.S. Choi
(ETRI)]
JCTVC-G420 CE6.d: Results of experiment 4.4.3 [J. Lee, S.-C. Lim, H. Y. Kim (ETRI)]
JCTVC-G565 CE6.d: Nokia report on intra prediction with secondary boundary [J.
Lainema, K. Ugur (Nokia)]
JCTVC-G081 CE6: Cross-check report for Subtest CE6d on Intra prediction with
secondary boundary (Test 7) [H. L. Tan, Y. H. Tan, C. Yeo (I2R)]
4.7.1 Summary
4.7.2 Contributions
JCTVC-G581 CE7: Crosscheck of combination of tool 1 and tool 2 for mode dependent
secondary transform sizes 3x3 and 4x4 (JCTVC-G108) [R. Joshi
(Qualcomm)]
JCTVC-G930 CE7: Cross check report of “On secondary transforms for intra prediction
residual (G108)” [A. Ichigaya, (NHK)] [late]
JCTVC-G304 CE 7: Experimental Results for the ROT [Z. Ma, F. Fernandes, E. Alshina,
A. Alshin (Samsung)]
JCTVC-G375 CE7: Cross Check Report for CE7 Tool 3, Rotational Only Transform [Y.
Shibahara, T. Nishi (Panasonic)]
4.8.1 Summary
Subtest B:
G498 Simplified ALF design. Only 5x5 diamond shape, no pixel classification, no DC offset. More
coarse quantization and coding.
Current implementation does not process chroma and does not include slice boundary processing.
Looking at the loss of 1%, several experts express the opinion that we should not replace the current ALF
by this. Note: G499 is an improved version with lower loss.
Subtest C:
G212, presented in this context, is a combination of G208/G206 (“option 1”) and something new on SAO
LB reduction (“option 2”). Also G211 was considered here.
G208 uses “virtual boundary processing” = specific boundary padding method with some irregularity
which could also be implemented differently e.g. by adjusting filter coefficients
Investigate visual quality of G212 option 1 against HM, adopt when it does not produce artifacts.
20 viewers. A “score-based method” was used, where experts gave 1 point each to anchor and proposal when
they were equal, and otherwise 2/0 or 0/2 if one was better.
(anchor/proposal) BQM 18/22, Cactus 22/18, Vidyo3 15/25, Vidyo4 20/20
Conclusion: No visual difference. Even 15/25 means that still 75% of the subjects thought both are equal.
Decision: Adopt G212 option 1
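The 75% figure can be checked from the scoring rule. With 20 viewers, each giving 1/1 for "equal", 2/0 for "anchor better" or 0/2 for "proposal better", the point totals fix only the difference between the two preference counts, so the number of "equal" votes is largest when nobody preferred the lower-scoring side. The sketch below computes that upper bound; the function name is illustrative.

```python
def max_equal_votes(viewers, anchor_points, proposal_points):
    """Largest number of 'equal' votes consistent with the point totals.

    Each viewer contributes 1/1 for 'equal', 2/0 for 'anchor better',
    0/2 for 'proposal better'.
    """
    if anchor_points + proposal_points != 2 * viewers:
        raise ValueError("totals do not match the number of viewers")
    # The point gap can only come from viewers with a preference; the
    # fewest such viewers is reached when all of them prefer the same side.
    preferring = abs(anchor_points - proposal_points) // 2
    return viewers - preferring


if __name__ == "__main__":
    results = {"BQM": (18, 22), "Cactus": (22, 18), "Vidyo3": (15, 25), "Vidyo4": (20, 20)}
    for name, (anchor, proposal) in results.items():
        equal = max_equal_votes(20, anchor, proposal)
        print(f"{name}: at most {equal} of 20 viewers ({100 * equal // 20}%) saw no difference")
```

For Vidyo3 (15/25) this gives 15 of 20 viewers, i.e. 75%, under the assumption that no viewer preferred the anchor.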
G207 (c.5) proposes a method to perform padding for SAO at slice boundaries (which would obviously
increase the complexity). This shall also be investigated in the subjective test to identify whether there is
a problem with SAO at slice boundaries (in case where across-boundary processing is disabled), but we
would not adopt it at this meeting as it appears inconsistent to have something for SAO but not for de-
blocking where the same problem occurs. There may also be other solutions such as post-processing of
boundaries. (Note G194 is also related). Result of subjective viewing: It looks better than anchor.
Investigate combined solution for ALF and SAO for slice and tile boundaries in CE, for de-blocking
currently no proposal on the table, but a solution which also includes de-blocking would be desirable.
Subtest D:
G208 (9x9 cross shape) performs best and has better performance than G648 (7x11). G208 will be tested
together with G206 such that reducing line buffers by decreasing vertical filter size is not relevant
anymore (same applies to G130 which reduces vertical filter size only for chroma and produces losses)
Subtest E:
(G665) Prediction of filter coefficients from other coefficients of the same filter (instead of from one filter
to the next). Gain of 0.1% observed in LD B and P cases. Note: G610 is a similar idea that provides more
gain.
Subtest G:
Adds a third mode where within a slice the ALF applied to chroma is invoked whenever luma is filtered
(currently it is either entirely on or off). Only marginal gain (0.2% for chroma only). – also one more
encoder decision.
No support by other companies. No action.
Note: Question was raised what would be the performance when chroma always follows luma, and some
experts expressed that they would like such a solution, however other experts expressed it might be
dangerous to do this, as there may be good reasons to switch it entirely on/off.
4.8.2 Contributions
JCTVC-G1023 CE8 subset 0: Improved ALF N pass encoding [I. S. Chong, M. Karczewicz,
T. Yamakage, T. Watanabe, T. Chujoh, C.-Y. Chen, C.-M. Fu, C.-Y. Tsai,
Y.-W. Huang, S. Lei] [late]
This evaluates a modified ALF N pass encoding algorithm. This includes improvement and bugfix of
ALF N pass encoding. Coding efficiency gain for luma is 0.0 %, 0.0 %, 0.1% and 0.2 % in HE-AI, RA,
LB, and LP without encoding/decoding time increase on average.
(non-normative)
Subtest A
JCTVC-G316 CE8.a.1: 2-D mergeable syntax [T. Ikai (Sharp), I. S. Chong, M. Karczewicz
(Qualcomm), T. Yamakage, T. Watanabe, T. Chujoh (Toshiba), C.-Y. Chen,
C.-M. Fu, C.-Y. Tsai, Y.-W. Huang, S. Lei (MediaTek)]
This contribution reports evaluation results of 2-D mergeable syntax technique [JCTVC-F384]. This
technique aims to free the restriction on block classification in the HM-4.0 ALF. The coding gains of 0.0
%, 0.1 %, 0.1 % and 0.1 % in HE-AI, RA, LB, and LP were reported. The decoding time ratio was 100 %
to 101 % and the encoding time ratio was 99 % to 100 % in HE case. The proposal was cross-checked by
Samsung (JCTVC-G649). Harmonization of the other proposal “CE8.a.2: Directional feature calculation
on subset of pixels” (JCTVC-G609) is also tested in JCTVC-G649.
Subtest B
JCTVC-G498 CE8: ALF with low latency and reduced complexity [A. Fuldseth, G.
Bjøntegaard (Cisco)]
The document describes a low complexity ALF technique suitable for low latency applications. One
single set of ALF filter coefficients is computed, quantized and transmitted sequentially for each block
using a single pass technique. The proposed ALF also has low complexity by using only a 5x5 diamond
shape and no decoder-side variance calculations. When applied to low complexity configurations, BD-
rate gains between 1.4 % and 3.7% are reported. When applied to high efficiency configurations, BD-rate
losses between 0.5% and 1.1% are reported.
JCTVC-G864 CE8 Subset b.1: Cross check of Cisco's ALF with low latency and reduced
complexity [I. S. Chong, M. Karczewicz]
Subtest C
JCTVC-G564 CE8 subtest c tool 1: Line memory reduction for ALF and SAO decoding [S.
Esenlik, M. Narroschke, T. Wedi (Panasonic)]
This contribution is a part of CE8 on in-loop filtering. Proposed is a method to reduce the line memory
which is required by consecutive filtering operations in the decoder. In the current HM 4.0, Deblocking
Filter (DF), Sample Adaptive Offset (SAO) and Adaptive Loop Filter (ALF) all pose difficulties related
to storage requirements in the block-based decoding procedure. Namely, for the purpose of filtering
across the boundaries of LCUs (Largest Coding Unit), horizontal and vertical line memory need to be
employed, which increases the implementation complexity of decoder chips. This contribution focuses
on the reduction of the line memory for LCU-based decoding. The main focus is the reduction in the so
called horizontal line memory, whose size is directly proportional to the width of the decoded picture.
With the help of the proposed technique the horizontal line memory that needs to be employed is reduced
from 9 lines to 5 lines for the luminance component and from 7 lines to 4 lines for the chrominance
components.
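Because the horizontal line memory scales with picture width, the effect of the reported line counts can be estimated directly. The sketch below assumes 4:2:0 sampling, 8-bit samples, and that the chroma line counts apply to each of the two chroma components; these assumptions and the function name are not taken from the contribution.

```python
def line_memory_bytes(pic_width, luma_lines, chroma_lines, bytes_per_sample=1):
    """Horizontal line memory for one picture width.

    Assumes 4:2:0 sampling: two chroma components, each half the luma width.
    """
    luma = luma_lines * pic_width
    chroma = chroma_lines * (pic_width // 2) * 2
    return (luma + chroma) * bytes_per_sample


if __name__ == "__main__":
    before = line_memory_bytes(1920, luma_lines=9, chroma_lines=7)
    after = line_memory_bytes(1920, luma_lines=5, chroma_lines=4)
    print(before, after, f"{100 * (before - after) / before:.2f}% saved")
```

For a 1920-sample-wide picture this gives 30720 bytes before and 17280 bytes after, a saving of 43.75% under the stated assumptions.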
JCTVC-G051 CE8 Subset 3: Cross-check of Panasonic’s line memory reduction for in-loop
filtering (JCTVC-F272) [S. Park, S. Lee, N. Eum (ETRI)]
JCTVC-G479 CE8.c.1: Crosscheck for Panasonic's line memory reduction for ALF and
SAO decoding in JCTVC-G564 [C.-Y. Chen, Y.-W. Huang (MediaTek)]
JCTVC-G204 CE8.c.2: Single-source SAO and ALF virtual boundary processing [C.-M.
Fu, C.-Y. Chen, C.-Y. Tsai, Y.-W. Huang, S. Lei (MediaTek)]
This contribution reports results of CE8.c.2. In HM-4.0, SAO requires 0.2 luma pixel line buffers (PLB)
and 0.2 chroma PLB, and ALF requires 4.1 luma PLBs and four chroma PLBs for practical real-time
decoders. In JCTVC-F054 and JCTVC-F055, virtual boundary (VB) processing was proposed in order to
achieve zero line buffer and good visual quality for SAO and ALF. Due to the DF in HM-4.0, luma VBs
and chroma VBs are set as four and two pixels above horizontal LCU boundaries, respectively. For a to-
be-processed pixel on one side of a VB, any pixel on the other side of the VB is avoided by modifying
pixel classification for SAO and filter shapes for ALF. When compared with the JCTVC-F900 anchor, the
proposed method reportedly causes 0.1%, 0.2%, 0.3%, and 0.4% coding efficiency losses for HE-AI, HE-
RA, HE-LDB, and HE-LDP, respectively, and is claimed to have similar visual quality as the anchor. VB
artifacts can only be seen in few pictures.
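One way to picture the virtual boundary constraint is to compute, for each picture row, how far a filter may reach vertically without touching a sample on the other side of a VB. The symmetric shrinking used in this sketch is only one possible way to modify the filter shape; the exact rule of the contribution is not given in this report. The parameter defaults (64-row LCU, luma VB four rows above the LCU boundary, a filter that normally reaches two rows up and down) and the function name are assumptions.

```python
def vertical_reach(row, lcu_height=64, vb_offset=4, full_reach=2):
    """Rows above/below `row` that a filter may use without crossing a VB.

    `row` is a picture row index. Virtual boundaries lie vb_offset rows
    above every horizontal LCU boundary, i.e. just above rows
    k * lcu_height - vb_offset for k = 1, 2, ...
    Picture boundaries are not modelled here.
    """
    # Shift so that each VB falls on a multiple of lcu_height.
    shifted = row + vb_offset
    below_vb = shifted % lcu_height        # rows between the VB above and this row
    above_vb = lcu_height - 1 - below_vb   # rows between this row and the VB below
    return min(full_reach, below_vb, above_vb)


if __name__ == "__main__":
    for row in (30, 58, 59, 60, 61, 62):
        print(row, vertical_reach(row))
```

Rows 59 and 60, which sit directly on either side of the first VB, get a reach of zero, while rows two or more away from it keep the full vertical extent.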
JCTVC-G559 CE8 Subtest c: Cross-Check report for JCTVC-G204 [S. Esenlik, A. Kotra,
M. Narroschke(Panasonic)]
JCTVC-G205 CE8.c.3: Multi-source SAO and ALF virtual boundary processing [C.-Y.
Chen, C.-M. Fu, C.-Y. Tsai, Y.-W. Huang, S. Lei (MediaTek), S. Esenlik, M.
Narroschke, T. Wedi (Panasonic)]
This contribution reports results of CE8.c.3. In HM-4.0, SAO requires 0.2 luma pixel line buffers (PLBs)
and 0.2 chroma PLBs, and ALF requires 4.1 luma PLBs and four chroma PLBs for practical real-time
decoders. In JCTVC-F054 and JCTVC-F055, virtual boundary (VB) processing was proposed to remove
all these line buffers. In JCTVC-F272, partial use of pre-DF pixels as SAO and ALF inputs was proposed
to reduce line buffers. In order to achieve zero line buffer and good visual quality for SAO and ALF, the
two methods are combined as follows. Due to the DF in HM-4.0, luma VBs and chroma VBs are set as
four and two pixels above horizontal LCU boundaries, respectively. For to-be-processed pixels above the
VB, any required pixel below the VB is replaced by a pre-DF pixel. For to-be-processed pixels below the
VB, any pixel above the VB is avoided by modifying pixel classification for SAO and filter shapes for
ALF. When compared with the JCTVC-F900 anchor, the proposed method reportedly causes 0.1%, 0.1%,
0.1%, and 0.2% coding efficiency losses for HE-AI, HE-RA, HE-LDB, and HE-LDP, respectively, and is
claimed to have similar visual quality as the anchor. Very minor VB artifacts can only be seen in very few
pictures.
JCTVC-G206 CE8.c.4: SAO and ALF virtual boundary processing with cross9x9 [C.-Y.
Chen, C.-M. Fu, C.-Y. Tsai, Y.-W. Huang, S. Lei (MediaTek), S. Esenlik, M.
Narroschke, T. Wedi (Panasonic)]
This contribution reports results of CE8.c.4-1 and CE8.c.4-2, which are a combination of CE8.c.2 using
single-source SAO and ALF virtual boundary (VB) processing and CE8.d.1 using cross9x9 and
snowflake5x5 and a combination of CE8.c.3 using multi-source SAO and ALF VB processing and
CE8.d.1, respectively. When compared with the JCTVC-F900 anchor, CE8.c.4-1 reportedly achieves
0.0%, -0.2%, -0.3%, and 0.1% BD-rates for HE-AI, HE-RA, HE-LDB, and HE-LDP, respectively, and
CE8.c.4-2 reportedly achieves 0.0%, -0.2%, -0.4%, -0.1% BD-rates for the four conditions respectively,
where negative numbers mean gains and positive numbers mean losses. The gain of using cross9x9 is
roughly unchanged when CE8.c.2 and CE8.c.3 are considered. It is also reported that both CE8.c.4-1 and
CE8.c.4-2 have subjective qualities close to HM-4.0, and CE8.c.4-2 is better than CE8.c.4-1.
JCTVC-G207 CE8.c.5: Non-cross-slices SAO [C.-M. Fu, C.-Y. Tsai, C.-Y. Chen, Y.-W.
Huang, S. Lei (MediaTek), M. Budagavi (TI)]
This contribution reports results of CE8.c.5. In HM-4.0, non-cross-slices SAO skips each to-be-processed
pixel requiring any pixel from any other slice. However, the skipping technique may cause some potential
problem in visual quality. In JCTVC-F093 and JCTVC-F232, any pixel from any other slice is avoided by
Subtest D
JCTVC-G208 CE8.d.1: Snowflake5x5 and cross9x9 for luma and chroma ALF shapes [C.-
Y. Tsai, C.-Y. Chen, C.-M. Fu, Y.-W. Huang, S. Lei (MediaTek), I. S. Chong,
M. Karczewicz (Qualcomm), T. Yamakage, T. Watanabe, T. Chujoh
(Toshiba)]
This contribution reports results of CE8.d.1. In HM-4.0, snowflake5x5 and cross11x5 filter shapes have
nine and eight multiplications, respectively, and are used for both luma and chroma in ALF. In this
proposal, snowflake5x5 and cross9x9 are used, and they both have nine multiplications to better utilize
multipliers without increasing the number of multipliers in hardware. Simulation results reportedly show
0.1%, 0.4%, 0.6% and 0.3% coding efficiency gains for HE-AI, HE-RA, HE-LDB, and HE-LDP,
respectively, with roughly the same encoding time and 1%-2% decoding time increase. Apparently
cross9x9 needs more line buffers than cross11x5, so it is suggested to combine this proposal with ALF
line buffer removal techniques.
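The multiplication counts quoted for the filter shapes follow from point symmetry: taps at (dx, dy) and (-dx, -dy) share one coefficient, so the two samples can be added before a single multiplication. The sketch below counts taps and shared coefficients; the star geometry (centre plus two taps along each of the eight principal directions) is an assumption about the snowflake5x5 shape, and the function names are illustrative.

```python
def cross(width, height):
    """Tap offsets (dx, dy) of a cross-shaped filter support."""
    taps = {(dx, 0) for dx in range(-(width // 2), width // 2 + 1)}
    taps |= {(0, dy) for dy in range(-(height // 2), height // 2 + 1)}
    return taps


def star5x5():
    """Centre tap plus two taps along each of the eight principal directions."""
    dirs = [(1, 0), (0, 1), (1, 1), (1, -1)]
    taps = {(0, 0)}
    for dx, dy in dirs:
        for r in (1, 2):
            taps |= {(r * dx, r * dy), (-r * dx, -r * dy)}
    return taps


def unique_coefficients(taps):
    """Coefficients needed when taps at (dx, dy) and (-dx, -dy) share one."""
    return (len(taps) + 1) // 2


if __name__ == "__main__":
    shapes = {"star5x5": star5x5(), "cross11x5": cross(11, 5),
              "cross9x9": cross(9, 9), "cross11x7": cross(11, 7)}
    for name, taps in shapes.items():
        print(name, len(taps), "taps,", unique_coefficients(taps), "coefficients")
```

Under these assumptions cross11x5 has 15 taps and 8 coefficients, while star5x5, cross9x9 and cross11x7 each have 17 taps and 9 coefficients, consistent with the counts stated in the contributions.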
JCTVC-G648 CE8 Subtest d, Tool 2: ALF filters with 9 coefficients and up to vertical-size
7 [P. Lai, F. C. A. Fernandes (Samsung), H. Guermazi, F. Kossentini,
M.Horowitz (eBrisk)]
This contribution presents an ALF method using two filter shapes, both having 9 coefficients: star-5x5 and
cross-11x7. Compared to the previously adopted proposal JCTVC-F303, one coefficient has been added
to construct cross-11x7, extending its vertical size to 7. The proposed method reports gains of
0.1/0.2/0.3/0.2% BD-rate for AI/RA/LDB/LDP structures as compared to HM4.0, with average
enc / dec time increases of 2% / 1% on a Linux cluster server.
JCTVC-G130 CE8 Subtest d - Chroma ALF with reduced vertical filter size [M. Budagavi,
V. Sze, M. Zhou (TI)]
This contribution presents results for Nx3 chroma ALF filters when integrated into HM 4.0. Nx3 chroma
ALF filters have a vertical size of 3 when compared to filters in HM 4.0 which have vertical size of 5.
The filter set of 7x3 diamond + 11x3 cross with 3x3 center is reported to have following BD-Rate results
(Y, U, V): AI-HE: 0.0%, 0.4%, 0.5%; RA-HE: 0.0%, 0.8%, 0.5%; LB-HE: 0.0%, 0.7%, 0.5%; LP-HE:
0.0%, 0.7%, 0.4%. The worst case number of multiplications for this filter set is reported to be the same
Subtest E
Subtest F
Subtest G
JCTVC-G235 CE8.h: CU-based ALF with non-local means filter [M. Matsumura, S.
Takamura, H. Jozawa (NTT)]
This contribution reports the performance of a technique that utilizes a denoising filter as the in-loop filter
of the HM codec. In the proposed method, a denoising filter called non-local means filter is integrated into the
CU-based adaptive loop filter of HM4.0.
Compared to the anchor of HM4.0, the average BD-rate gains were –0.1, –0.2, –0.3, and –0.5% for Intra,
Random Access, Low Delay B, and Low Delay P, respectively. The average decoding time increased 2 to
5%. The maximum gain was –1.4% in Low Delay P for the sequence “BasketballDrive”.
JCTVC-G299 CE8 Subtest h: Cross verification of NTT’s CU-based ALF with NLM filter
(JCTVC-G235) by Intel [Y. Chiu, L. Xu (Intel)]
JCTVC-G482 CE8.h.1: Crosscheck for NTT's CU-based ALF with non-local means filter
in JCTVC-G235 [C.-Y. Tsai, Y.-W. Huang (MediaTek)]
4.9.1 Summary
4.9.2 Contributions
JCTVC-G084 CE9: Test results on SP01, SP02, SP03 and SP04 [M. Zhou (TI)]
This document reports CE9 test results on SP01, SP02, SP03 and SP04 which are related to temporal
MVP. Test results reveal that disabling the centre TMVP position from both the merge/skip and AMVP
MVP list derivation process leads to a loss of 0.1% in all configurations (SP01); removing TMVP from
JCTVC-G706 CE9: Cross-check report for TI's JCTVC-F083 by Motorola Mobility [Y.
Yu, K. Panusopone, L. Wang (Motorola Mobility)]
JCTVC-G689 CE09: Crosscheck report of JCTVC-G084 test SP04 [Y. Zheng, X. Wang
(Qualcomm)]
JCTVC-G085 CE9: Test results on parallelized merge/skip mode [M. Zhou (TI)]
This document reports CE9 test results on parallel merge/skip mode. The current HEVC merge/skip mode
design is highly sequential and introduces dependency among neighboring PUs, which can lead to
significant quality loss if motion estimation (ME) is performed in parallel for throughput or
implementation cost reasons. For typical parallel ME level of 32x32, the measured average loss is 5.0% in
RA-HE, 5.3% in RA-LC, 6.7% in LB-HE and 7.8% in LB-LC. The loss is caused by the fact that the
merge/skip mode cannot be tested for those PUs inside the 32x32 block whose neighboring motion data
are still unavailable during parallel processing. It is proposed to add a high-level syntax
element to signal the parallel level of merge/skip mode, divide an LCU into parallel motion estimation
regions (MERs) and allow only those neighboring PUs which belong to different MERs from the current
PU to be included in the merge/skip MVP list construction process. Simulation results reveal that an
average gain of 3.4% in RA-HE, 3.5% in RA-LC, 4.4% in LB-HE and 5.0% in LB-LC can be achieved
for 32x32 block level parallel ME when compared to the current HM4.0 design, and for parallel level
16x16 that is used today, the average gain is 1.9% in RA-HE, 1.8% in RA-LC, 2.6% in LB-HE and 2.5%
in LB-LC. The proposed design is backward compatible to the current design but offers flexibility for
high throughput and high quality encoder designs.
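The MER restriction described above can be illustrated with a minimal sketch. It assumes square MERs aligned to a grid of the signaled size and PU positions given as luma sample coordinates; the function names and the coordinate convention are hypothetical and are not taken from the contribution.

```python
def same_mer(cur_xy, nbr_xy, mer_size):
    """Return True if both sample positions fall inside the same MER.

    Assumption: MERs are square, mer_size x mer_size, aligned to the picture grid.
    """
    (cx, cy), (nx, ny) = cur_xy, nbr_xy
    return cx // mer_size == nx // mer_size and cy // mer_size == ny // mer_size


def usable_merge_neighbors(cur_xy, neighbors, mer_size):
    """Keep only the neighbors that lie in a different MER than the current PU."""
    return [n for n in neighbors if not same_mer(cur_xy, n, mer_size)]


# Hypothetical current PU position and four neighboring positions.
cur = (40, 40)
nbrs = [(39, 47), (47, 39), (31, 40), (40, 31)]
print(usable_merge_neighbors(cur, nbrs, 32))
print(usable_merge_neighbors(cur, nbrs, 8))
```

With a 32x32 MER, the two neighbors inside the same region are dropped from the candidate list, whereas with an 8x8 MER all four remain usable.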
JCTVC-G702 CE9: Simplification of MVP Design for HEVC [Y. Yu, K. Panusopone, L.
Wang (Motorola Mobility)]
This document reports the results of Motorola Mobility’s simplification of MVP design for HEVC. These
are the AMVP_SEL03 and AMVP_SEL04 tests specified in CE9. Simulation results show that there is
no loss for low delay and 0.2% loss for random access conditions compared to the original AMVP, while the
complexity of the proposed method is reduced by half as compared to the MVP selection procedure of
AMVP.
4.10.1 Summary
4.10.2 Contributions
JCTVC-G266 CE10: Lossless Core Transforms for HEVC [W. Dai, M. Krishnan, J.
Topiwala, P. Topiwala (FastVDO), E. Alshina (Samsung)]
JCTVC-G737 CE10: Full Factorization Core Transforms for HEVC [E. Alshina, A. Alshin,
W. Lee, J. Park, K. Pachauri (Samsung), P. Topiwala (FastVDO)]
JCTVC-G863 CE10: Crosscheck of FastVideo/Samsung core transforms for high and low
QP range [Rajan Joshi] [late]
JCTVC-G495 CE10: Core transform design for HEVC [A. Fuldseth, G. Bjøntegaard
(Cisco), M. Budagavi (TI)]
JCTVC-G953 CE10: Cross-check of JCTVC-G579 core transform - low and high QP
range [M. Budagavi (TI)] [late]
JCTVC-G819 CE10: Cross check for core transform proposed by Qualcomm by Samsung
[E. Alshina, J.H. Park] [late]
JCTVC-G887 CE 10: hardware test of inverse transform proposals [Sumit Johar, Daniele
Alfonso (STM)] [late]
4.11.1 Summary
4.11.2 Contributions
JCTVC-G121 CE11: Reduction in contexts used for coefficient level [V. Sze (TI)]
JCTVC-G327 CE11: Cross-check of TI’s reduction in contexts used for coefficient level
(JCTVC-G121) [J. Sole (Qualcomm)]
JCTVC-G269 CE11 Report on Prediction Unit Dependent Coefficient Scanning For Inter
Frame [J. Song, X. Zheng, H. Yang, H. Yu (Huawei)]
JCTVC-G284 CE11: Extended Mode Dependent Coefficient Scanning [X. Zhao, X. Guo, S.
Lei (MediaTek), S. Ma, W. Gao (PKU)]
JCTVC-G321 CE11: Removal of the parsing dependency of residual coding on intra mode
[J. Sole, Y. Zheng, W.-J. Chien, R. Joshi, X. Wang, M. Karczewicz
(Qualcomm)]
JCTVC-G679 CE11: Extending horizontal and vertical scan to big block for CAVLC [M.
Karczewicz, Y. Zheng, L. Guo, X. Wang (Qualcomm)]
JCTVC-G703 CE11: Adaptive Scan for Large Blocks for HEVC [Y. Yu, K. Panusopone, J.
Lou, L. Wang (Motorola Mobility)]
JCTVC-G077 CE11: Cross-check of CE.B1 - Scans for large blocks in CAVLC [C. Yeo
(I2R)]
JCTVC-G975 CE11: Crosscheck - Adaptive Scan for Large Blocks for HEVC (G703) [T.
Nguyen (Fraunhofer HHI)] [late]
4.12.1 Summary
4.12.2 Contributions
Subtest 1
JCTVC-G286 CE12 Subtest 1: Chroma Deblocking Filter [Q. Huang, J. An, X. Guo, S. Lei
(MediaTek), A. Norkin, K. Andersson, R. Sjöberg (Ericsson)]
This contribution presents experimental results for the chroma deblocking filter in CE12 Subtest 2.
Specifically, a chroma deblocking filter with an independent filtering decision and an 8x8 filtering unit is tested and
proposed. It is reported that an average BD-rate reduction of 0.6% can be achieved for chroma. The run
time is reported to be similar to HM4.0. It is also reported that the subjective quality is almost the same as
that of HM4.0.
JCTVC-G383 CE12 Subtest1: Deblocking of New Non-Square Blocks: Edge Shift for AMP
[G. Van der Auwera, X. Wang, M. Karczewicz (Qualcomm)]
This proposal addresses the adaptation of the deblocking filter in case of AMP partitions of size 16x4 or
4x16. Instead of deblocking the central edge on the 8x8 deblocking grid inside the 16x16 CU of the AMP
type, the relevant internal AMP partition edge is deblocked, which keeps the number of filtering
operations unchanged. The deblocking filter width is adapted to avoid filtering dependencies between
nearby edges. The BD-rates and execution times are very similar to the HM4 anchor.
JCTVC-G485 CE12.1.5: Crosscheck for Qualcomm's AMP deblocking with edge shift in
JCTVC-G383 [T.-D. Chuang, Y.-W. Huang (MediaTek)]
JCTVC-G409 CE12, Subset 1: Report of Deblocking for Large Size Blocks [Z. Shi (USTC),
X. Sun, J. Xu (Microsoft)]
This document presents a deblocking scheme for large size blocks to improve the visual quality of HEVC
decoded videos. For large smooth regions with small variation, an extra smoothing deblocking mode is
introduced to suppress the visually severe blocking artifacts. It is observed that the proposed method can
reduce blocking artifacts in smooth regions, which are usually more visible to human eyes.
JCTVC-G673 CE12 Subtest 1: Crosscheck for Microsoft's Deblocking for Larger Blocks in
JCTVC-G409 [Q. Huang, X. Guo (MediaTek)]
JCTVC-G590 CE12 Subtest 1: Results for modified decisions for deblocking [M.
Narroschke, S. Esenlik, T. Wedi (Panasonic)]
This contribution is part of CE12. It presents the results for modified decisions for deblocking, which are
based on JCTVC-E251 and JCTVC-F191. In HM4.0, a first decision for enabling the deblocking is
performed for edge segments of eight lines. In the case of enabled deblocking, a subsequent second
decision is performed for each individual line by which either a strong or a weak filter is selected. In this
proposal, two modifications are introduced. The first decisions are performed for edge segments of four
lines instead of 8 lines. The second decisions are also performed for edge segments of four lines instead
of for each individual line. At the same quality, the following average bit rate reductions are achieved
relative to HM4.0: I-HE: 0.0%, I-LC: 0.0%, RA-HE: 0.0%, RA-LC: 0.0%, LD(B)-HE: 0.2%, LD(B)-LC:
0.1%, LD(P)-HE: 0.2%, LD(P)-LC: 0.0%. The modifications reduce the number of operations required
for these two decisions by around 20%. In addition, the size of line buffers is reduced. They allow parallel
deblocking of all 8x8 blocks.
JCTVC-G238 CE12 Subtest1: Cross-verification of Panasonic's proposal JCTVC-G590
[M. Ikeda, T. Suzuki (Sony)]
Subtest 2
JCTVC-G585 CE12 Subtest 2: Cross-check results of the parallel deblocking tool 1 of Sony
(JCTVC-G255) [Matthias Narroschke, Semih Esenlik (Panasonic)]
JCTVC-G587 CE12 Subtest 2: Cross-check results of the parallel deblocking tool 2 of Sony
(JCTVC-G256) [Matthias Narroschke, Semih Esenlik (Panasonic)]
Subtest 3
JCTVC-G228 CE12.3.2: Reducing pixel line buffers by modifying DF for horizontal LCU
edges [C.-W. Hsu, Y.-W. Huang, S. Lei (MediaTek)]
This contribution reports results of CE12.3.2, which is based on JCTVC-F053 method 2, to modify the
deblocking filter (DF) only for horizontal LCU boundaries and to keep the DF unchanged for the remaining edges.
Pixels above the first row of the upper side of the horizontal LCU boundary are not used in filtering
decisions. Moreover, filtering operations are also modified without changing pixels above the first row of
the upper side of the horizontal LCU boundary. In comparison with HM-4.0, the proposed method can
remove all pixel line buffers dedicated for DF and reportedly causes 0.0-0.2% coding efficiency loss with
roughly unchanged run time and similar visual quality in most cases.
JCTVC-G229 CE12.3.3: Reducing motion data line buffers [T.-D. Chuang, C.-Y. Chen, Y.-
W. Huang, S. Lei (MediaTek)]
This contribution reports results of CE12.3.3, which is based on the motion data compression method in
JCTVC-F060. Simulation results reportedly show that the proposed method can reduce motion data line
buffer size by 50% with the same coding efficiency, encoding time, and decoding time in comparison
with HM-4.0. No undesirable visual artifact is observed due to the modified motion data for calculating
boundary strengths in deblocking filter.
JCTVC-G292 CE12 Subtest3: Cross Check of Mediatek’s Motion Data Line Buffer
Reduction Proposal JCTVC-G229 [G. Van der Auwera (Qualcomm)]
JCTVC-G257 CE12 Subtest3: Deblocking vertical tap reduction for line buffer based on
JCTVC-F215 [M. Ikeda, T. Suzuki (Sony)]
This contribution proposes to reduce the line buffers required in the deblocking filter, based on JCTVC-F215.
Many line buffers are required by the deblocking filter, SAO (Sample Adaptive Offset) and ALF (Adaptive
Loop Filter) in HM-4.0. In particular, the deblocking filter is included in both the high efficiency and low
complexity configurations and is expected to be used in many cases, so reducing the line buffers of the
deblocking filter alone is considered significant. Sony proposes to remove one line buffer while maintaining
BD-rate and subjective quality by reading one fewer pixel on the upper side in vertical filtering. The experimental
results show a 0.0-0.1 increase in BD-rate for luma and similar run time, and the subjective quality is
similar to HM-4.0.
JCTVC-G486 CE12.3.1: Crosscheck for Sony's deblocking vertical tap reduction for line
buffer in JCTVC-G257 [C.-W. Hsu, Y.-W. Huang (MediaTek)] [late]
Subtest 4
JCTVC-G087 CE12 subset 4.10: Test results on unification of luma and chroma filtering
[M. Zhou, O. Sezer, V. Sze (TI)]
This contribution reports test results on CE12 subset 4.10 “unification of luma and chroma filtering”. In
the proposed algorithm, the unification of luma and chroma filtering is achieved by increasing the filter
coefficient precision of the chroma filter by 2 bits and restoring the HM3.0 delta calculation for the luma weak
filter. Simulation results reportedly show that the proposed unification improved the coding efficiency in luma by
0.2% in AI-HE and AI-LC, and 0.1% in RA-HE, RA-LC, LB-HE and LB-LC, with up to 0.3% gain for the
chroma components. However, unified luma weak and chroma filtering led to visual quality loss in the
vidyo3 sequence (LB-HE, QP=37).
Subtest 5
JCTVC-G088 CE12 subset 5.6: Test results and architectural study on de-blocking filter
without parallel on/off filter decision [M. Zhou, O. Sezer, V. Sze (TI)]
This contribution reports test results on CE12 subset 5.6 “removal of parallel on/off filter decision”.
Architectural study shows that the parallel on/off decision actually restricts architecture choices, increases
implementation costs in terms of memory reads and buffer size without intended throughput benefits. It is
recommended to restore the AVC fashion of on/off filter decision, that is, to use the unfiltered samples
for the on/off filter decision of vertical edges, and the intermediate filtered samples (i.e. filtered samples
after vertical edge filtering) for the on/off decision of horizontal edges. Test results reveal that this change
leads to 0.0% BD-rate difference, and subjective viewing verifies that there is no visual difference for all
the CE12 selected subjective testing sequences when compared to the HM4.0 anchor.
Subtest 6(?)
JCTVC-G174 CE12: Deblocking filter parameter adjustment in slice level [T. Yamakage,
S. Asaka, T. Chujoh (Toshiba), M. Karczewicz, I.S. Chong (Qualcomm)]
Appropriate parameters for the deblocking filter to improve coding efficiency for CE12 are presented.
Offsets to QP for deriving beta and tc are introduced in the slice-level syntax in order to adjust subjective and/or
objective picture quality. The purpose of this contribution is to provide a placeholder for adjusting the picture
quality.
When Qp offsets to derive beta and tc offsets are -2 and -5, BD-rate reduction is 0.0% (HE) and 0.4% loss
(LC) on average under the common test conditions, with maximum BD-rate reduction of 0.7% (HE) and
0.3% (LC). When higher Qp (32, 37, 42 and 47) is used, BD-rate reduction is 0.4% loss (HE) and 0.6%
loss (LC) on average, with maximum BD-rate reduction of 0.6% (HE) and 0.3% (LC).
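As an illustration of the slice-level offsets described above, the sketch below shows how an offset could shift the QP-dependent index used to look up beta or tc. The index range (0 to 51), the clipping and the function names are assumptions made for illustration only; the actual derivation and table contents are those of the contribution and the working draft, and are not reproduced here.

```python
def clip3(lo, hi, value):
    """Clamp value to the inclusive range [lo, hi]."""
    return max(lo, min(hi, value))


def threshold_table_index(qp, slice_offset, max_index=51):
    """Index into a beta or tc lookup table after applying a slice-level QP offset.

    max_index=51 is an assumed table size, not a value stated in the report.
    """
    return clip3(0, max_index, qp + slice_offset)


# Offsets of -2 (beta) and -5 (tc), as in the reported experiment, at QP 37.
print(threshold_table_index(37, -2), threshold_table_index(37, -5))
```

A negative offset selects table entries belonging to a lower QP, which would be expected to yield smaller thresholds and hence weaker filtering.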
JCTVC-G465 CE12: crosscheck of deblocking filter parameter adjustment in slice level [K.
Sugimoto, A. Minezawa, S. Sekiguchi (Mitsubishi)]
JCTVC-G086 CE12 subset 7.4: Test results on decreasing worst case complexity of
deblocking filter [M. Zhou, O. Sezer, V. Sze (TI)]
This contribution reports test results on CE12 subset 7.4 “decreasing worst case complexity of deblocking
filter”. By removing the motion vectors from the boundary strength (BS) calculation of the deblocking
filter, the worst case number of operations, and memory access are reduced from (8, 10) to (1, 4), and
from (39, 20) to (7, 4) for the BS calculation of a de-blocking edge in P-frame and B-frame, respectively.
Subjective tests at TI observed visual quality improvement in BQMall + (random access-high efficiency)
+ QP 37 (ringing artifact around diagonal edge has been reduced), and no subjective difference in other
CE12 selected sequences when compared to the HM4.0 anchor. The proposed simplification leads to an
average BD-rate increase of 0.2% in RA-HE, RA-LC, LB-HE and LB-LC configuration.
JCTVC-G1041 reports on informal subjective testing. The results indicate that no subjective quality
difference can be claimed between any of the proposals.
4.13.1 Summary
4.13.2 Contributions
JCTVC-G231 CE13: Results of section 3.1 tests 1, 3d, and 3e on replacing redundant
MVPs and its combination with adaptive MVP list size [J.-L. Lin, Y.-W.
Chen, Y.-W. Huang, S. Lei (MediaTek)]
This contribution reports the results of CE13 section 3.1 tests 1, 3d, and 3e, which are based on JCTVC-
F052. In test 1, redundant or empty MVPs are replaced by truncating the first available MVP to integer
precision or by adding a constant value to the first available MVP. In test 3d, test 1 is combined with
adaptive MVP list size by neighboring Merge indices. In test 3e, test 1 is combined with adaptive MVP
list size by current CU size. In comparison with HM-4.0 under JCTVC-F900 common test conditions, it is
reported that test 1 together with a bug-fix achieves 0.2-0.5% coding efficiency gain with 100-103%
encoding time and 98-100% decoding time, test 3d together with a bug-fix achieves 0.0-0.4% coding
efficiency gain with 98-103% encoding time and 99-102% decoding time, and test 3e together with a bug-
fix achieves 0.1-0.5% coding efficiency gain with 100-102% encoding time and 100-101% decoding
time, where the bug-fix alone achieves 0.1% coding efficiency gain and no run time difference.
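The replacement idea of test 1 can be sketched as follows. This is only an illustration under stated assumptions: motion vectors are taken to be in quarter-sample units, the list size of 2 and the constant offset of 4 are hypothetical values, and the order in which truncation and offsetting are tried is a choice made here rather than the procedure of JCTVC-F052.

```python
def truncate_to_integer(mv):
    """Drop the fractional part of a quarter-sample motion vector.

    Assumption: two fractional bits; the shift rounds toward negative infinity.
    """
    return tuple((c >> 2) << 2 for c in mv)


def build_mvp_list(candidates, list_size=2, offset=4):
    """Build an MVP list in which redundant or empty entries are replaced.

    candidates: (mvx, mvy) tuples, or None for an unavailable predictor.
    """
    mvps = []
    for mv in candidates:
        if mv is not None and mv not in mvps:  # skip empty and redundant MVPs
            mvps.append(mv)
    mvps = mvps[:list_size]
    if not mvps:
        return mvps  # no MVP available to derive a replacement from

    replacement = truncate_to_integer(mvps[0])
    while len(mvps) < list_size:
        if replacement in mvps:
            # Still redundant: add a constant value instead.
            replacement = (replacement[0] + offset, replacement[1] + offset)
        else:
            mvps.append(replacement)
    return mvps


print(build_mvp_list([(5, -3), (5, -3), None]))
print(build_mvp_list([(4, 8), None]))
```

In the first call the duplicate is replaced by the truncated vector; in the second the first MVP already has integer precision, so the constant offset is used.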
JCTVC-G236 CE13: Cross-check of Mediatek results section 3.1 [G. Laroche, P. Onno
(Canon)] [late]
JCTVC-G240 CE13: Experiment regarding section 3.5 [G. Laroche, T. Poirier, P. Onno
(Canon)]
This contribution reports the results of experiments for section 3.5 of CE13 as described in JCTVC-F913.
Six experiments have been proposed in the field of parsing robustness for both AMVP and Merge modes.
The first four proposed experiments replace some of the additional candidates of the current HM4.0
motion vector derivation by non-redundant candidates as proposed in JCTVC-F474. These four
experiments correspond to different compromises between coding efficiency and complexity
reduction in terms of the number of operations for the Merge mode MVP derivation process compared to
HM4.0. The two other experiments deal with the modification of the motion vector predictor index parsing
for AMVP. The best experiment in terms of coding efficiency shows an average gain of 0.3% over four Inter coding
configurations compared to the HM4.0 anchors, without any increase in the number of
operations. Moreover, some configurations reduce the worst-case complexity, in terms of number of
predictors and number of comparisons, by a factor of 3 with a BD-rate gain of 0.1%.
JCTVC-G424 CE13: Cross-check report of JCTVC-G240 (section 3.5 test 1 and test 4) [S.-
C. Lim, J. Lee, H. Y. Kim (ETRI)]
JCTVC-G489 CE13: Crosscheck for Canon's section 3.5 tests 5 and 6 on additional MVP
candidates in JCTVC-G240 [J.-L. Lin, Y.-W. Huang (MediaTek)]
JCTVC-G776 CE13: Merge candidates list construction [T. Lee, J. Chen, J. H. Park
(Samsung)]
This document reports the results of the “merge candidates list construction” method proposed in document
JCTVC-F402 within the context of CE13. Three tests were done in this contribution: a) encoder
modification of HM4.0; b) optimal merge list size derived at the encoder side and signaled in the slice header; c)
insertion of additional merge candidates. Experiments show that the encoder-side fix achieves an average
-0.11% BD-rate saving for six inter configurations without enc/dec time change. Merge list size signaling
in the slice header shows an average 0.08% BD-rate loss with 96.9% encoding time. Inserting additional merge
candidates shows an average -0.17% BD-rate saving for six inter configurations, based on the encoder fix version,
with 102.8% encoding time.
JCTVC-G539 CE13: Cross-check report of subset 3.3 (test2a, 3a, 4a) by Panasonic [T.
Sugio, T. Nishi (Panasonic)]
JCTVC-G540 CE13: Cross-check report of subset 3.2 series by Panasonic [T. Sugio, T.
Nishi (Panasonic)]
Revisit: Side activity regarding CE9/CE13 decisions still going on about testing the combination of
adoptions. Will be reported on Monday morning.
JCTVC-G111 Common test conditions to specify 8-bit internal bit depth for all 8-bit source
material [T. Hellman, Y. Yu, W. Wan (Broadcom)]
This proposal recommends changing the common test conditions to specify an internal bit depth (IBD) of
8 bits for all 8-bit source material. It presents results that show a coding loss from this change relative to
the current 10-bit IBD configuration, but claims that the cost of 10-bit IBD cannot be justified. A cost
increase of 20-23% is reported for a sample hardware codec implementation as well as additional memory
and bandwidth costs. It also recommends keeping 10-bit IBD for 10-bit source material, in preparation for
a future 10-bit encoding profile.
Loss of 8 bit relative to 10 bit for HE is bigger in chroma (10% for RA & LB, including Class F) than
luma (2.2% for RA & LB), and is focused in particular sequences (removing one sequence drops the
chroma average gain to about 6%).
Class F is included in the results
Excluding class F, there is more gain, as there is essentially no observed gain on class F.
It is mentioned in the discussion that 10 vs. 8 bit most likely would be a profiling issue. Several experts
anticipate that an 8-bit 4:2:0 profile with high coding efficiency will be defined
(confirmed by Broadcom, Docomo, Cisco and TI).
Would a 10-bit (or higher) profile be in the first version of the standard? (several experts say that this
could be deferred to a later version, at least not at high priority)
For common test conditions (that will need to be re-defined as there is only one entropy coder now):
Only 8-bit test settings in common conditions
The capability of higher bit depth should be retained in the software and the spec
Keep a mode to test with higher bit depth to verify that any tools are in principle extensible (10?
12?)
Also include parameter settings for tool combinations that are not currently checked in common
conditions
The latter two points could be done for a largely reduced test set (as it is not about compression
performance but rather sanity check)
Suggestion to define one test point with ALF on and one off. Some experts argue that SAO should also be
on/off at these points; this was agreed. It would also be beneficial to test interpolation filters with SAO off.
Decision (SW): It was agreed that we need a high coding efficiency profile that has only 8 bit decoding
capability. And we should have a set of common test conditions that corresponds to that.
It was commented that we may not define a profile with greater than 8 bit capability in the
first version of the standard (e.g. so that we could later define a single profile that covers both greater than
8 bit capability and higher-resolution chroma formats).
It was commented that the current 10 bit sequences are rather noisy, so if we don't have 10 bit encoding in
the common conditions, we may no longer need those sequences.
We definitely want to retain higher bit depth capability in the design and software. So we should still
include higher bit depth capability in the common conditions.
This was agreed.
It was suggested to develop a supplemental set of tests for other aspects as well that differ from the main
common conditions (e.g. in QP and CU size as well as bit depth). This idea was supported.
It was suggested that perhaps our LC common conditions should no longer include SAO.
JCTVC-G136 Suggestion on picture quality hierarchy for Low Delay configurations [S.
Liu, X. Zhang, S. Lei (MediaTek)]
This contribution proposes a modification to the picture quality hierarchy for Low Delay settings in the
current HM. It is proposed to replace the current multi-level hierarchical picture quality structure by a
two-level scheme. Average 0.2-0.3% BD-rate reduction is reported for Luma and average 1.8% BD-rate
reduction is reported for Chroma. No impact on encoding or decoding time is reported.
Would this affect the visual quality (as quality fluctuations are at lower frame rate)?
Gain is relatively small. Subjective characteristics were discussed.
Changes may tend to make historical comparisons more tricky – gain seems not so large as to justify a
change at this time.
JCTVC-G150 Proposed Error Pattern Files for JCT-VC [S. Wenger (Vidyo)]
Adopted as the current preferred method for testing robustness characteristics and proposals.
JCTVC-G855 Performance evaluation of full search mode decision for Intra of HM4.0 [C.
Lai, Y. Lin, L. Liu, J. Zheng (HiSilicon)] [late]
Late information document (not presented in detail, no action expected) – available for study.
JCTVC-G732 Study on test materials in common test condition [T. Suzuki (Sony)] [late]
TBA
In JCTVC-E011, a problem with the class E test materials in the common test conditions was reported. The
contribution investigates why this happened and proposes to consider replacing the class E
test materials.
Discusses problems detected on class E sequences. It was reported to seem likely that the material was
produced by interlaced camera with compression & de-interlacing.
It was reported that the pixel values change from frame to frame, even in still areas. This phenomenon
could affect the evaluation of video coding tools. For example, tools that exploit this line-by-line
change could improve coding efficiency in the current design.
In discussion, it was indicated by the sequence contributor that it was captured by a Sony camera that is
switchable between 1080i and 720p.
The company (Vidyo) that contributed the Class E sequences indicated that they should be able to provide
new material with similar scene content. The group indicated that such a contribution is requested and
would be appreciated. This can be collected and made available in an AHG activity in advance of the next
meeting (chair T. Suzuki).
JCTVC-G078 Information for HEVC scalability extension [J. Boyce, D. Hong, W. Jang, A.
Abbas (Vidyo)]
For information to JCT-VC. Out of scope of current phase of work.
JCTVC-G248 Low Complexity scalable extension of HEVC intra pictures [S. Lasserre, F.
Le Léannec, E. Nassor (Canon)]
This contribution presents a new approach for scalable extension of HEVC INTRA pictures. This scalable
INTRA codec design targets coding efficiency together with very low complexity. Spatial random access
and a high degree of parallelism are two additional targeted features. The proposed scalable INTRA
codec employs only one coding mode, which is inter-layer intra prediction, which provides low
complexity. Coding efficiency is obtained through statistical modeling of the DCT channels to encode and
rate-distortion optimal quantizers that are pre-computed off-line, coupled with a distortion allocation process
between DCT channels. Overall, bit rate increase of 12.8% is obtained relative to HEVC single layer
coding on tested sequences in dyadic spatial scalability mode. Finally, non-contextual, non-adaptive
entropy coding provides the spatial random access feature.
Was presented Tue. 29th afternoon in track A.
For information to JCT-VC. Out of scope of current phase of work.
JCTVC-G949 Draft requirements for the scalable enhancement of HEVC [A. Luthra]
[late]
For information to JCT-VC. Out of scope of current phase of work.
JCTVC-G950 Draft use cases for the scalable enhancement of HEVC [A. Luthra] [late]
For information to JCT-VC. Out of scope of current phase of work.
JCTVC-G951 Draft Call for Proposals on the Scalable Video Coding Extensions of HEVC
[A. Luthra (Motorola)] [late]
For information to JCT-VC. Out of scope of current phase of work.
5.4.2 Stereo/Multi-view
5.4.3 Interlace
JCTVC-G170 On issues for interlaced format support in HEVC standard [K. Chono, H.
Aoki (NEC)]
JCTVC-G296 Picture-adaptive Field/Frame Coding: support for legacy video [O. Bar-Nir
(Harmonic)]
JCTVC-G450 High level syntax to support interlace format [K. Sugimoto, A. Minezawa, S.
Sekiguchi (Mitsubishi)]
JCTVC-G667 HEVC field coded sequences vs. deinterlaced progressive coding [C. Fogg]
[late] [miss]
JCTVC-G877 Interlaced and 4:2:2 color format support in HEVC standard [J. Vieron]
[late]
General
Discussion:
Will interlaced displays still exist in the future?
Interlaced content still exists and will continue to exist (interlaced cameras)
Could this be de-interlaced?
If done, it should be simple
De-Interlacing at encoder end would double necessary throughput
Can this be solved with an SEI message like approach? SEI would not allow frame/field
adaptivity, as it is not possible to have two different picture sizes in one sequence.
Anything that involves a mode decision is undesirable. Picture adaptive frame/field is
undesirable.
Would it be useful to invoke an interlace SEI message per profile/level? Some experts think this
is desirable. (H.263 has such an option for another SEI message)
Work on specifying candidate text for an SEI message on field coding in an AHG.
Noting that there are basically no interlaced displays anymore (and the availability of source deinterlacing
technology), it was asked why we would bother with this.
Legacy content and legacy camera equipment were cited as a rationale, and it was noted that new cameras
are still being manufactured that generate new such content. It was suggested that the fact that prior
standards include interlace-oriented features might create a need for persistence of support of prior
technologies in addition to support of HEVC.
Regarding deinterlacing at the source, a doubling of codec throughput requirements was suggested to be
unacceptable.
As a reference example, H.263 Annex W (subclause W.6.3.11) was mentioned.
It seems more than likely that if we do not do something, at least as an SEI message, others will do so in a
less interoperable fashion. The associated indicator (as shown by H.263 Annex W) seems simple to
specify.
Suggestion: "Have an enable flag at the VUI level with a top/bottom flag SEI on each picture".
Is it possible to profile an SEI message? It was noted that H.263 Annex X has such a thing.
Plan: Establish AHG with mandate to develop candidate text for an SEI message signaling an interlaced
format indicator. Suggestion quoted above as a starting point for that work.
1. Parallel deblocking
JCTVC-G171 Parallel deblocking filter [J. Yu, S. Yang, J. Byun, Y. Kim, J. Kim (Yonsei
Univ.)]
A parallel deblocking filter for HEVC is proposed. The proposed technique includes the parallelization of
the filtering process and decisions. Our proposed algorithm is based on Panasonic’s parallel deblocking
filter decisions, which were first presented in JCTVC-D214. For the deblocking of a current coding unit,
all required decisions and filtering are performed on the basis of the unfiltered pixels of the current coding
unit. This method can eliminate the dependencies present in the current coding unit and between
neighboring coding units. Therefore, the proposed deblocking filter can be implemented in parallel
The proposals presented in this section do not modify HM4.0 behavior but modify the working draft text.
JCTVC-G175 BS decision tree simplification [S. Park, N. Park, B. Jeon (LGE), X. Guo, J.
An, C. Hsu, Y. Huang, S. Lei (MediaTek)]
This contribution reports the simplification of BS (boundary strength) decision tree in the deblocking
filter. The proposed simplification removes redundant BS values with related conditions. The number of
BS values is changed from 5 to 3 and this modification provides the same BD rate and visual quality as
HM4.0.
The check for the CU boundary (Bs = 4) is removed. The proposed BS values are 0, 1, 2.
Recommendation: Makes sense, is related to contributions G620 part 1 and G638, parts 1 and 2.
Subjective test: not needed, identical to HM4.0
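A three-valued Bs decision of the kind described above might look like the following sketch. The report only states that the values are reduced to 0, 1 and 2 and that the CU boundary check is removed; the specific conditions used below (intra on either side, coded coefficients, motion difference) are assumptions modelled on common deblocking designs, and the parameter names are hypothetical.

```python
def boundary_strength(p_is_intra, q_is_intra, has_coded_coeffs, motion_differs):
    """Return a boundary strength in {0, 1, 2} for the edge between blocks P and Q.

    Assumed conditions (not taken from JCTVC-G175):
      2: at least one side is intra coded (no separate CU-boundary check)
      1: coded coefficients are present, or the motion of P and Q differs
      0: otherwise, the edge is not filtered
    """
    if p_is_intra or q_is_intra:
        return 2
    if has_coded_coeffs or motion_differs:
        return 1
    return 0


print(boundary_strength(True, False, False, False))
print(boundary_strength(False, False, False, True))
print(boundary_strength(False, False, False, False))
```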
Part 2: Check for the CU boundary (Bs = 4) is removed. Decreasing the number of Bs values to 0, 1, 2.
This part of the proposal is identical to G175.
JCTVC-G1035 BoG report on resolving deblocking filter description [A. Norkin et al.]
Removing 4x4 block boundary OK as it is not likely that any proposal would be adopted that uses them
Signalling of de-block parameters (beta, tc_offset, on/off) both in APS and slice header – slice can inherit
from APS or use own params
Check with APS experts whether the design of the APS params is appropriate
Cleanup of source code: Plan exists, but can only be done after meeting
Decision: Adopt (subject to checking of WD text and software by editor/coordinator)
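The inheritance between APS and slice header noted above can be pictured with a small sketch. The field names and the use of an optional slice-level override are hypothetical and only illustrate the "inherit from APS or use own params" behavior; they do not reproduce the adopted syntax.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DeblockParams:
    """Deblocking controls (hypothetical names): on/off, beta offset, tc offset."""
    disable_deblocking: bool = False
    beta_offset: int = 0
    tc_offset: int = 0


def effective_params(aps: DeblockParams,
                     slice_override: Optional[DeblockParams]) -> DeblockParams:
    """Use the slice's own parameters if present, otherwise inherit from the APS."""
    return aps if slice_override is None else slice_override


aps = DeblockParams(beta_offset=2, tc_offset=-1)
print(effective_params(aps, None))
print(effective_params(aps, DeblockParams(disable_deblocking=True)))
```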
3. Modifications to Bs calculation process
Currently in HM4.0, the Bs decision process for an 8-pel edge consists of finding the Bs for each 4-pel
edge and then taking the maximum of the two 4-pel Bs values, because the minimum units of Bs decision and
filtering differ.
The proposal suggests using the Bs of the first 4-pel part as the Bs for the whole 8-pel edge, to reduce the
number of Bs decision operations and remove the Bs merging process. The
average BD-rate change is 0.0% for all configurations.
Results: 0.0/0.0, 0.0/0.0, 0.0/0.0 for HE-AI/LC-AI, HE-RA / LC-RA, HE-LD / LC-LD.
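The difference between the two derivations can be shown in a few lines. The function names are hypothetical; the sketch only contrasts taking the maximum of the two 4-pel Bs values with reusing the first one.

```python
def bs_8pel_hm4(bs_first_part, bs_second_part):
    """HM4.0 behavior as described: merge the two 4-pel Bs values by taking the maximum."""
    return max(bs_first_part, bs_second_part)


def bs_8pel_first_part(bs_first_part, bs_second_part=None):
    """Proposed behavior: the Bs of the first 4-pel part is used for the whole edge,
    so the second part does not need to be evaluated at all."""
    return bs_first_part


# The two rules differ only when the second 4-pel part has the larger Bs.
print(bs_8pel_hm4(0, 2), bs_8pel_first_part(0, 2))
print(bs_8pel_hm4(2, 1), bs_8pel_first_part(2, 1))
```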
One participant supports the proposal.
Recommendation: CE
Part 2. It is proposed to modify strong/weak filter decision by performing the decision for the 4-pel edge
based on one line. The number of strong/weak filter decisions for an edge segment is reduced from eight
to two.
It was claimed by one participant that complexity in hardware does not decrease compared with testing two
lines as in G590. The complexity in software probably decreases.
Results: 0.0/0.0, 0.1/0.0, 0.1/0.1 for HE-AI / LC-AI, HE-RA / LC-RA, HE-LD / LC-LD.
JCTVC-G090 Non-CE12: Testing results on using HM3.0 delta calculation for luma weak
filter [M. Zhou, O.Sezer, V. Sze (TI)]
This contribution proposes to modify the delta calculation for the luma weak filter so that it has higher precision
than the one in HM4.0. Simulation results reportedly show that the proposed modification improves
coding efficiency by 0.2% in AI-HE and AI-LC, and 0.1% in RA-HE, RA-LC, LB-HE and LB-LC. The
proposed change reportedly led to a visible quality gain in the BQMall (RA-HE, QP=37) sequence and no visual
difference in the other CE12 selected subjective testing sequences.
Results: -0.2/-0.2, -0.1/-0.1, -0.1/-0.1 for HE-AI / LC-AI, HE-RA / LC-RA, HE-LD / LC-LD.
The proponent does not think it is necessary to participate in subjective viewing (since the proposal is
similar to G290). No action? Only one of 290 or 090 in CE
JCTVC-G640 Deblocking bug fix for CU-Varying QP’s and IPCM blocks [A. Kotra, M.
Narroschke, T. Wedi (Panasonic)]
The contribution presents modifications to deblocking operations when CU-based multi-QP optimization
is enabled (Part 1) and for IPCM blocks (Part 2).
Part 1:
Five modifications are proposed related to varying QP deblocking.
Individual tc and β thresholds are derived for the blocks corresponding to an edge on which weak filtering
is performed.
Separate tc values are used in deriving the delta offsets for innermost samples in weak filtering.
Separate β values are used in the decision for filtering of outermost samples in weak filtering.
It is proposed to use an average of the QPs for deriving tc and β values in the filter decision process and
strong filtering.
Experimental results are provided for a QP adaptation control that randomly assigns QP to coding units and
for CE4 testing conditions.
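As a small illustration of the averaging step in Part 1, the QP used for the filter decision and strong filtering on an edge between blocks P and Q could be computed as below. The rounding (add one, shift right by one) and the function name are assumptions; the notes only state that an average of the QPs is used.

```python
def average_edge_qp(qp_p: int, qp_q: int) -> int:
    """Rounded average of the QPs on both sides of an edge (rounding is assumed)."""
    return (qp_p + qp_q + 1) >> 1
```

The resulting value would then index whatever tc and β tables the deblocking filter uses; those tables are not reproduced here.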
Part 2:
It is suggested to use a separate QP (QP_PCM), associated with IPCM blocks, whenever deblocking
filtering is desired to be performed over IPCM blocks. The QP_PCM parameter is transmitted in the slice
header.
Recommendation: see below.
JCTVC-G138 Deblocking of IPCM Blocks Containing Reconstructed Samples [G. Van der
Auwera, X. Wang, M. Karczewicz (Qualcomm)]
This contribution proposes a modification to the HM4.0 deblocking loop filter in case of IPCM blocks
containing reconstructed samples. HM4.0 deblocking filter always assigns quantization parameter value
zero to the IPCM blocks, therefore deblocking filtering is disabled for the left and top edges of the IPCM
blocks.
The proposal is to assign a quantization parameter value to the IPCM block, which is predicted from the
neighboring quantization group.
Recommendation: see below.
G138 proposes deriving the QP for IPCM from neighbouring blocks, whereas G640 proposes sending a QP in the slice
header for all IPCM blocks in the slice.
Conclusion IPCM deblocking
- Further study (AHG on loop filter or new one?)
JCTVC-G214 Non-CE8: Constrained ALF coefficients [C.-Y. Chen, C.-Y. Tsai, C.-M. Fu,
Y.-W. Huang, S. Lei (MediaTek)]
In HM-4.0, ALF coefficients are unconstrained, which makes it difficult to decide the bit width of multipliers in
hardware implementations. In this contribution, center coefficients are clipped within [0, 2), non-center
coefficients are constrained within [−1, 1), and offsets are also clipped within [−2^D, 2^D), where D is the
pixel bit depth. It is reported that the constrained coefficient ranges do not cause any coding efficiency
loss or run time change.
Limits the range of the integer part of ALF coefficients. Center: [0, 2), other: [−1, 1), DC: [−2^D, 2^D). To
specify the multiplier's bit range. No need for syntax change. Adds a restriction on encoder/bitstream. No
detailed statistical analysis is available, but more than 99% of coefficients would not be hit by this
restriction. Reduces complexity/runtime
Recommendation from BoG: adoption.
Decision (SW): Adopt
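The range restriction of G214 can be illustrated with an encoder-side clipping sketch. Several points are assumptions and not taken from the contribution: coefficients are held as fixed-point integers with frac_bits fractional bits (8 is chosen arbitrarily), the DC offset is expressed in sample units, the offset bound is read as 2 to the power D, and all names are hypothetical.

```python
from typing import List, Tuple


def clip_half_open(value: int, low: int, high_exclusive: int) -> int:
    """Clip an integer into the half-open range [low, high_exclusive)."""
    return max(low, min(value, high_exclusive - 1))


def constrain_alf_filter(coeffs: List[int], center_index: int, dc_offset: int,
                         frac_bits: int = 8, bit_depth: int = 8) -> Tuple[List[int], int]:
    """Apply the ranges: center in [0, 2), others in [-1, 1), DC in [-2^D, 2^D)."""
    one = 1 << frac_bits  # fixed-point representation of 1.0 (assumed scaling)
    constrained = []
    for i, c in enumerate(coeffs):
        if i == center_index:
            constrained.append(clip_half_open(c, 0, 2 * one))
        else:
            constrained.append(clip_half_open(c, -one, one))
    dc = clip_half_open(dc_offset, -(1 << bit_depth), 1 << bit_depth)
    return constrained, dc
```

With such bounds the integer part of every coefficient needs at most one bit plus sign, which is what allows the multiplier width to be fixed in hardware.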
JCTVC-G215 Non-CE8: Limited number of filters per picture for ALF [C.-Y. Chen, C.-Y.
Tsai, C.-M. Fu, Y.-W. Huang, S. Lei (MediaTek)]
In ALF of HM-4.0, up to 16 filters per picture are used for luma, and 16 filters may be too many for low
resolution applications. In this contribution, a maximum number of filters per picture can be given on the
encoder side, and the ALF encoder will merge regions or classes accordingly. Simulation results
reportedly show that when the maximum number of filters per picture is 10, no coding efficiency loss is
observed. When the maximum number of filters per picture is six, 0.1% bit rate increase is observed in
LDP. Proper values for the maximum number of filters can be discussed when common test conditions
are defined. Allowed values can be decided when profiles and levels are defined.
Two options: Make it switchable by syntax element, or just define the number by profile/level. (It was
discussed elsewhere that there may be dependency of the required number of filters on picture size, but
nobody is sure about that.)
Recommendation of track A: Have the number of filters as a parameter in the encoder conf. file, with a
default of 16.
JCTVC-G216 Non-CE8: Removing the 15th merge flag for BA mode in ALF [C.-Y. Chen,
C.-Y. Tsai, C.-M. Fu, Y.-W. Huang, S. Lei (MediaTek)]
The block-based adaptation (BA) mode of ALF in HM-4.0 classifies blocks into 15 classes. Classes can
be merged, and one filter is used for each class after merging. For class merging, only 14 merge flags are
needed. Therefore, it is proposed to remove the 15th merge flag for BA mode, as a bug-fix.
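The flag count can be checked with a short sketch of class merging: with N classes in a fixed order, one flag per class after the first (N − 1 flags in total) is enough to say whether a class shares the filter of the preceding class. The flag semantics (1 means merge with the previous class) and the function name are assumptions made for illustration.

```python
from typing import List


def filter_index_per_class(merge_flags: List[int], num_classes: int = 15) -> List[int]:
    """Map each class to a filter index from num_classes - 1 merge flags."""
    if len(merge_flags) != num_classes - 1:
        raise ValueError("exactly num_classes - 1 merge flags are expected")
    filter_idx = [0]  # the first class always starts filter 0, so it needs no flag
    for flag in merge_flags:
        previous = filter_idx[-1]
        filter_idx.append(previous if flag else previous + 1)
    return filter_idx
```

For 15 classes the list of flags has 14 entries, so a 15th flag would carry no information.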
JCTVC-G218 Non-CE8: One-stage non-deblocking loop filtering [C.-Y. Chen, C.-Y. Tsai,
C.-M. Fu, Y.-W. Huang, S. Lei (MediaTek), I. S. Chong, M. Karczewicz
(Qualcomm), T. Yamakage, T. Itoh, T. Watanabe, T. Chujoh (Toshiba)]
In this contribution, SAO and ALF are combined into one stage by dividing one picture into filter units
(FUs) and switching FUs among {ALF, SAO, OFF}, where FUs are LCU-aligned blocks coded in raster
scan in APS and the FU-based syntax is also friendly for low latency. It is reported that the proposed
method achieves 0.1%, 0.1%, 0.0%, and 0.0% bit rate reductions for HE-AI, HE-RA, HE-LDB, and HE-
LDP, respectively, with a picture-based 2-pass encoding algorithm. The proponents request to adopt this
proposal in the next CE8 for studying LCU-based syntax and corresponding low latency encoding
algorithms.
Integrates ALF and SAO in a single processing step (also keeps the options SAO only or ALF only).
Number of filters: up to 16 for the experiment. Gain 0.1/0.1/0.0/0.0, chroma loss (2%). Runtime Enc:
108/102/102/103, Dec: 104/105/105/107. Software is not mature, therefore runtime increase is observed.
Recommend further study in CE.
Signal alf_cu_flag at each cu. This eliminates a buffer to store alf_cu_flag (16kBytes for 4K2K). Coding
results were obtained by an encoder with the current frame-based design of ALF coefficients. This
approach is claimed to be friendly with low latency coding.
Coding gain and runtime unchanged.
Comments by experts: The flags can be stored in external memory, where memory bandwidth increase is
negligible. If this proposal claims friendliness with low latency encoding, encoder should use sub-
optimal filter coefficients that were designed at the previous frames. A question was raised whether the
transmission of bitstream can be started right after an LCU is coded, or needs to wait for a slice to be fully
coded. The approach may not be friendly for parallel processing (wavefront, slice boundary filtering).
No consensus, one non-proponent company supports it, but there are several statements against it. Leave
as is, no action.
JCTVC-G499 Improved ALF with low latency and reduced complexity [A. Fuldseth, G.
Bjøntegaard (Cisco)]
The contribution proposes a low complexity ALF technique with support for sub-frame encoder delay.
A single set of ALF filter coefficients is transmitted for each LCU using single-pass estimation. This
supports low encoder-side delay by allowing for estimation and signaling of the ALF coefficients on the
LCU level without aggregating pixel data of the whole frame. The proposed ALF uses either 5x5
diamond shape or 9x3 cross shape and does not require decoder-side variance calculations. The absence
of decoder-side variance calculations represents a significant reduction in decoder complexity. When
applied to low complexity configurations, BD-rate gains between 1.6 % and 4.0% are reported. When
applied to high efficiency configurations, BD-rate results between -0.2% and 0.4% are reported. For high
efficiency configurations, encoding times vary between 67% and 94%, while decoding times vary
between 91% and 94%.
Improvement of CE8 proposal, G498. At each LCU, a filter is designed (or previously designed filter is
used). A flag to indicate control of filtering is signalled. Suitable for single-pass encoders. Proposed to
adopt this in both HE and LC. Two filter shapes (Snowflake as is, cross-shape with 3 pel vertical height).
Up to 16 coefs. stored for a slice (as a kind of dynamic codebook), one of which is selected for a LCU, or
a new one is designed
The software has been uploaded to the JCT-VC site on Nov. 22.
Luma Bitrate: HE: 0.2/0.4/0.0/-0.2 LC: 1.6/2.5/2.3/4.0 (AI/RA/LB/LP)
Runtime: Enc HE: 67/91/94/90 LC: 106/102/101/101 Dec HE: 91/92/94/94 LC: 114/113/111/113
JCTVC-G651 Crosscheck of JCTVC-G499 - Improved ALF with low latency and reduced
complexity [M. Budagavi (TI)] [late]
Remove DC offset of ALF. SAO and ALF DC offset may have overlapping effect. This comes with a
loss of class E (0.4% luma, more for V channel).
Use lower precision to signal DC coefficient (2 bits instead of the current 9 bits).
JCTVC-G446 Reduction of the number of pixels used in Adaptive Loop Filter [K.
Miyazawa, K. Sugimoto, A. Minezawa, S. Sekiguchi, T. Murakami
(Mitsubishi)]
This contribution proposes a method for reducing the number of pixels used in Adaptive Loop Filter
(ALF). Prior to applying ALF, the proposed technique calculates a score predicting the effectiveness of
ALF for each pixel, and then skips all ALF processes (e.g. pixel classification, filter design, filter application) for
the pixels whose scores are less than a threshold. This threshold is adaptively determined for each frame
in the encoding process, and is transmitted to the decoder. The simulation results report that the proposed
method achieves 2% ~ 9% encoding time reductions and 1% ~ 2% decoding time reductions, with 0.1% ~
0.2% BD-rate loss for AI-HE, RA-HE, and LD-HE structures.
Complexity reduction of ALF by skipping ALF for some pixels, based on pre-analyzing a cost for each 4x4 block.
ALF is skipped for 40% of pixels, loss 0.1/0.2/0.2/xx
Comment: alf_cu_flag may do similar thing with more burden.
Helps on average complexity, but hurts worst case (unless we would limit percentage of ALF use which
might incur additional problems). No action.
JCTVC-G463 Block-based filter adaptation with intra prediction mode and CU depth
information [S. Wang, S. Ma, J. Jia (LG)]
This contribution presents a simplification of the block-based filter adaptation (BA) scheme for intra slices.
For each 4x4 block, features are obtained from the intra picture prediction mode and CU depth, so that filter
adaptation is free from calculation of direction and Laplacian features. The proposed method reportedly achieves an
average 2% decoding time reduction, with 0.1% BD-rate loss for AI structures in comparison with
HM 4.0.
Simplification of filter adaptation in BA-ALF for intra slices. Intra prediction modes, CU depth
information and PU sizes are used instead of calculating directional and Laplacian features. Only for intra
slices? Yes. 0.1% loss is reported for intra.
May become complicated to implement a different adaptation process just for intra (additional functions
or circuitry are necessary), whereas the worst case run time must be supported anyway, so saving
computation for intra only may not be helpful. No action.
JCTVC-G666 1D- DCT based frequency domain adaptive loop filter (FD-ALF) [Jeongyoen
Lim, Ju Ock Lee, Hae-Kwang Kim, Joo-Hee Moon]
A frequency domain adaptive loop filtering (FD-ALF) method based on the 1D DCT is proposed in
this document. The purpose of this contribution is to reduce the computational complexity of ALF while
minimizing coding efficiency loss. The basic scheme of FD-ALF follows the current ALF in HM4.0
except that the filtering is applied in the 1D DCT frequency domain. FD-ALF is adaptively applied in RA or
BA classification mode and can be controlled on a CU basis in the same way as the existing ALF.
DCT domain filtering is performed by multiplying the DCT
transformed reconstructed picture after SAO by a DCT domain filter, rather than by the convolution operation of the
current pixel based ALF filter. The different 1D DCT filters are characterized by tap size (8 tap, 16 tap), direction
(horizontal or vertical) and the coefficients of each 1D DCT filter. The coefficients of the 1D DCT filters
are obtained by MSE (Mean Square Error) optimization between the original picture and the
reconstructed picture.
Using common conditions, the average bit rate change is +1.2% for Y components for high efficiency
AI, +2.2% for Y components for high efficiency RA, +1.8% for Y components for high efficiency LDB,
and +2.8% for Y components for high efficiency LDP. Encoder Time is 83% for high efficiency AI, 97%
for high efficiency RA, 98% for high efficiency LDB, and 96% for high efficiency LDP. Decoder Time is
94% for high efficiency AI, 91% for high efficiency RA, 94% for high efficiency LDB, and 92% for high
efficiency LDP.
Presented Friday afternoon. No bit rate reduction, loss.
No action.
JCTVC-G923 ALF decoding time reduction by adopting a simple SIMD code (Informative)
[T. Yamakage, T. Itoh (Toshiba)] [late]
This contribution informs about decoding time reduction of ALF by adopting a simple SIMD code.
By implementing ALF (BADIR classification and ALF filtering) with a simple SIMD code, the decoding
time is reduced by 9% on average. This SIMD code can be compiled with Microsoft (R) Visual Studio
(R) 2010, and uses Intel (R) SSE4.1. Results were cross-checked by JCTVC-G954.
Information by implementing a simple SIMD code
Process 8 pixels in parallel.
10% less decoding time by SIMD code (i.e., ALF time became 1/3)
Just for information – no action
JCTVC-G222 Non-CE8: Offset coding in SAO [C.-M. Fu, Y.-W. Huang, S. Lei
(MediaTek), I. S. Chong, M. Karczewicz (Qualcomm)]
The offset coding in SAO of HM-4.0 does not fit the offset distribution. In this contribution, a new
codeword design can be used to better fit the offset distribution, or an offset prediction technique can be
used to reduce offset information. Simulation results reportedly show that the luma bit rate and the
chroma bit rate are reduced by 0-0.1%, and 0.1-0.5%, respectively, with unchanged run time.
JCTVC-G818 Cross-check for Offset coding in SAO from MediaTek and Qualcomm by
Samsung [E. Alshina, J.H. Park] [late]
JCTVC-G827 Non-CE8: Crosscheck for Ericsson's modified SAO edge offsets in JCTVC-
G490 [C.-M. Fu, Y.-W. Huang (MediaTek)] [late]
JCTVC-G915 Coding and selection of SAO parameters [D. Baylon, K. Minoo (Motorola
Mobility)] [late]
In HM4.0, the determination of the SAO offset value is based upon only distortion considerations. This
contribution proposes to determine the offset based on RD considerations. In addition, this contribution
proposes to increase the number of EO classes to eight. Simulations over 30 frames reportedly show no
significant loss in luma coding for HE and LC, while a bit rate savings of up to 1.2% for chroma HE, and
1.1% for chroma LC.
Contribution proposes to determine the offset based on RD considerations. In addition, this contribution
proposes to increase the number of EO classes from four to eight. Change in coding of BO type. (HE
case)
Short-length results: HE: 0.0/0.0/0.0/0.0 luma LC: 0.0/0.0/0.0/-0.1 luma Chroma: 0.5 to 1.1% gain
Encoder optimization: supported by a participant, but wait for cross-check results.
Adds 8 Band Offset classifications to the 2 current HM4.0 BO groups. In particular, 3 additional
intensity subdivision ranges are added. Worst case complexity does not increase since it comes from EO.
Decrease of decoding time by selecting BO more frequently. New RDO selection for all SAO
classification.
Another test adds 6 Band Offset classifications.
Main proposal: HE: 0.0/0.0/0.1/0.1 luma LC: 0.0/0.1/0.2/0.1 luma Chroma: 0.7 to 2.7% gain (average
1.7U, 2.0V)
Additional test: HE: 0.0/0.0/0.1/0.1 luma LC: 0.0/0.1/0.2/0.1 luma Chroma: 0.5 to 2.2% gain (aver.??)
Comments: Chroma only change? No, both luma and chroma for consistency. Signal the center band
for every region.
JCTVC-G828 Non-CE8: Crosscheck for Canon's additional SAO band offset classifications
in JCTVC-G246 [C.-M. Fu, Y.-W. Huang (MediaTek)] [late]
JCTVC-G682 Non-CE8: Reduced number of band offsets in SAO [W.-S. Kim, D.-K. Kwon
(TI)]
In the current HM-4.0 design, SAO parameters are encoded into the adaptation parameter set (APS), and need
to be stored in a buffer until the SAO process is completed for each picture. The buffer size is proportional to
the number of SAO partitions and the size of the SAO parameters for each partition. In this contribution, the
number of band offsets is reduced from 16 to 8, and the number of bands is increased from 2 to 4. The
purpose is to reduce the memory needed to store offsets.
Proposal 1: 4 band, 8 offsets
Proposal 2: The coverage of offsets of the first and the last sub-band are extended to the pixels outside the
band.
Main proposal: HE: 0.0/0.0/0.0/0.0 luma LC: 0.0/0.0/0.1/0.0 luma Chroma: 0.1 to 0.5% gain
Extended proposal: HE: 0.0/0.0/0.0/0.0 LC: 0.0/0.0/0.1/0.0
Comments:
It is suggested that number of offsets could also be imposed by level constraints.
Encoder must perform more depth.
Note: G218 (unified ALF/SAO) might also solve this problem
Conclusion: Conduct CE on G246, G682 (SAO simplifications)
JCTVC-G748 Non-CE8: Crosscheck of TI's Reduced number of band offsets in SAO [T.
Ikai (Sharp)] [late]
For EO types, the fifth offset is employed on pixels whose edgeIdx is 2. For BO types, a new band-
classification table is introduced for chroma values.
HE: 0.0/0.0/0.0/xx luma LC: 0.0/0.0/0.0/xx luma Chroma: 0.3 to 0.5% gain in chroma (HE cases).
No action.
In HM-4.0, sample adaptive offset (SAO) parameters are coded for each region in a picture. In order to
support localization of SAO parameters with higher flexibility, this contribution proposed a new syntax
that allowed SAO parameters to be adaptively changed at any largest coding unit (LCU). Simulation
results reportedly showed that the proposed syntax caused 0.0%, 0.1%, 0.1%, 0.1%, 0.0%, 0.1%, 0.2%
and 0.2% bit rate increases for HE-AI, HE-RA, HE-LB, HE-LP, LC-AI, LC-RA, LC-LB, and LC-LP,
respectively, with almost the same encoding and decoding times when the algorithm of deriving localized
SAO parameters was unchanged.
Motivation: It is desirable to develop a simple syntax that can support many different picture partitioning
algorithms for SAO optimization.
Allow SAO parameters to be adaptively changed at any largest coding unit (LCU). SAOP can be
signalled LCU by LCU or can be copied from left or above LCU or above LCU line. Prediction of offset
is also applied.
Loss: HE: xx/-0.1/-0.1/-0.1 LC: xx/-0.1/-0.2/-0.2
Encoder implementation can be (1) same design as HM4 with the LCU-based syntax, or (2) LCU by LCU
level encoder that may lose some coding gain.
Note: G218 also would enable LCU-based adaptation. However, it may be implementation specific
whether combined SAO and ALF is desirable, as some implementations may rather combine de-blocking
and SAO. Therefore, include in same CE part as G218.
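The LCU-based signalling described above (new parameters per LCU, or a copy from the left or above LCU) can be sketched as a small resolution step. The tuple-based representation of the signalled modes and all names are hypothetical; copying from a whole LCU line above and the offset prediction mentioned in the notes are not modelled.

```python
from typing import Any, List, Tuple


def resolve_sao_params(signalled: List[List[Tuple]]) -> List[List[Any]]:
    """Resolve per-LCU SAO parameters in raster order.

    Each entry is ("new", params), ("left",) or ("up",), where "left" and "up"
    copy the already resolved parameters of the neighbouring LCU.
    """
    resolved: List[List[Any]] = []
    for y, row in enumerate(signalled):
        out_row: List[Any] = []
        for x, entry in enumerate(row):
            mode = entry[0]
            if mode == "new":
                params = entry[1]
            elif mode == "left":
                if x == 0:
                    raise ValueError("no left LCU in the first column")
                params = out_row[x - 1]
            elif mode == "up":
                if y == 0:
                    raise ValueError("no above LCU in the first row")
                params = resolved[y - 1][x]
            else:
                raise ValueError("unknown mode: %r" % (mode,))
            out_row.append(params)
        resolved.append(out_row)
    return resolved
```

Because each LCU depends only on LCUs that precede it in raster order, the parameters can be resolved as the bitstream is parsed, which is the property relevant to the low-latency discussion.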
JCTVC-G935 Cross-check for LCU based SAO from MediaTek and Qualcomm (JCTVC-
G831) by Samsung [E.Alshina, J.H.Park (Samsung)] [late]
JCTVC-G211 Non-CE8.c.6: Multi-source SAO and ALF virtual boundary processing with
cross9x9 [C.-Y. Chen, C.-M. Fu, C.-Y. Tsai, Y.-W. Huang, S. Lei
(MediaTek), S. Esenlik, M. Narroschke, T. Wedi (Panasonic), I. S. Chong, M.
Karczewicz (Qualcomm)]
In HM-4.0, SAO requires 0.2 luma pixel line buffer (PLB) and 0.2 chroma PLB, and ALF requires 4.1
luma PLBs and four chroma PLBs. In JCTVC-F054 and JCTVC-F055, virtual boundary (VB) processing
was proposed to remove all line buffers for ALF and SAO, respectively. In JCTVC-F272, multi-source
SAO and ALF were proposed to reduce line buffers. In JCTVC-G206, the three prior proposals and using
snowflake5x5 and cross9x9 ALF shapes are combined as CE8.c.4-2 to remove all SAO and ALF line
buffers, and this contribution further improves the visual quality of CE8.c.4-2 without increasing any line
buffer. Due to the DF in HM-4.0, the luma VB and the chroma VB are set as three pixels and one pixel
above the horizontal LCU boundary, respectively, and processing each pixel on one side of a VB avoids
any data access from the other side of the VB unless the data can become available in time without using
any additional line buffer. When compared with the JCTVC-F900 anchor, this proposal reportedly
improves coding efficiency and achieves 0.0%, -0.2%, -0.5%, and -0.2% BD-rates for HE-AI, HE-RA,
HE-LDB, and HE-LDP, respectively, and the same visual quality. No VB artifact is observed.
Already discussed in context of CE8.c
JCTVC-G212 Non-CE8.c.7: Single-source SAO and ALF virtual boundary processing with
cross9x9 [C.-Y. Chen, C.-M. Fu, C.-Y. Tsai, Y.-W. Huang, S. Lei
(MediaTek), S. Esenlik, M. Narroschke, T. Wedi (Panasonic), I. S. Chong, M.
Karczewicz (Qualcomm)]
In HM-4.0, SAO requires 0.2 luma pixel line buffer (PLB) and 0.2 chroma PLB, and ALF requires 4.1
luma PLBs and four chroma PLBs. In JCTVC-F054 and JCTVC-F055, virtual boundary (VB) processing
was proposed to remove all line buffers for ALF and SAO, respectively. In JCTVC-G206, the two prior
proposals and using snowflake5x5 and cross9x9 ALF shapes are first combined as CE8.c.4-1 to remove
all SAO and ALF line buffers, and this contribution provides two solutions to improve the visual quality
of CE8.c.4-1. Due to the DF in HM-4.0, the luma VB and the chroma VB are set as four pixels and two
pixels above the horizontal LCU boundary, respectively, and processing each pixel on one side of a VB
avoids any data access from the other side of the VB. Non-CE8.c.7-1 gives up SAO VB processing and
has SAO line buffers, while non-CE8.c.7-2 applies SAO VB processing method that can reduce 50%
SAO line buffers. Both solutions apply ALF VB processing that can remove all ALF line buffers. When
compared with the JCTVC-F900 anchor, both solutions reportedly improve coding efficiency and achieve
0.0%, -0.1%, -0.3%, and -0.0% BD-rates for HE-AI, HE-RA, HE-LDB, and HE-LDP, respectively, and
the same visual quality. No VB artifact is observed.
Already discussed in context of CE8.c
JCTVC-G220 Non-CE8: Pure VLC for SAO and ALF [C.-Y. Tsai, C.-M. Fu, C.-Y. Chen,
C.-W. Hsu, Y.-W. Huang, S. Lei (MediaTek)]
In HM-4.0-dev-miscs, SAO and ALF parameters in APS and CU-level ALF-on/off flags in slice header
can be coded by CABAC. No other syntax elements in APS and slice header can be coded by CABAC. In
this contribution, it is proposed to use pure VLC for SAO and ALF and to remove byte alignment bits for
CABAC in APS. Simulation results reportedly show 0%, 0.1%, 0.2% and 0.2% coding efficiency gains
for HE-AI, HE-RA, HE-LDB, and HE-LDP, respectively, when the APS coding is changed from CABAC
to VLC. Simulation results also show no coding efficiency impact when the slice header coding is
changed from CABAC to VLC.
Use pure VLC for SAO and ALF and to remove byte alignment bits for CABAC in APS.
Use pure VLC for APS 0.0/0.1/0.2/0.2 Use pure VLC for slice header 0.0/0.0/0.0/0.0
From BoG: Many participants support this.
JCTVC-G617 Non-CE8: Cross check of MediaTek's pure VLC for SAO and ALF [I. S.
Chong, M. Karczewicz (Qualcomm)]
JCTVC-G978 Cross check of syntax refinements for SAO and ALF in JCTVC-G566 [J.
Tanaka, T. Suzuki (Sony)] [late]
If an SAO offset is known, the SAO process is simply an add operation, so it would be desirable for the offset to be
added to a pixel while the pixel is deblocked.
Use horizontal deblocked pixels instead of fully deblocked pixels for EO type SAO classification. For the
offset adding step, the offset is still added to the fully deblocked pixels. For BO type, no modification
needed.
Bitrates (losses) HE: -0.1/-0.1/0.0/xx LC: -0.1/-0.1/-0.1/xx
Runtime Enc HE: 99/99/100/xx LC: 100/100/99/xx Dec HE: 100/102/104/xx LC: 101/101/102/xx
Can this be implemented as one pass also at the encoder? Must be studied.
Some concern that this may complicate the encoder, whereas the benefit seems to be unclear. No action.
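The split between classification and offset addition in this proposal can be illustrated for one row of samples and a horizontal EO pattern. This is a sketch under assumptions: the edgeIdx formula (2 plus the signs of the differences to both neighbours), the dictionary of offsets and the function names are illustrative, boundary samples are left unchanged, and no clipping to the sample range is applied.

```python
from typing import Dict, List


def sign(x: int) -> int:
    return (x > 0) - (x < 0)


def edge_idx(left: int, cur: int, right: int) -> int:
    """Assumed EO classification: 0 = local minimum, 2 = no edge, 4 = local maximum."""
    return 2 + sign(cur - left) + sign(cur - right)


def sao_eo_row(classify_row: List[int], target_row: List[int],
               offsets: Dict[int, int]) -> List[int]:
    """Classify on classify_row (e.g. horizontally deblocked samples) and add the
    offsets to target_row (e.g. fully deblocked samples)."""
    if len(classify_row) != len(target_row):
        raise ValueError("rows must have the same length")
    out = list(target_row)
    for x in range(1, len(target_row) - 1):
        idx = edge_idx(classify_row[x - 1], classify_row[x], classify_row[x + 1])
        out[x] = target_row[x] + offsets.get(idx, 0)
    return out
```

Passing the same row for both arguments gives the conventional behaviour, in which classification and offset addition both use fully deblocked samples.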
JCTVC-G1011 Cross-check results of the Unified Deblocking and SAO of Sharp (JCTVC-
G608) [M. Narroschke, S. Esenlik (Panasonic)] [late]
JCTVC-G684 Subjective Tests on ALF and SAO Using HM-4.0 [W.-S. Kim, O. G. Sezer,
M. Budagavi (TI)]
This contribution presents results of informal subjective tests conducted on two loop filtering operations
in HM-4.0: SAO and ALF. Test videos were presented at their actual frame rates to the viewers. Three
configurations were tested using low delay B high efficiency (LB-HE) condition: HM-4.0 Anchor vs.
SAO-off+ALF-off, HM-4.0 Anchor vs. SAO-off+ALF-on, and HM-4.0 Anchor vs. SAO-on+ALF-off. In
HM-4.0 Anchor, both SAO and ALF are activated (SAO-on+ALF-on). Among these three configurations,
SAO-on+ALF-off gives subjective quality results better than or comparable to the HM-4.0 Anchor. This
contribution requests that JCT-VC conducts subjective quality test of SAO and ALF in a core experiment
setting to evaluate the subjective quality gains provided by the aforementioned tools, and decide on the
JCTVC-G519 Non-CE2: Harmonization of implicit TU, AMP and NSQT [X. Zheng
(HiSilicon), Y. Yuan, Y. He (Tsinghua)]
This contribution provides a harmonization solution of implicit TU, AMP and NSQT. Experimental
results show that the proposed solution contributes average coding gain of 0.1% for RA, 0.1% for
RA_LC, 0.2% for LD_B, 0.1% for LD_B_LC, 0.1% for LD_P and 0.1% for LD_P_LC. Both encoder and
decoder complexity are same as HM4.0.
Decision: It is agreed that this harmonization is desirable. WD and software should be checked by WD
editor and Motorola (K. Panusupone). (confirm)
This contribution provides a non-square Hadamard transform solution which is used in the motion estimation
and merge estimation processes. Different configurations of the non-square Hadamard transform are also
discussed in the contribution. Experimental results show that the non-square Hadamard transform can achieve an
average gain of 0.2% for RA, 0.2% for RA_LC, 0.2% for LD_B, 0.3% for LD_B_LC, 0.2% for LD_P
and 0.3% for LD_P_LC when inter 2x8 and 8x2 transforms are used in residual coding.
Presentation not uploaded.
Decision (SW): Adopt non-normative tools (but not the inclusion of 2x8/8x2 as also suggested in the
document)
JCTVC-G151 Prediction and partition mode binarization for Low Delay P [X. Zhang, S.
Liu, S. Lei (MediaTek)]
This contribution reports a bug fix and a method for binarizing prediction and partition modes for Low
Delay P configuration. Firstly, a mismatch was found between WD and HM with regard to the prediction
and partition mode binarization for Low Delay P. With the bug fix in HM software, negligible (0.0%)
impact is reported on both BD-rate and encoding and decoding runtime. Secondly, it is proposed to unify
the prediction and partition mode binarization for Low Delay P and B, which simplifies both WD and
HM. Again, experimental results report negligible (0.0%) impact on both BD-rate and encoding and
decoding runtime.
Partially already discussed - see under G785.
Deviation between WD and software in binarization for P
JCTVC-G283 Residue Quad Tree Depth for Chroma in Intra Coding [X. Zhao, X. Guo, X.
Li, S. Lei (MediaTek), S. Ma, W. Gao (PKU)]
In HM4.0, luma and chroma components share the same maximum Residue Quad Tree (RQT) depth.
This contribution proposes to use separate maximum RQT depth for luma and chroma components in
intra coding. Specifically, a user-defined parameter is added in the sequence parameter set (SPS) to allow
independent RQT depth setting for chroma. It is reported that with setting the parameter as 0, average
BD-Rate reductions of 0.04%, 1.67% and 1.70% are achieved for Y, U and V in AI-HE, respectively, and
0.20%, 2.43% and 2.53% are achieved for Y, U and V in AI-LC, respectively. It is also reported that the
decoding time is slightly decreased.
Question: Is there an implicit mode for chroma? Currently not.
JCTVC-G798 Cross verification of MediaTek’s proposed residue quad-tree depth for intra
chroma coding (JCTVC-G283) [Y. Chiu, L. Xu (Intel)] [late]
JCTVC-G062 non-CE3: 7-tap quarter-pel luma interpolation filter with accurate phase
shift [Hongbo Zhu]
A 7-tap 1/4-pel luma interpolation filter is proposed in this document. The simulations were
conducted under the common test conditions [1] using HM4.0 r1354. When combined with the
6-tap DCT-IF 1/2-pel filter {2,-9,39,39,-9,2}, the BD-rate performance of the proposed filter is 0.0% for
he_ra, 0.0% for lc_ra, 0.0% for lb_he, -0.4% for lb_lc, 0.2% for lp_he and -0.3% for lp_lc.
Basically, the 6H7Q filter shows gain for the high resolution sequences and loss for
the low resolution sequences (WVGA and WQVGA). When the half-pel filter is changed to the
DCT-IF 8-tap {-1,4,-10,39,39,-10,4,-1}, the performance is -0.1% for he_ra, -0.2% for lc_ra, -0.2% for
lb_he, -0.7% for lb_lc, -0.1% for lp_he and -0.7% for lp_lc.
Major deviation from current design.
No cross-check
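For reference, the two half-pel filters quoted in the G062 abstract both sum to 64, so a one-dimensional half-sample value can be sketched with a 6-bit normalization. The rounding offset of 32, the single-stage integer arithmetic and the function name are assumptions; the proposed 7-tap quarter-pel taps are not given in these notes and are therefore not shown.

```python
from typing import Sequence

HALF_PEL_6TAP = (2, -9, 39, 39, -9, 2)
HALF_PEL_8TAP = (-1, 4, -10, 39, 39, -10, 4, -1)


def half_pel_sample(samples: Sequence[int], pos: int, taps: Sequence[int]) -> int:
    """Interpolate the half-sample position between samples[pos] and samples[pos + 1]."""
    n = len(taps)
    start = pos - (n // 2 - 1)          # leftmost integer sample covered by the filter
    if start < 0 or start + n > len(samples):
        raise IndexError("filter support exceeds the sample array")
    acc = sum(t * samples[start + k] for k, t in enumerate(taps))
    return (acc + 32) >> 6              # taps sum to 64; rounding offset is assumed
```

Both filters are symmetric, so a linear ramp is interpolated to (approximately) the midpoint of the two neighbouring integer samples.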
JCTVC-G392 Non-CE3: Report on a restriction for small block [K. Kondo, T. Suzuki
(Sony)]
This contribution reports the results of restrictions on small blocks. To cut worst case complexity, this
contribution tested three cases in which small PU sizes are restricted without decoder change. In case A,
PU sizes 8x4, 4x8 and 8x8 for bi-prediction are restricted. In case B, PU sizes 8x4 and 4x8
are restricted. In case C, 8x4 and 4x8 for bi-prediction are restricted. With the restrictions, it is
shown that the impact on coding efficiency is 2.2%, 1.4%, and 0.3% for cases A, B, and C, respectively.
Analysis shows that for large picture resolutions the loss by restricting the PU size is much less even for
case A (e.g. class A only 1%). Case A allows bandwidth reduction of around 50%. This indicates it is
very likely that small PU size restrictions for high resolutions could be meaningful.
Further study in AHG (see under CE3)
JCTVC-G600 Non-CE3: Adaptive Motion Vector Resolution based on the PU Size [J.
Jung, J. Heo, S. Yea (LG)]
This contribution proposes an adaptive mechanism for threshold selection at a PU level in the Progressive
Motion Vector Resolution (PMVR) method proposed by MediaTek. The result shows it improves the
coding efficiency of the PMVR method thanks to its PU-level adaptation of threshold values. This
contribution presents the result of the proposed scheme implemented on the PMVR method without 1/8
MV resolution. The Y BD-rate gains with respect to HM4.0 were -0.2% for RA HE, -0.2% for RA-LC, -
0.1% for LB-HE, -0.2% for LB-LC, 0.0% for LP-HE, and 0.0% for LP-LC. The Y BD-rate gains with
respect to the PMVR without 1/8-pel with Th=2 were 0.0% for RA HE, 0.0% for RA-LC, 0.0% for LB-
HE, 0.0% for LB-LC, -0.1% for LP-HE, and -0.1% for LP-LC.
Expectation that finer resolution of MVs is rather for large PU sizes.
The statistics plot shown may not indicate that this assumption is justified
Currently very small gain only for LD P.
No action.
JCTVC-G699 Motion vector scaling for non-uniform interpolation offset [J. Lou, K.
Minoo, L. Wang (Motorola Mobility)]
Non-uniform motion vector grid was proposed in the last Torino meeting. This contribution document
addresses the problem of motion vector scaling for non-uniform motion vector grid, since reusing the
motion vector scaling for uniform motion vector grid might give slightly different motion vector
predictors. Cross-check will be provided by Samsung. The attached spreadsheet contains detailed data of
the results.
Informative - no action required.
JCTVC-G931 Non CE3: Cross-check for memory band-width reduction from Toshiba (G-
770) by Samsung [E. Alshina, J.H. Park (Samsung)] [late]
JCTVC-G806 Non CE3: On the phase offset selection for motion compensation
interpolation filters [K. Minoo, D. Baylon, J. Lou]
In this contribution 4 sets of filters are introduced and used to conduct motion compensation with quarter-
pixel motion resolution (i.e. four level sub-pixel signaling and storing). The choice of filter set is decided
based on the sub-pixel information of the motion vector predictor (stored at quarter-pel resolution).
An overall gain of 0.82% was observed for luma (a 0.2% gain was also observed for each of the chroma
components).
Presentation not uploaded. Various graph plots are shown that are not included in the contribution to
motivate the idea, but are difficult to relate to the word file.
Selection of phase offset for each sub-pixel position is somehow based on the conditional distribution of
MV for a given MVP (including such positions as e.g. 3/16), but it is not explicitly said how.
Number of filters increased to 9. Main gain in LD P, in the other cases it is typically 0.3%.
Some interest expressed by cross-checker. No action.
A.7 MV scaling
JCTVC-G223 Non-CE9: Division-free MV scaling [T.-D. Chuang, Y.-W. Chen, J.-L. Lin, C.-Y. Chen,
Y.-W. Huang, S. Lei (MediaTek)]
JCTVC-G541 Non-CE9: Simplified scaling calculation method for temporal/spatial MVP of
AMVP/Merge [T. Sugio, T. Nishi(Panasonic)]
JCTVC-G551 Restriction on motion vector scaling for Merge and AMVP [I.-K. Kim, Y. Park, N.
Shlyakhov, J. H. Park (Samsung)]
Decision: Adopt the change of scaling factor clipping range to a value of 16 (one of the options suggested
in G223)
A.8 AMVP
JCTVC-G182 Non-CE9: AMVP syntax for bi-prediction [H. Takehara, S. Fukushima (JVC Kenwood)]
JCTVC-G516 On Spatial MV Prediction [K. Sato (Sony)] [late]
JCTVC-G710 Non-CE9: The Parallel Friendly MVP Candidate Calculation for HEVC [Y. Yu, K.
Panusopone, L. Wang (Motorola Mobility)]
JCTVC-G712 Non-CE9: The Simplification of MVP for HEVC [Y. Yu, K. Panusopone, L. Wang
(Motorola Mobility)]
JCTVC-G219 Non-CE9: Construction of MVP list without using scaling [H. Nakamura, S. Fukushima
(JVC Kenwood)]
JCTVC-G542 Non-CE9/Non-CE13: Simplification on AMVP/Merge [T. Sugio, T. Nishi(Panasonic)]
Decision: Adopt simplification 2 from G542, no action on G219
B) Parallel merge
JCTVC-G164 Non-CE9: improvement on parallelized merge/skip mode [Y. Jeon, S. Park, B. Jeon (LG)]
JCTVC-G387 Non-CE9 Parallel Merge/skip Mode for HEVC [X. Wen, O. Au, W. Dai, C. Pang, J. Dai,
F. Zou, X. Zhan (HKUST)]
JCTVC-G416 CU-based Merge Candidate List Construction [H. Y. Kim (ETRI), K. Y. Kim, S. M. Kim,
G. H. Park (KHU), S.-C. Lim, J. Lee, J. S. Choi (ETRI)]
Conclusion: See under G164
C) Coding efficiency improvements
JCTVC-G165 Non-CE9/Non-CE13: new MVP positions for merge/skip modes and its combination with
replacing redundant MVPs [Y. Jeon, S. Park, B. Jeon (LG), J.-L. Lin, Y.-W. Chen, Y.-W. Huang, S. Lei
(MediaTek)]
JCTVC-G195 Non-CE9/13: Averaged merge candidate [S. Shimada, K. Kazui, J. Koyama, A. Nakagawa
(Fujitsu)]
JCTVC-G305 Non-CE9: Bi-prediction for low delay coding [Y. Suzuki, A. Fujibayashi (NTT
DOCOMO)]
JCTVC-G343 Non-CE9: Improvement in temporal candidate of merge mode and AMVP [N. Zhang, X.
Fan, S. Ma, D. Zhao (Harbin Inst. Tech.)]
JCTVC-G224 Non-CE13: Multiple-scaled merging candidates [H. Nakamura, S. Fukushima (JVC
Kenwood)]
JCTVC-G787 Non-CE13: Additional merge candidates with MV dependent offsets [T. Lee, J. Chen, J. H.
Park (Samsung), G. Laroche, P. Onno (Canon), J.-L. Lin, Y.-W. Chen, Y.-W. Huang, S. Lei (MediaTek)]
Conclusion: No action
Further details and dispositions under the individual documents as follows (not per category)
JCTVC-G387 Non-CE9 Parallel Merge/skip Mode for HEVC [X. Wen, O. Au, W. Dai, C.
Pang, J. Dai, F. Zou, X. Zhan (HKUST)]
The current HEVC merge/skip mode copies motion parameters to the current PU from a candidate list
which consists of spatial and temporal neighbouring PUs. However, the resulting data dependency makes
parallel encoding and decoding difficult. Furthermore, different PU shapes and positions result in
different definitions of the candidate list, which potentially adds hardware cost and is not easy to
implement efficiently in hardware. The proposal introduces a high-level syntax to signal the
parallel depth of merge/skip mode and divides an LCU into non-overlapping square merge regions (MRs).
All PUs located inside the same MR use the same candidate list at both the encoder and decoder side.
As a result, all PUs in the same MR can be checked in parallel by a suitable architecture. Simulation
results reportedly show average losses of 1.5%, 1.5%, 2.4% and 2.5% in RA-HE, RA-LC, LB-HE and LB-LC
for 32x32 block-level parallel ME when compared to the current HM4.0 design. The scheme provides
different trade-off points between saving logic and coding efficiency.
Slides not uploaded
To be considered dependent on CE9 MRG_PAR (JCTVC-F069) discussion.
Improvement, it is basically the same as JCTVC-G164.
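The merge-region idea can be sketched in a few lines of Python. This is only an illustration under
stated assumptions: the function names are hypothetical, and the mapping from the signalled "parallel
depth" to the MR size (LCU size halved once per depth level) is an assumption, not something the notes
specify.

```python
# Hypothetical sketch: PUs that fall in the same square merge region (MR)
# would share one merge candidate list and could be processed in parallel.

def merge_region_id(x, y, lcu_size, parallel_depth):
    """Identify the MR containing luma position (x, y).

    ASSUMPTION: MR size = lcu_size >> parallel_depth.
    """
    mr_size = lcu_size >> parallel_depth
    if mr_size <= 0:
        raise ValueError("parallel_depth too large for this LCU size")
    lcu_x, lcu_y = x // lcu_size, y // lcu_size
    mr_x = (x % lcu_size) // mr_size
    mr_y = (y % lcu_size) // mr_size
    return (lcu_x, lcu_y, mr_x, mr_y)


def share_candidate_list(pu_a, pu_b, lcu_size, parallel_depth):
    """True if two PUs (given by their top-left positions) lie in the same MR."""
    return (merge_region_id(*pu_a, lcu_size, parallel_depth)
            == merge_region_id(*pu_b, lcu_size, parallel_depth))


if __name__ == "__main__":
    # 64x64 LCU, depth 1: 32x32 merge regions.
    print(share_candidate_list((0, 0), (16, 24), 64, 1))
    print(share_candidate_list((0, 0), (32, 0), 64, 1))
```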
JCTVC-G165 Non-CE9/Non-CE13: new MVP positions for merge/skip modes and its
combination with replacing redundant MVPs [Y. Jeon, S. Park, B. Jeon
(LG), J.-L. Lin, Y.-W. Chen, Y.-W. Huang, S. Lei (MediaTek)]
This contribution proposes new MVP positions for skip and merge mode. Two new spatial MVP
candidate positions (A2 and B3) are introduced in addition to the spatial MVP positions (A0, A1, B0, B1,
and B2) of the current HM. The proposed order for spatial MVP candidates is {A2, B3, A1, B1, B0, A0,
A2} but there are some restrictions for adding the candidates to the MVP list in order to minimize the
complexity which can be caused by introducing the two new candidates. Simulation results revealed that
the proposed method achieves 0.1% gain for RA configurations and 0.2% gain for both LB and LP
configurations without any increase in encoding and decoding time.
Slides not uploaded
Improvement, adding two spatial merge candidates.
Increases complexity with small gain. Several experts express negative opinions. No action.
JCTVC-G681 Non-CE9: Simplified Merge candidate derivation [Y. Zheng, X. Wang, W.-J.
Chien, M. Karczewicz (Qualcomm)]
This contribution proposes a change to the rules currently used in determining merge candidates for a PU
under 2NxN, Nx2N, NxN, or AMP mode. In HM4.0, when a current PU under these partition modes is
not the first PU in a CU, the motion information of each of its merge candidates is compared with that of
a previous PU to avoid a situation that a number of PUs share the same motion information so that the
current prediction information can be classified into a mode with less partitions. For example, every PU
has the same motion information under a mode other than 2Nx2N. This contribution proposes to remove
such comparison. The proposed changes reduce the operations and also enable parallel merge candidate
generation.
Merge redundancy check removal
Removing the check for the possibly redundant candidate may improve parallelism because PU1 no longer
depends on PU0.
Simplification on merge partition redundancy removal
JCTVC-G396 Non-CE9: swapping of merge candidate [C. Kim, Y. Jeon, B. Jeon (LG)]
In this contribution, two methods for reordering the merge candidates are presented. The first method is
applied only for 2Nx2N PUs of square shape. The first method (Proposed method 1) swaps A1 and B1
candidate order in the merge list if the proposed condition is satisfied. It is reportedly shown in the
experimental results that 0.1~0.2% BD rate reduction is achieved without increasing encoding and
decoding time with the proposed method 1. The second method (Proposed method 2) is applied for the
second PUs of 2NxN, 2NxnU, 2NxnD, Nx2N, nLx2N and nRx2N partitions of rectangular shape. The
proposed method 2 uses the MVP candidate which belongs to the first PU of those rectangular partitions
for creating the combined bi-pred. candidates even though this candidate shall not exist in the initial list
due to the avoiding check operation. It is reportedly shown in the experimental results that 0.1~0.2% BD
rate reduction is achieved without increasing encoding and decoding time with the proposed method 2. In
addition, the proposed methods are tested with the new anchor in which the avoiding check operation is
removed. It is reportedly shown from the simulation results that the proposed method 2 achieves 0.0~0.1% BD
rate saving without encoding/decoding time increase relative to the new anchor and the combination of
the proposed method 1 and the proposed method 2 achieves 0.1~0.2% BD rate saving. The encoding and
decoding time is almost same as the anchor.
Slides not uploaded
Title was changed to “Non-CE9: reordering of merge candidate” after first registration.
It was mentioned that there is a dependency between first and second PU.
Improvement Method1
Simplification Method2
JCTVC-G223 Non-CE9: Division-free MV scaling [T.-D. Chuang, Y.-W. Chen, J.-L. Lin,
C.-Y. Chen, Y.-W. Huang, S. Lei (MediaTek)]
In HM-4.0 motion vector (MV) scaling for the derivation of spatial and temporal motion vector predictors
(MVPs), a division operation is required to derive the scaling factor. In hardware and many DSP-based
platforms, dividers are undesirable because of larger gate counts and more processing cycles. In this
contribution, a division-free MV scaling is proposed to replace the general divider by a look-up table and
simple arithmetic operations. Moreover, the effective scaling range is doubled to deal with reference
pictures of longer temporal distances. Simulation results reportedly show that the proposed method has no
bit rate increase in random access conditions and even 0.1% bit rate reduction in low delay conditions.
The proposed design is also implemented in Verilog and synthesized with TSMC 40nm process. The
synthesis results reportedly show that the gate count of the MV scaling module is reduced by 54-58%.
Proposal 1: Extending the clipping range in the scaling factor clipping
Proposal 2: Division free MV scaling by using a LUT
HEVC uses the same scaling as AVC. The implementation of the HM4.0 scaling against which the proposal
is compared may also be table-based. Additional slides were shown comparing the proposal with a
table-based HM4.0 scheme. [to be uploaded]
JCTVC-F068 reports that scaling needs 3 cycles. That is not affected by the tested proposal.
Adopt proposal (see below). No support for proposal 2.
JCTVC-G523 Non-CE9: Cross-check of division-free MV scaling [P. Onno (Canon)] [late]
Division-free scaling is confirmed.
Clipping range extension was not verified.
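For orientation, the kind of fixed-point MV scaling under discussion can be sketched as below. This is
a minimal sketch, not the G223 design: it follows the AVC-style formulas (tx, scaling factor with 8
fractional bits), replaces the divider by a hypothetical 128-entry table indexed by |td|, and exposes
the clipping range as a parameter. Reading the adopted "value of 16" as a factor limit of 4095/256 is
an interpretation.

```python
# Hypothetical sketch of AVC-style MV scaling with a look-up table instead of
# a divider. Names and table layout are assumptions for illustration.

def clip3(lo, hi, v):
    return max(lo, min(hi, v))


# TX_LUT[d] == (16384 + d // 2) // d for d = 1..128 (index 0 unused).
TX_LUT = [0] + [(16384 + (d >> 1)) // d for d in range(1, 129)]


def scale_factor(tb, td, max_factor=4095):
    """Fixed-point scaling factor (8 fractional bits) from POC distances.

    max_factor=1023 gives the AVC-style range (factor below 4);
    max_factor=4095 gives the extended range (factor below 16).
    """
    tb = clip3(-128, 127, tb)
    td = clip3(-128, 127, td)
    if td == 0:
        raise ValueError("td must be non-zero")
    tx = TX_LUT[abs(td)]          # table look-up replaces the division
    if td < 0:
        tx = -tx
    return clip3(-max_factor - 1, max_factor, (tb * tx + 32) >> 6)


def scale_mv(mv, factor):
    """Scale one MV component, rounding the magnitude and restoring the sign."""
    prod = factor * mv
    mag = (abs(prod) + 127) >> 8
    return -mag if prod < 0 else mag


if __name__ == "__main__":
    f = scale_factor(8, 4)        # target distance is twice the source distance
    print(f, scale_mv(10, f))
```

With the narrower range, a distance ratio such as tb=127, td=1 saturates at 1023, whereas the extended
range lets the factor grow to 4095 before clipping.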
JCTVC-G305 Non-CE9: Bi-prediction for low delay coding [Y. Suzuki, A. Fujibayashi
(NTT DOCOMO)]
In this contribution the additional option of bi-prediction for low delay conditions and key pictures of
random access conditions is proposed. In the proposed method, the motion vector difference for List 1 is
not signaled and is set to (0, 0) when the POCs of all reference pictures are smaller than that of the target
picture. The average Y BD-rate gains for low delay B condition and random access condition are 0.9%
and 0.2%, respectively.
An encoder-only change (setting the list 1 mvd to zero and signalling it) gives 0.3% and 0.1% gain on
average for LB and RA (available in the revised document).
There is no list 1 motion vector search for bi-prediction.
It was asked what the impact would be when the list 1 motion vector search for bi-prediction is enabled
at the encoder. The assumption is that the better the list 1 estimation is, the larger the loss from
setting the list 1 mvd to zero becomes.
Interesting candidate to be considered after CE9/13 conclusion is reached.
Improvement
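The condition described in the abstract can be sketched as follows. The function names and the
read_mvd callback are hypothetical, and treating an empty reference set as "not inferred" is an
assumption; the sketch only mirrors the stated rule that the list 1 mvd is not signalled when every
reference picture precedes the current picture in POC order.

```python
# Hypothetical sketch of the G305 rule: infer mvd_l1 = (0, 0) when all
# reference pictures have a smaller POC than the current picture.

def list1_mvd_is_inferred(current_poc, reference_pocs):
    """True if the list 1 MVD would not be signalled (low-delay style case)."""
    return bool(reference_pocs) and all(poc < current_poc for poc in reference_pocs)


def decode_mvd_l1(current_poc, reference_pocs, read_mvd):
    """Return the list 1 MVD, parsing it only when it is actually present.

    read_mvd is a caller-supplied function standing in for the entropy decoder.
    """
    if list1_mvd_is_inferred(current_poc, reference_pocs):
        return (0, 0)
    return read_mvd()


if __name__ == "__main__":
    print(decode_mvd_l1(8, [4, 6, 7], lambda: (3, -1)))   # all references in the past
    print(decode_mvd_l1(8, [4, 16], lambda: (3, -1)))     # one future reference
```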
JCTVC-G592 Non-CE9: Removal of reference index derivation for TMVP in merge mode
[O. Bici, J. Lainema, K. Ugur (Nokia)]
This was already discussed in several other documents JCTVC-Gxxx
JCTVC-G733 cross check for I2R AMVP simplification [Yue Yu, Krit Panusopone, Limin
Wang] [late]
Simplification 1 confirmed
JCTVC-G710 Non-CE9: The Parallel Friendly MVP Candidate Calculation for HEVC [Y.
Yu, K. Panusopone, L. Wang (Motorola Mobility)]
This contribution proposes a parallel friendly MVP candidate calculation when a CU consists of two PUs.
Simulation results show that there is a loss of 0.3% for random access condition and a loss of 0.1% for
low delay condition compared to original AMVP while it is possible to parallel process two PUs.
Slides not uploaded
Results are reported for simplification combined with improvement.
Simplification of AMVP replace PU0 candidate for PU1 by one outside PU0
JCTVC-G712 Non-CE9: The Simplification of MVP for HEVC [Y. Yu, K. Panusopone, L.
Wang (Motorola Mobility)]
This contribution proposes a simplification of the MVP design for HEVC. Simulation results show that
a loss of 0.1% for LBHE and a loss of 0.3% for RAHE are observed compared to the original AMVP, while
the complexity of the proposed method is reduced by at least 50%, and by up to 66.7%, compared to the
MVP selection procedure of AMVP when a CU consists of two PUs.
Slides not uploaded
Simplification of AMVP where PU1 MVP is derived from PU0 MVP
Concern was raised on the additional scaling.
It cannot be combined with JCTVC-G710.
Losses of 1.1% for SteamLocomotive.
Does not look like a real simplification. No support for this
JCTVC-G787 Non-CE13: Additional merge candidates with MV dependent offsets [T. Lee,
J. Chen, J. H. Park (Samsung), G. Laroche, P. Onno (Canon), J.-L. Lin, Y.-
W. Chen, Y.-W. Huang, S. Lei (MediaTek)]
This contribution presents additional Merge/AMVP candidates to compensate empty positions in the
fixed length candidates list. The additional Merge/AMVP candidates are produced by adding offsets to
the first existing candidate where the offset value is chosen according to the first candidate’s MV value.
Relative to the HM4.0+MRG_ENC_FIX which is reported in JCTVC-G776, experimental results showed
-0.3% BR savings in average of results in RA-HE, RA-LC, LB-HE, LB-LC, LP-HE, LP-LC.
Combination of CE13 tests. Was discussed again when the conclusion on CE9/13 tests was discussed (see
notes elsewhere).
A concern was raised regarding losses in chroma and luma for class E sequences. This may be due to
removing zero motion vector merge candidates.
The fast encoding method can also be applied to other proposals.
JCTVC-G209 Modified method for coding mvd in the CABAC mode [S.-T. Hsiang, S. Lei
(MediaTek)]
This contribution proposes a modified method for coding the absolute value of each component of vectors
mvd_l0 or mvd_l1 in the CABAC mode. The proposed method attempts to further utilize context
modeling for coding the EG 1 prefix bins. Experimental results reportedly show Y BD-rate gains 0.1%,
and 0.0% for HE-RA and HE-LDB, respectively, under the common test conditions and Y BD-rate gains
0.3%, 0.1% for HE-RA and HE-LDB, respectively, over QP = 32, 37, 42, and 47.
Increases complexity. No significant gain.
No interest expressed
JCTVC-G705 CABAC simplification for explicit signaling mode of AMVP [K. Misra, A.
Segall (Sharp)]
This document proposes to remove the CABAC context models for the motion vector predictor index,
mvp_idx_lX (mvp_idx_l0, mvp_idx_l1, mvp_idx_lc), and use CABAC bypass mode instead. It is asserted
that this change reduces the number of CABAC contexts in memory and eliminates the associated
CABAC context update step while having negligible impact on compression efficiency. For HM-4.0, high
efficiency common test conditions, the proposed change shows an average BD bitrate impact of –
(Without Class F sequences)
RA_HE Y:0.0% U:0.0% V:0.0%; LB_HE Y:0.0% U:0.1% V:0.1%; LP_HE Y:0.0% U:0.2% V:-0.1%.
JCTVC-G785 Unified Pred_type coding in CABAC [Y. Piao, J. Min, J.H. Park (Samsung)]
This document proposes to unify pred_type coding of slice B and P in CABAC. Prediction mode and
partition mode coding in slice B is modified to be same as that of slice P to simplify the pred_type coding
and enhance the readability of specification. The average BD-rate by this unification is 0.01% in RA and
0.03% in LD high efficiency configuration. With bug fix reported in JCTVC-G655, the average BD-rate
is 0.xx% in RA and 0.xx% in LD.
No results yet on bug fix
In software implementation, there are some restrictions on B slice. There are also some mismatches
between text and software for P slices, so this one would introduce this bug in B as well.
G151 is related (does it the other way round) (both were mutually cross-checkers), also G718.
(there was the intention that intra would be more probable in P than NxN and 2Nx2N, which is the only
difference)
Disposition:
- We want only one solution
- Side activity (joint with WD and software coordinator, see result in JCTVC-G1042)
JCTVC-G149 Options for High-Level Syntax for Multistandard Scalability [S. Wenger
(Vidyo)]
(Submitted as an information document.)
Discussed are options for high level syntax for multistandard scalability, with a focus on the case of an
AVC or SVC base layer, and one or more spatial/SNR HEVC enhancement layers, and with RTP
transport/multiplexing. A few remarks are included with respect to other standards and other
transport/multiplexing schemes. It is the author’s belief that multistandard-scalability can be enabled with
only moderate increases in high level syntax complexity.
In AVC there are 6 remaining “reserved” (falling into two categories of allowed NAL unit ordering
behaviour) and 9 “unspecified” (at least 6 of which have been used by others).
The emphasized potential approach is to use external multiplexing (e.g. through RTP), although this may
require a relatively tight design coupling with specification of external interface points.
It was noted that the NAL unit type design for HEVC uses a different number of bits than AVC.
JCTVC-G079 SEI message for display orientation information [J. Boyce, D. Hong, S.
Wenger (Vidyo)]
This contribution proposes an SEI message for describing display orientation information, to be included
in an amendment to include in the HEVC design (and in AVC, although that is outside the scope of JCT-
VC). The proposed SEI message indicates to the renderer a request to rotate and/or flip the decoded
picture for proper display, after the normal decoding process. Because handheld video capturing devices
allow changing the picture capture orientation dynamically, using an SEI message allows dynamic
changes to the picture display orientation, temporally aligned with the compressed video data.
The same proposal is being made to MPEG as m21659, and to VCEG as T09-SG16-C-0690, for
consideration as an amendment to AVC. The concept was originally proposed in contribution JCTVC-
E280.
It was asked why the semantics are in units of degrees, and in whole-integer units in particular, and why
this is variable-length coded. The contributor indicated that the VLC coding was an error, and that there
was no particular need for whole-degree units.
A participant remarked that the persistence syntax should be checked.
Another participant expressed potential interest in three-dimensional coordinates rather than just rotation
(and flip).
It was noted that there was a “shall” in the proposed text, which was indicated to be an error.
Our plan of action is to wait to see what the parent bodies do with this in the AVC context and coordinate
the outcome.
JCTVC-G092 AHG22: High-level signaling of lossless coding mode in HEVC [M. Zhou
(TI)]
Efficient lossless coding is asserted to be required for real-world applications such as automotive
vision, video conferencing and long distance education. This contribution proposes signalling methods
to enable lossless coding at picture, region and LCU levels. Specifically,
sps_lossless_coding_enabled_flag and pps_lossless_coding_enabled_flag are defined in the SPS and PPS to
signal whether a group of pictures is encoded losslessly. If pps_lossless_coding_enabled_flag is not
set in the PPS, aps_lossless_coding_enabled_flag is defined in the APS to indicate whether regions of a
picture are encoded losslessly; if so, the lossless coding region information is encoded in the APS. If
aps_lossless_coding_enabled_flag is not set in the APS, slice_lossless_coding_enabled_flag is defined
in slice_header() to signal whether a 1-bit lossless coding flag is present at the LCU level. This
three-level signaling method provides a flexible way of signaling lossless coding for different use cases.
This proposal focuses only on high-level syntax for support of this functionality. (Obviously, that is only
desirable if the low-level syntax also supports this functionality.)
Further study of lossless coding has already been (and remains) encouraged, and we have had AHG22 to
investigate this topic.
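The signalling cascade described above can be read as a simple decision chain. The sketch below is one
possible reading, not the proposed syntax: the function and parameter names are hypothetical, and the
assumption that the SPS flag gates all lower levels is not stated in the notes.

```python
# Hypothetical sketch of the three-level lossless signalling cascade in G092.

def lcu_is_lossless(sps_flag, pps_flag, aps_flag, slice_flag,
                    lcu_in_region=False, lcu_flag=False):
    """Decide whether one LCU is coded losslessly.

    ASSUMPTION: sps_flag == False disables lossless coding everywhere.
    """
    if not sps_flag:
        return False
    if pps_flag:            # picture level: the whole picture is lossless
        return True
    if aps_flag:            # region level: lossless regions listed in the APS
        return lcu_in_region
    if slice_flag:          # LCU level: a 1-bit flag is sent per LCU
        return lcu_flag
    return False


if __name__ == "__main__":
    print(lcu_is_lossless(True, False, True, False, lcu_in_region=True))
    print(lcu_is_lossless(True, False, False, True, lcu_flag=False))
```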
JCTVC-G583 Reducing output delay for "bumping" process [J. Samuelsson, R. Sjöberg
(Ericsson)]
(Discussion chaired by J. Boyce.)
This document proposes to add a flag to the slice header called output_process_flag and that the
“bumping” process is replaced by an output process invoked based on the value of output_process_flag of
JCTVC-G325 AHG15: Picture size signaling [Y. Chen, Y. -K. Wang, M. Karczewicz
(Qualcomm)]
The current HEVC WD signals the decoded picture size, for both width and height, in luma samples. In
this document, it is proposed to signal the coded picture size in units of LCUs and, in addition, to
signal the offset between the coded picture size and the decoded picture size in units of SCUs.
Furthermore, this document raises a discussion on the value range of picture sizes.
Remarks:
What about the cropping window? That would still be used on top of this, as in AVC.
It was suggested that if the picture size is signalled in LCU and SCU units, those should be
signalled before the parameters that depend on them.
It was noted that the proposed signalling, which would require computations involving several
variables to determine the width of the picture, seems undesirably obtuse. For now, let’s keep the
current syntax element pic_width_in_luma_samples. Decision: However, the width and height
should be coded using ue(v) rather than u(16). No range needs to be directly specified, as this
would be a profile/level constraint.
The current software requires the image to be an integer multiple of the SCU size. The cropping
window is presumed to apply as in AVC to support other picture sizes.
We note that this means that the actual width of an encoded picture to support a particular image
width in luma samples (e.g., a picture 14 samples wide) becomes a function of the selected SCU
size, which seems undesirable.
Where should be the boundary for picture extrapolation for motion compensation purposes?
The subject needs further thought.
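Two points from the remarks above can be made concrete with a short Python sketch: the ue(v)
(Exp-Golomb) code length for a picture dimension, and the dependence of the coded width on the SCU
size. The helper names are hypothetical, and the SCU sizes used in the example are chosen only for
illustration.

```python
# Hypothetical helpers illustrating ue(v) coding and SCU-aligned picture sizes.

def ue_v(value):
    """Exp-Golomb ue(v) bit string for a non-negative integer."""
    if value < 0:
        raise ValueError("ue(v) codes non-negative integers only")
    code = value + 1
    return "0" * (code.bit_length() - 1) + format(code, "b")


def coded_size(size_in_luma_samples, scu_size):
    """Round a dimension up to a multiple of the SCU size.

    Returns (coded size, number of samples to remove by the cropping window).
    """
    coded = -(-size_in_luma_samples // scu_size) * scu_size
    return coded, coded - size_in_luma_samples


if __name__ == "__main__":
    print(ue_v(3), len(ue_v(1920)))
    # The 14-sample-wide example from the notes, for two SCU sizes:
    print(coded_size(14, 8), coded_size(14, 32))
```

The second print shows the concern raised in the notes: the same 14-sample width yields different
coded widths depending on the selected SCU size.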
JCTVC-G566 Syntax refinements for SAO and ALF [S. Esenlik, M. Narroschke, T. Wedi
(Panasonic)]
(Initial discussion in Track A.)
The Adaptation Parameter Set (APS) syntax structure was adopted during the 6th JCT-VC meeting in
Torino for conveying the parameter sets of the Adaptive Loop Filter (ALF) and Sample Adaptive Offset
(SAO). Three issues related to APS and ALF/SAO syntax are addressed. The problems stated in this contribution are
resolved by three minor modifications in the slice header syntax structure and one minor modification in
the alf_cu_control_param syntax structure.
(1) Introduction of new flags in slice header (ALF_flag and SAO_flag). ALF_flag eliminates dependency
of decoding APS and slice header. (2) ALF_CU_control_flag is coded by fixed code.
1500Byte slice: HE: xx/0.0/0.1/xx LC: xx/0.0/0.0xx
If higher Qp is used, 0.1% gain is observed.
Comments from BoG: Worth to have ALF_flag in slice header (as the loss of this flag would affect
parsing). Support not using CABAC on ALF_cu_control_flag. SAO_flag might be useful in slice header
when previous frame’s APS parameters shall be re-used, but SAO flag in slice header shall overwrite the
previous APS. As this relates to concepts of using APS parameters, this relates to high-level syntax
(APS) discussion.
Decision: Adopt duplicate ALF_flag in slice header (value of duplicate flag must always match), discard
using CABAC for ALF_cu_control_flag.
Further study on duplicate SAO flag in slice header, as this may be conditional on decision on APS
concepts (re-use of previous SAO). May be discussed in track B.
(Further discussion in Track B.)
To Wenger BoG.
JCTVC-G122 VLC for high level syntax (ALF and SAO parameters) [V. Sze (TI)]
To Wenger BoG.
JCTVC-G606 Crosscheck - VLC for high level syntax (ALF and SAO parameters) (G122)
[T. Nguyen (Fraunhofer HHI)] [late]
To Wenger BoG.
JCTVC-G334 AHG15: On sequence parameter set and picture parameter set [Y. -K.
Wang, Y. Chen, Y. Zheng, W. -J. Chien (Qualcomm)]
This document includes some discussions on some SPS and PPS syntax elements, on their value ranges,
syntax element coding, and/or semantics.
Value ranges were discussed. It is clear that value ranges need to be specified, although it may be
appropriate for the range to be specified in the Profile/Level definitions rather than in the semantics
section in some cases.
Regarding the part of the proposal on a PPS-level flag to disable the temporal MV predictor, the
contributor indicated that it is not necessary to further consider this.
5.9.4 Tiles
BoG [M. Horowitz] to discuss tiles and wavefronts.
JCTVC-G968 A Cross-check report for JCTVC-G183 on low delay tile [K. Misra, A. Segall
(Sharp)] [late]
JCTVC-G194 AHG4: Non-cross-tiles loop filtering for independent tiles [C.-Y. Tsai, C.-W.
Hsu, C.-Y. Chen, C.-M. Fu, Y.-W. Huang, S. Lei (MediaTek), A. Fuldseth
(Cisco)]
Same concept as disabling filtering across slice boundaries as adopted in Daegu in response to JCTVC-
D128.
Decision: Adopted (as recommended by BoG, see G1025).
JCTVC-G197 AHG4: Low latency CABAC initialization for dependent tiles [C.-W. Hsu,
C.-Y. Tsai, Y.-W. Huang, S. Lei (MediaTek)]
This proposal included not only CABAC initialization, but also requiring entry point signalling for
dependent tiles. (Entry points were already present for independent tiles.)
Decision: Adopted (as recommended by BoG, see G1025).
JCTVC-G315 AHG4: Unification of picture partitioning schemes [M. Coban, Y. -K. Wang,
M. Karczewicz, Y. Chen, I. S. Chong (Qualcomm)]
JCTVC-G317 AHG4: Dependency and loop filtering control over tile boundaries [Y. -K.
Wang, Y. Chen, I. S. Chong, M. Coban, M. Karczewicz (Qualcomm)]
JCTVC-G318 AHG4: Tile groups [Y. -K. Wang, Y. Chen, M. Coban, M. Karczewicz
(Qualcomm)]
JCTVC-G454 Parallel processing of ALF and SAO for tiles [K. Sugimoto, A. Minezawa, S.
Sekiguchi (Mitsubishi)]
JCTVC-G199 AHG4: Wavefront tile parallel processing [C.-W. Hsu, C.-Y. Tsai, Y.-W.
Huang, S. Lei (MediaTek)]
JCTVC-G722 Harmonization of entry points for tiles and wavefront processing [A. Segall,
K. Misra (Sharp)] [late]
JCTVC-G331 AHG 15: On NAL unit types and slice types [Y. -K. Wang, Y. Chen
(Qualcomm)]
At the previous JCT-VC meeting, it was agreed to change nal_ref_idc (2 bits) to nal_ref_flag (1 bit), and
change nal_unit_type from 5 bits to 6 bits. Consequently, the total number of hypothetically possible
NAL unit types doubled from 32 to 64. This document proposes an allocation of the 64 values to different
categories of NAL unit types, and raises some NAL unit type related questions for discussion.
Make access unit delimiters mandatory? (without a decoding process) (Put POC in there?) Put it in SEI,
and make SEI the first NAL unit of the picture?
Also, it was proposed to add slice types 3 to 5, with similar semantics as slice types 5 to 7 in AVC. No
action taken – it doesn’t seem clear that this trick had substantial value for AVC, and it was informally
reported that some encoders may have violated the constraint (if the values cannot be relied on, the trick
is useless).
Some other aspects were discussed in the contribution, such as slice-level SEI.
For further study.
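The header layout referred to at the start of this item can be illustrated with a small parser. This
is a sketch only: the function name is hypothetical, and placing a forbidden zero bit ahead of
nal_ref_flag (as in AVC) is an assumption; the notes only state the 1-bit nal_ref_flag and the 6-bit
nal_unit_type.

```python
# Hypothetical sketch: split the first NAL unit header byte into
# forbidden_zero_bit (1 bit, ASSUMED), nal_ref_flag (1 bit), nal_unit_type (6 bits).

def parse_nal_header_byte(byte):
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a single byte value")
    forbidden_zero_bit = (byte >> 7) & 0x1
    nal_ref_flag = (byte >> 6) & 0x1
    nal_unit_type = byte & 0x3F     # 6 bits: 64 possible NAL unit types
    return forbidden_zero_bit, nal_ref_flag, nal_unit_type


if __name__ == "__main__":
    print(parse_nal_header_byte(0x45))
```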
JCTVC-G158 Undiscardable Leading Picture for CRA [Hendry, S. Park, B. Jeon (LG)]
In the current WD 4 of HEVC, the decoder flushes all reference pictures in the Decoded Picture Buffer
(DPB) prior to decoding the first key picture that follows a CRA picture in decoding order. This
contribution suggests that in some use cases certain reference leading pictures, called Undiscardable
Leading Pictures (ULPs), should not be flushed prior to decoding the key picture and should be allowed
as references for inter prediction of pictures that follow the key picture, in order to improve coding
efficiency. The contribution proposes syntax and semantics of new elements to signal ULPs in the
header of the CRA slice. It is reported that, using special input sequences that contain a scene change
before the CRA picture, a modified HM-4.0 that implements the ULP concept gives gains of 0.2% Y, 0.2% U,
0.2% V for RAHE and 0.3% Y, 0.1% U, 0.2% V for RALC.
Suggestion: This is already supported, by using a recovery point SEI message (already included in
HEVC, in principle), rather than using IDR or CRA.
JCTVC-G533 On syntax for clean random access (CRA) pictures [Y. Park, H. Yang, C.
Kim (Samsung)]
This contribution discussed syntax for clean random access (CRA) pictures.
Some new material had been added in revisions of the original contribution.
The contributor assumed that the “leading pictures” are decoded and displayed, which is not the model of
what we refer to as CRA. As currently specified, decoder behaviour for random access is not specified (as
this functionality is considered out of the scope of the standard).
It was suggested that some of the concerns expressed in the contribution might be addressed by adding
some informative note in the standard to describe how a system can use CRA pictures (e.g. with some
external signal to indicate that random access is being performed or with some external discarding of
“leading pictures” from the bitstream before the decoder processes the remaining data).
JCTVC-G584 Temporal layer access pictures and CRA [Jonatan Samuelsson, Rickard
Sjöberg (Ericsson)]
The contribution explores the relationship between the current CRA design and temporal layering.
This contribution presents a proposal to change the signaling of Clean Random Access (CRA) pictures
and Temporal Layer Switching Points in what is referred to as Temporal Layer Access (TLA) pictures. It
is proposed to replace the CRA Network Abstraction Layer (NAL) unit type with a TLA NAL unit type.
The proposed TLA NAL unit type imposes constraints on the bitstream and does not have an impact on
the decoding process.
It is stated in the contribution that both random access information and temporal layer switching
information is of high value to a network node and thus should be available in NAL unit header,
independent of data outside that NAL unit header, specifically in the Sequence Parameter Set (SPS) and
Picture Parameter Set (PPS). It is further stated that a unified syntax and definition of TLA pictures
makes the standard text more readable and comprehensive.
It was remarked that a CRA must be an intra picture.
JCTVC-G157 Reference List Construction for Random Access Settings [Hendry, S. Park,
B. Jeon (LG)]
(This had not been reviewed in the BoG relating to DPB topics.)
In the 6th JCT-VC meeting, based on contribution JCTVC-F433 and JCTVC-F701, reference picture list
construction by using 3 higher quality and 1 nearest reference pictures has been adopted in the common
conditions for low delay settings. This contribution proposes to construct default reference picture lists
differently. When constructing RefPicListX, the proposed scheme suggests sorting reference pictures in
Decoded Picture Buffer by POC first and then by the picture-level QP value relative to the QP value of
current picture, instead of only by POC as it is done currently.
It was noted that the described behaviour is only correct for B pictures, and that the description is only a
matter of the default list order; the default can be changed by reference picture list modification syntax. If
this behaviour is desirable, it can be done explicitly by the encoder in this way.
It was remarked that this adds an extra sorting step and complication to the initialization.
It was noted that explicit mode reference picture marking can also be used to change the default list
initialization values.
It was noted that larger aspects of reference picture list construction are being considered, and remarked
that this may be an over-optimization relative to excessively emphasizing our common conditions
configurations, which are not part of the standard – rather, they are just a matter of how we are using the
standard in some example tests.
It was noted that the selection of QP values could end up being manipulated for purposes of reference
picture list construction, which seems like an unusual repurposing, and might result in sending more PPS
syntax structures so that this manipulation can be done.
The group found it interesting that gain was being reported due to using a different set and ordering of
reference pictures than what is our current common conditions. So there could be an opportunity here for
non-normative coding efficiency improvement of future common conditions (0.4% Y, 0.4% U, and
0.4% V for RA HE).
It was noted that G589 showed some gain (0.2%) relative to our current common conditions while using
much less picture storage. See notes elsewhere. This aspect was discussed again on Nov 29 (FB
chairing). Further study was encouraged.
JCTVC-G166 AHG21: Explicit Reference Pictures Signaling with Output Latency Count
Scheme [Hendry, S. Park, B. Jeon (LG)]
Reviewed in BoG.
JCTVC-G198 AHG21: Inter reference picture set prediction syntax and semantics. [T.K.
Tan, C.S. Boon (NTT Docomo)]
Document JCTVC-F493 proposed the explicit signaling of reference pictures needed for the inter
prediction of the current and future pictures, using buffer descriptions (reference picture sets). A
reference picture set is a set of ΔPOC values. ΔPOC values are the picture order counts (POC) of the reference
pictures relative to the current picture. Template reference picture sets are signaled in the picture parameter
set (PPS) and referred to by each slice.
This contribution proposes to further reduce the amount of bits necessary for signaling the reference
picture set by predicting the ΔPOC values using the ΔPOC values from a reference picture set already
present in the PPS.
Based on the latest draft of the reference picture set syntax from the ad hoc group on Reference picture
buffering and list construction (AHG21), the number of additional PPS signaling bits needed for the
random access (RA) and low delay (LD) common conditions are reported to be 288 and 201 bits,
respectively. Using the proposed inter reference picture set prediction method, the numbers of bits needed
are reported to be reduced to 144 and 106 bits, respectively. This represents a reduction of 50% and 47%,
respectively.
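The idea of predicting one reference picture set from another can be sketched as follows. This is only an illustration under stated assumptions: the function name predict_rps, the single POC shift delta_rps, and the per-candidate keep flags are hypothetical and are not taken from the proposal text or the draft syntax.

```python
def predict_rps(ref_rps, delta_rps, keep_flags):
    """Derive a reference picture set (a list of delta-POC values) from an
    already-signalled one. Hypothetical sketch, not the normative syntax."""
    # Every picture in the reference set is re-expressed relative to the new
    # current picture. The picture that owned the reference set is itself a
    # candidate, at distance delta_rps from the new current picture.
    candidates = [d + delta_rps for d in ref_rps] + [delta_rps]
    if len(keep_flags) != len(candidates):
        raise ValueError("one keep flag is needed per candidate")
    # Only candidates whose flag is set are kept in the predicted set.
    return sorted(d for d, keep in zip(candidates, keep_flags) if keep)


# Picture A (4 POC units after picture B) uses these delta-POC values.
ref_rps = [-8, -10, -12, -16]
# Picture B reuses A's set shifted by +4, drops one entry, and adds A itself.
new_rps = predict_rps(ref_rps, 4, [1, 1, 1, 0, 1])
print(new_rps)  # [-8, -6, -4, 4]
```

In a cyclic picture structure, most sets differ from an earlier one only by such a shift, which is why sending the shift and a few flags can cost fewer bits than sending every ΔPOC value again.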
It was noted that there are multiple inputs that remain under consideration regarding the APS and details
of the RPS design.
It was suggested that this may be a degree of over-optimization within the context of a scheme that is not
yet a really settled area of the design. There are also multiple ideas on the table that are available for
compressing the number of bits needed for the RPSs. G643 was suggested as one example. However, it
was also suggested that this proposal has been well studied and implemented, has good text, etc., and
seems relatively mature. It was remarked that without this proposal, the current G1002 scheme would
have an obvious redundancy in relation to cyclic picture structure encoding.
Decision: Adopted (Part 1 “full inter-RPS prediction”).
JCTVC-G398 High-level Syntax: Marking process for non-TMVP pictures [B. Li (USTC),
J. Xu (Microsoft), H. Li (USTC)]
(Chaired by J. Boyce.)
JCTVC-G526 AHG21: Combined signaling for reference picture set [Y. Park, I.-K. Kim,
C. Kim (Samsung)]
Reviewed in BoG.
JCTVC-G546 On high-level syntax for maximum DPB size and frame latency [Y. Park, K.
P. Choi, C. Kim (Samsung)]
Chaired by J. Boyce.
JCTVC-E339 proposed to move max_dec_frame_buffering and num_reorder_frames from the optional
VUI to the mandatory SPS. JCTVC-F541 proposed to add max_latency_frames_plus1 or
max_latency_increase_plus1. This contribution proposes to move max_dec_frame_buffering to the SPS
and to add max_latency_frames_plus1 in the SPS, while leaving num_reorder_frames in the VUI without
change.
If an encoder doesn’t send the VUI parameters, capabilities determination by decoder would be
negatively impacted.
Without max latency, output is delayed.
Suggestion to also move num_reorder_frames to SPS.
Decision: Adopt: put three syntax elements in the SPS (max_dec_frame_buffering, num_reorder_frames,
and max_latency_increase). (Also JCTVC-G779)
JCTVC-G635 Coding with a unified reference picture list [M. Naccari (BBC), G. Van
Wallendael (Ghent University), M. Mrak, D. Flynn (BBC)]
The unified reference picture list (LU - List Unified) was presented in contribution JCTVC-F549 with the
aim of providing a simpler and more flexible structure to map the reference picture used during the inter
coding process in the HM codec. The main idea behind the LU is to simplify mapping of reference
pictures by using only a single reference list whereby reference frame pairs are stored. A reference pair
consists of two reference frames (in the case of bi-directional prediction) or one reference frame and a
null element (in the case of uni-directional prediction). It was asserted that the usage of the LU reduces
the bitstream parsing and enables adding/removing some combinations of references in a more flexible
fashion than the current HM design using two reference lists. In this context, this contribution addresses
the reference list indexes usage in the current HM 4.0 codec and describes an implementation of the LU
scheme based on the default HM reference settings. It is reported that the experimental results obtained
for this implementation show that the LU scheme can handle usual HM conditions, while providing a
space for more flexible selection of reference frames.
It was asked what the impact of the scheme on coding efficiency is. Some loss in compression was
reported. It was suggested that this was due to the fact that the different method of coding of the reference
picture indexes was not included in the R-D decision-making process. (At the previous meeting, some
gain had been shown when using a similar but slightly different scheme, in a usage that included the
scheme within the R-D optimization.)
It was remarked that this may require a difficult coupling of the encoder’s decision-making process of
joint selection of the two reference pictures to use for references. For example, if there are 10 pictures in
each list, then there would be 120 entries needed in the combined list. Initializing, reordering, and
managing such a large list might get difficult. The overhead for reference picture list reordering might be
large.
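The figure of 120 entries can be reproduced with a short count. The sketch below assumes that the unified list holds one entry for every (list 0, list 1) pair plus one entry for every single picture paired with a null element; the function name is hypothetical.

```python
def unified_list_entries(num_l0, num_l1):
    """Count entries in a unified list of reference pairs (illustrative)."""
    bi_pairs = num_l0 * num_l1      # one entry per bi-predictive pair
    uni_entries = num_l0 + num_l1   # one entry per picture paired with null
    return bi_pairs + uni_entries


print(unified_list_entries(10, 10))  # 120
```

The count grows with the product of the two list sizes, which is the source of the concern about initializing and reordering such a list.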
It was remarked that the only clear benefit would seem to be simplification of the parsing of the reference
indexes at the PU level. Further study would be needed to identify and clarify whether a significant
benefit can be shown for this concept.
JCTVC-G549 Syntax rearrangement for list combination [Y. Park, S. Jeong, C. Kim
(Samsung)]
A list combination (LC) scheme was proposed for uni-prediction at B-slices to improve coding efficiency.
The current syntax bit ref_pic_list_combination_flag seems a bit redundant.
Proposes to put a default combined list length in the PPS and modify the slice-level syntax and remove
that bit.
There were some differences in the proposed design aspects in the newer version of the proposal.
JCTVC-G717 Improvements on reference picture buffering and list construction [Y. Yu,
K. Panusopone, X. Fang, L. Wang (Motorola Mobility)]
This contribution proposes changes of reference picture construction of combined list and an explicit way
for signalling collocated pictures according to the value of delta POC. The proposed scheme was reported
to be more efficient to build the combined list and signal the collocated picture.
The proposal assumes the RPS (G021) style of buffer control.
It is proposed to allow any picture within list 0 or list 1 (or perhaps within the RPS list) to be specified to
be the “collocated picture” (using an index syntax element in the RPS syntax or in the slice header).
Currently the collocated picture is always the first picture in list 0 or the first picture in list 1.
Some tests were done to see that the proposed technique works; however, there was no compression
benefit shown overall. Further study would be needed to determine whether there may be a significant
benefit for this concept.
Also proposed was a change of the default order of combined reference picture lists, based on pair-wise
minimization of POC distance. Test results were not provided, so the work seemed somewhat
preliminary, and further study would be needed to determine whether it has value.
JCTVC-G637 AHG21: Long-term pictures and pruning of reference picture sets [Rickard
Sjöberg, Jonatan Samuelsson (Ericsson)]
Reviewed in BoG.
JCTVC-G1036 Common conditions for reference picture marking and list construction
proposals [Y.-K. Wang, M. M. Hannuksela, T.K. Tan (editors)] [late] [miss]
JCTVC-G979 Consideration on the Hybrid Structure of Channel, Scene, and Object based
3D Audio Systems [Jeongil Seo, Kyeongok Kang] [late]
JCTVC-G1028 Non-CE4: Rate control friendly spatial QP prediction [H. Aoki, K. Chono
(NEC), M. Kobayashi, M. Shima (Canon), M. Coban, M. Karczewicz
(Qualcomm), K. Sato (Sony)] [late]
Compared to preceding QP in scan order, more gain would be achieved by this scheme than for the
comparison to HM 4. Further study in a CE.
JCTVC-G094 Non-CE4: Carriage of large block size quantization matrices with up-
sampling [M. Zhou, V. Sze (TI)]
Upsampling of quant matrices (somewhat discussed elsewhere). Detailed presentation did not seem
necessary at this time.
JCTVC-G152 Method and syntax for quantization matrices representation [X. Zhang, S.
Liu, S. Lei (MediaTek)]
Detailed presentation did not seem necessary at this time.
JCTVC-G578 Non-CE4: Quantization matrix compression and signaling [R. Joshi, J. Sole,
M. Karczewicz (Qualcomm)]
Detailed presentation did not seem necessary at this time.
JCTVC-G880 HVS Model based Default Quantization Matrices [M. Haque, A. Tabatabai,
Y. Morigami (Sony)] [late]
This document presents a set of default Quantization Matrices designed by using a Human Visual System
(HVS) Model for HEVC. A list of these matrices is provided in the appendix. The contributor reported
that the proposed matrices provided some subjective quality benefit (in informal subjective testing).
Decision: Adopt these as the starting point default values for 16x16 and 32x32 and use AVC defaults for
the other cases.
JCTVC-G1026 Using Multiple APSs for Quantization Matrix Parameters Signaling [Ming
Li, Ping Wu (ZTE), Junichi Tanaka, Yoshitaka Morigami, Teruhiko Suzuki,
Kazushi Sato (Sony)] [late]
This late contribution was provided as a step toward dealing with the issue of conditional update of APSs.
It was agreed that this should be studied as part of future AHG activity.
5.10.3 Dequantization
Definitions
B = source bit width (8 or 10 bit in the experiments described below)
DB = B-8 (internal bit-depth increase with 8-bit input)
N = transform size
M = log2(N)
Q = f(QP%6), where f(x) = {26214,23302,20560,18396,16384,14564}, x=0,…,5
IQ = g(QP%6), where g(x) = {40,45,51,57,64,72}, x=0,…,5
It is asserted that the worst case dequant has a 17 bit signed multiply range.
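The two tables defined above can be checked against each other numerically. The sketch below is an illustration only: it shows that each f(x)·g(x) product is close to 2^20 and that g(x) follows 64·2^((x−4)/6) after rounding. The variable names are hypothetical, and the closed-form model for g is an observation about the listed values, not something stated in these notes.

```python
Q_TABLE = [26214, 23302, 20560, 18396, 16384, 14564]   # f(QP % 6)
IQ_TABLE = [40, 45, 51, 57, 64, 72]                     # g(QP % 6)

# Forward and inverse scale factors are approximate reciprocals at a fixed
# 20-bit scale: their product stays within a few counts of 2**20.
for x in range(6):
    product = Q_TABLE[x] * IQ_TABLE[x]
    print(x, product, product - (1 << 20))

# The inverse scale grows by a factor of 2**(1/6) per QP step, so that the
# quantizer step size doubles every 6 QP values.
model = [round(64 * 2 ** ((x - 4) / 6)) for x in range(6)]
print(model == IQ_TABLE)  # True
```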
JCTVC-G723 Support for finer QP granularity in HEVC [K. Panusopone, A. Luthra, Xue
Fang, J. H. Kim, L. Wang (Motorola Mobility)]
Closely related to G721, does not need further detailed review.
JCTVC-G093 AHG22: Sample-based angular prediction (SAP) for HEVC lossless coding
[M. Zhou (TI)]
Efficient lossless coding is required for real-world applications such as automotive vision, video
conferencing and long-distance education. This contribution proposes to use sample-based angular intra
prediction (SAP) in lossless coding mode for better coding efficiency. The proposed sample-based
prediction is exactly the same as the HM4.0 block-based angular prediction in terms of prediction angles and
sample interpolation, requires no syntax or semantics changes, but differs in decoding process in terms of
reference sample selection. In the proposed method a sample to be predicted uses its direct neighboring
samples for better intra prediction accuracy. Compared to the HM4.0 anchor lossless method which
bypasses transform, quantization, de-blocking filter, SAO and ALF, the proposed method provides an
average gain of 6.71% in AI-HE, 7.83% in AI-LC, 2.29% in RA-HE, 2.64% in RA-LC, 1.57% in LB-HE
and 1.85% in LB-LC. For class F sequences only, the average gain is 8.95% in AI-HE, 8.88% in AI-LC,
5.64% in RA-HE, 5.61% in RA-LC, 4.58% in LB-HE and 4.61% in LB-LC. SAP is fully parallelized on
the encoder side, and can be executed at a speed of one row or one column per cycle on the decoder side.
Presentation not uploaded
The anchor is a modified HM with transform bypass, quant. bypass as described above, entropy coding is
used as is, i.e. the contexts are becoming spatial. Directional prediction “as is” is included here for intra
coding.
Goal is to use HM tool “as is” without major re-design.
JCTVC-G268 AHG22: Lossless Transforms for Lossless Coding [W. Dai, M. Krishnan, J.
Topiwala, P.Topiwala (FastVDO)]
This submission is based on proposal G266, “Lossless Core Transforms for HEVC,” and focuses on the
lossless transforms themselves. This work is submitted in reference to AHG22 on Lossless Coding. A
coding framework has been created in that AHG for development purposes, which at this time includes
prediction, and entropy coding, but bypasses transform and quantization. It is the purpose of this proposal
to supply lossless transforms, which can assist in data decorrelation, and potentially improve the
performance of the lossless coding framework. Such experiments will be conducted and reported in the
next meeting cycle.
The contribution explains how to construct lossless transforms by lifting.
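As a generic illustration of why lifting gives exactly invertible integer transforms, the following sketch implements a two-point S-transform (integer Haar). It is not the transform of the contribution; the function names are hypothetical, and the example only shows the principle that each lifting step can be undone by subtracting what was added.

```python
def forward_s_transform(a, b):
    """Two-point integer transform built from two lifting steps."""
    d = a - b            # predict step: difference of the pair
    s = b + (d >> 1)     # update step: floor of the average of a and b
    return s, d


def inverse_s_transform(s, d):
    """Undo the lifting steps in reverse order; exact for all integers."""
    b = s - (d >> 1)     # the same rounded term is subtracted again
    a = d + b
    return a, b


s, d = forward_s_transform(7, 2)
print(s, d)                       # 4 5
print(inverse_s_transform(s, d))  # (7, 2)
```

The rounding inside each step does not lose information, because the inverse recomputes the identical rounded quantity from data it already has.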
JCTVC-G664 AHG22: A lossless coding solution for HEVC [W. Gao, M. Jiang, Y. He, J.
Song, H. Yu (Huawei Technologies)]
This contribution proposes a lossless coding solution that only involves a few modifications to the current
HEVC WD. To achieve lossless coding in both intra and inter coding operations, the transform and
quantization modules are bypassed. Due to the nature of lossless coding, the existing intra predictions are
extended to pixel based prediction (DPCM), taking into account that there is no transform applied to
prediction residuals. Furthermore, since the statistical properties of prediction residuals are quite different
from those of transform coefficients, the CABAC coding of intra prediction residuals is also modified
accordingly.
No additional flag is introduced in this proposal; the lossless coding mode is signaled by QP_Y = 0. As a
result, no change to the HEVC syntax specification is needed, and the lossless mode can be applied to the
entire picture or to individual CUs conveniently.
Combination of CABAC and Golomb-Rice is used.
Anchor is HM with QP=0, which is still lossy. Compared to that, the proposed method saves roughly 9.5%
for AI and roughly 7% for LD B.
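Sample-based (DPCM) prediction in a lossless mode can be sketched for the simplest case, a single row predicted horizontally. This is an assumption-laden illustration: the function names are hypothetical, only the pure horizontal direction is shown, and the entropy coding of the residual is left out. Because the coding is lossless, the decoder reconstructs each sample exactly and can use it as the predictor for the next one.

```python
def dpcm_residual_row(samples, left_neighbor):
    """Predict each sample from its immediate left neighbour."""
    residual = []
    prev = left_neighbor
    for s in samples:
        residual.append(s - prev)   # residual is sent without transform
        prev = s                    # lossless: original equals reconstruction
    return residual


def dpcm_reconstruct_row(residual, left_neighbor):
    """Rebuild the row sample by sample from the residual."""
    samples = []
    prev = left_neighbor
    for r in residual:
        prev = prev + r
        samples.append(prev)
    return samples


row = [100, 102, 101, 105]
res = dpcm_residual_row(row, 98)
print(res)                            # [2, 2, -1, 4]
print(dpcm_reconstruct_row(res, 98))  # [100, 102, 101, 105]
```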
JCTVC-G264 AHG18: Adaptive Resolution Coding (ARC) [T. Davies (Cisco), P. Topiwala
(FastVDO)]
This contribution reports the results of further investigations into resolution adaptation. Adaptive
Resolution Coding (ARC) is described, in which resolution is selected dynamically by means of a
threshold test, and the compression performance investigated in a rapid bit-rate reduction/high QP
scenario against simply increasing QP with and without pre-filtering. Mean luma BD-Rate gains range
from 6.9-14.3% in Class A and 3.5-11.5% in Class B but performance depends greatly on picture content
and configuration. Individual luma BD-rate gains can range from negligible up to 30%. Typically chroma
has lower objective quality by a fixed penalty of around 0.5dB but this is asserted to have little subjective
impact. The contribution includes modified Working Draft text.
Resolution can be changed in both directions (up or down)
Heuristic criterion based on PSNR
Results for high QP
PSNR is computed at high resolution
Could the same be achieved by pre-filtering? It is said that some investigations on this were performed,
but the same performance was never achieved
Upsampling and downsampling filters would need to be normative (for the results, down- and upsampling
filters were aligned) – in contrast to that, in scalable coding only upsampling needs to be defined
An application scenario is transmission with large variation of bandwidth
A worst case would be when resolution changes every frame, which would require a more complex
decoder – would be necessary to restrict frequency of changes
Would the decoder need to keep both resolutions all the time? Can down- and upsampling be performed
on the fly in any case of frame structure?
One expert mentions that upsampling and downsampling filters should be consistent with possible
scalable extension. Put studying relation with scalable coding in mandates of AHG.
From Tue. 29th discussion:
Normative (in-loop) down- and upsampling is only necessary at the switching points (which should be
rare). Therefore, it may not be necessary in the context of ARC to define sophisticated
decimation/interpolation filters.
One expert mentions that reduced DPB memory could also be an advantage, where only lower resolution
references are stored, but prediction would be performed at high resolution.
Revisit: Conclusions on ARC were not possible Mon 28th evening as the relevant experts were not
present. Continue AHG
Investigate use of simple filters (e.g. bilinear) for the down- and upsampling switching
Investigate subjective quality, as frequent switching may be annoying
Investigate necessity of keeping both resolutions in buffer
Relationship with alternatives: pre / post processing
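To make the "simple filters" item above concrete, the following is a minimal one-dimensional sketch of 2:1 down- and upsampling with two-tap linear filters. Applying it separably to rows and columns would give a bilinear-style filter. The function names, the rounding, the border handling and the sample phase are all assumptions for illustration and do not come from the contribution.

```python
def downsample2(row):
    """Halve the resolution by averaging each pair of samples (rounded)."""
    return [(row[i] + row[i + 1] + 1) >> 1 for i in range(0, len(row) - 1, 2)]


def upsample2(row):
    """Double the resolution by linear interpolation between neighbours."""
    out = []
    for i, v in enumerate(row):
        nxt = row[i + 1] if i + 1 < len(row) else v  # repeat last sample at border
        out.append(v)
        out.append((v + nxt + 1) >> 1)
    return out


low = downsample2([10, 12, 20, 24, 30, 30])
print(low)             # [11, 22, 30]
print(upsample2(low))  # [11, 17, 22, 26, 30, 30]
```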
JCTVC-G971 Crosscheck of BBC's Transform Skip proposal [H. Yang, X. Zheng, H. Yu]
[late]
JCTVC-G586 Parallelizable context for significance coding of large transform blocks [J.
Kang, J. Lainema, A. Hallapuro, K. Ugur (Nokia)]
5.11.4 Other
JCTVC-G795 Crosscheck of TI’s JCTVC-G669 on delay dependent intra frame for video
conferencing [J. Min, Y. Piao, J.H. Park (Samsung)] [late]
5.12.1 CAVLC
JCTVC-G247 Non-CE5: CAVLC counters normalization per LCU [E. François, S. Pautet,
C. Gisquet (Canon)]
No need to review
JCTVC-G312 Non-Square Partition Mode Grouping for CAVLC [T. Yamamoto (Sharp)]
No need to review
JCTVC-G355 Non-CE5: joint coding of splitting flag and inter modes [W. Zhang, P. Wu
(ZTE)]
No need to review
JCTVC-G741 Cross-check report on joint coding of splitting flag and inter modes
(JCTVC-G355) [I.-K. Kim (Samsung)] [late]
JCTVC-G905 Crosscheck report of ZTE's joint coding of splitting flag and inter modes
(JCTVC-G355) [X. Wang (Qualcomm)] [late]
JCTVC-G365 Non-CE5: Redefined contexts for last nonzero coefficient coding of 4x4 TU
in CAVLC [J. Xu, A. Tabatabai (Sony)]
No need to review
5.12.2 CABAC
JCTVC-G829 Context modeling of split flag for CABAC [W. -J. Chien, M. Karczewicz]
Is discussed in context of CE1 BoG – done (see under G1022)
JCTVC-G849 Non-CE1: Crosscheck for Qualcomm's context modeling of split flag for
CABAC in JCTVC-G829 [T.-D. Chuang, Y.-W. Huang (MediaTek)] [late]
JCTVC-G326 Non-CE1.b: On the exponential memory decay probability update [J. Sole,
M. Karczewicz (Qualcomm)]
Modifications to the memory decay function for probability update in CABAC as presented in JCTVC-
F254 are proposed in order to have a symmetric estimator that disallows highly skewed distributions.
Further modifications include the replacement of the range multiplication by a series of shifts, thus
largely reducing the size of the LPS table. The BD-rates for AI-HE, RA-HE and LB-HE configurations
are 0.50%, -0.61%, and -0.55%, respectively.
Relates to G764
Much smaller LPS table (factor 288)
The table to be added would increase the current HEVC table size by factor 1.5
However, it is necessary to perform 5 shifts and 3 additions (could be replaced by multiply and one shift).
The added complexity would not justify adoption.
Combination with multi-parameter would be possible (as in CE1b) and increase performance but likely
increase table size by 5.
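For orientation, a generic exponential memory decay probability estimator can be written with shifts only. The sketch below is not the modified function proposed in G326 and does not cover the range computation discussed above. The 15-bit precision, the shift values 4 and 7, and the averaging of two estimators are illustrative assumptions about a multi-parameter update of the F254 kind.

```python
PROB_BITS = 15
ONE = 1 << PROB_BITS


def update_probability(p, bin_value, shift):
    """Move the estimate of P(bin == 1) toward the observed bin value.

    The step is 1 / 2**shift of the remaining distance, so older bins are
    forgotten exponentially; a larger shift means slower adaptation."""
    if bin_value:
        return p + ((ONE - p) >> shift)
    return p - (p >> shift)


def update_two_windows(p_fast, p_slow, bin_value, fast_shift=4, slow_shift=7):
    """Keep a fast and a slow estimator and use their average for coding."""
    p_fast = update_probability(p_fast, bin_value, fast_shift)
    p_slow = update_probability(p_slow, bin_value, slow_shift)
    return p_fast, p_slow, (p_fast + p_slow) >> 1


print(update_probability(ONE // 2, 1, 4))  # 17408
print(update_probability(ONE // 2, 0, 4))  # 15360
```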
JCTVC-G413 Modified probability update and table removal for multi-parameter CABAC
update (F254) [C. Rosewarne, M. Maeda (Canon)]
This contribution presents a method of updating probability estimates of a context model that enables the
substitution of the look-up table required for determining an offset for the range update with a function.
The intention of removing the look-up table is to reduce area for hardware implementations. The intention
of the modification to the probability estimate updating method is to remove one operation from the
substituted function, which we assert provides a timing benefit for hardware implementation. When
implemented on the multi-parameter version of JCTVC-F254 integrated in HM-4.0, the proposed
technique has a 0.01% increase in AI-HE, a 0.02% increase in RA-HE, a 0.00% effect on LB-HE and a
0.01% increase in LP-HE.
Restrict range of probability update operation.
Bitrate reduction 0.9%, 0.9%, 0.8% and 0.4% for AI, RA, LDB and LDP
JCTVC-G805 Cross-check of Canon’s modified probability update and table removal for
multi-parameter CABAC update (JCTVC-G413) [J. Sole (Qualcomm)] [late]
JCTVC-G440 Non-CE1: Modified probability model update for complexity reduction [A.
Tanizawa, T. Shiodera, T. Yamakage (Toshiba)]
This contribution presents a technique to reduce complexity for CABAC probability model update
process. In this contribution, the coding efficiency of probability model update for several syntax
elements was evaluated, and a recommended combination is proposed.
Experimental results with the probability model update disabled for six syntax elements show less than
0.1% BD-rate change for the four (IO/RA/LB/LP) HE conditions.
One comment: Invoking probability process depending on check of the given syntax element may be even
more complex than always doing it.
JCTVC-G833 Non-CE1: Adaptive initialization for CABAC with fixed probability contexts
[L. Guo, J. Sole, X. Wang, M. Karczewicz (Qualcomm)]
This contribution describes an adaptive initialization method for CABAC with fixed probability contexts.
At the beginning of each slice, a context can be initialized in one of two ways: 1) initialized using the pre-
defined (m,n) value (i.e., original HM method), or 2) initialized using a state value selected by the
encoder. For each context, the encoder signals the selection of initialization method to the decoder as well
as the new state value (if the second method is selected). No probability/state update will be performed in
the encoding/decoding process, and thus the proposed method is friendly to possible parallel-processing
applications. Experiments in HE compared to CABAC with context update show a BD rate change of
0.3%, 2.4%, 2.8%, 2.6% for AI, RA, LD and LDP, respectively. If combined with a new set of (m,n) that
JCTVC-G324 Modified LPS range and state transition tables for BAC [J. Sole, M.
Karczewicz (Qualcomm)]
The proposal modifies the binary arithmetic coder to allow a slower probability adaptation process and
the coding of more skewed binary distributions than HM4.0. The entries of the range LPS table are
changed and the number of probability states increased from 64 to 128. BD-rate for AI-HE, RA-HE and
LB-HE configuration is -0.34%, -0.26%, and -0.05%, respectively. When the range LPS table size is
divided by 2, the BD-rates are -0.26%, -0.21%, and 0.07% for AI-HE, RA-HE and LB-HE, respectively.
A third variant using bit-shift operations that reduces the table size by 25% provides BD-rates of -0.36%,
-0.28%, and -0.13%.
Results are better for larger resolutions
Interest expressed: Versions of full and half table size
Investigate in CE
With properly adjusted initialization, results might even be better (include in CE plan)
JCTVC-G492 Maximum VLC Limits in CABAC Escape Coding [K. Sharman, J. Gamei,
N. Saunders, P. Silcock (Sony)]
This contribution presents two alternative sets of g_auiGoRicePrefixLen and g_auiGoRiceRange values,
which are used for CABAC escape codes. Possible inconsistencies and omissions in and between WD4
text and HM4.0 source code were also presented.
There is certainly some inconsistency.
Decision: adopt the following version (relative to the current software):
const UInt g_auiGoRicePrefixLen[4] = {8, 10, 10, 8}; (differs from the 4.1 software in the 3rd element)
const UInt g_auiGoRiceRange[4] = {7, 20, 42, 70}; (already in the 4.1 software)
G700 was pointing out the same issue.
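The two adopted tables are consistent with each other under one simple reading, checked below. The sketch assumes that the table index is the Rice parameter k and that the largest value sent through the Rice part is the range value plus one (the escape indication), so that the longest prefix is that value shifted right by k. These are assumptions about how the software uses the tables and are not stated in these notes.

```python
GO_RICE_RANGE = [7, 20, 42, 70]        # adopted g_auiGoRiceRange values
GO_RICE_PREFIX_LEN = [8, 10, 10, 8]    # adopted g_auiGoRicePrefixLen values

# Longest prefix needed for Rice parameter k under the stated assumption.
derived = [(r + 1) >> k for k, r in enumerate(GO_RICE_RANGE)]
print(derived)                         # [8, 10, 10, 8]
print(derived == GO_RICE_PREFIX_LEN)   # True
```

Under this reading, a third element other than 10 would not match the range value of 42.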
JCTVC-G837 Non-CE1: 8-bit Initialization for CABAC [L. Guo, R. Joshi, J. Sole, X.
Wang, M. Karczewicz (Qualcomm)]
This contribution describes an 8-bit initialization method for CABAC. The 8-bit m (slope) and 8-bit n
(intersection) are replaced by a single 8-bit InitIdx for each context. SlopeTable (16 elements) and
IntersecTable (16 elements) are introduced to convert the 8-bit InitIdx to a CABAC probability state in
the initialization stage. Two sets of table values are presented in this contribution. For both of these two
sets, the table look-up operations can be implemented using formula calculation and thus the table storage
5.12.3 Other
JCTVC-G659 Non CE1: Study of Entropy Coding Methods Complexity [M. Karczewicz,
I.S. Chong, X. Wang, R. Joshi (Qualcomm)]
This document summarizes the key issues in hardware implementation of CABAC and CAVLC
decoding. Throughput numbers obtained for these methods for H.264/AVC are quoted. Statistics of
number of symbols and bins decoded per frame for the current test conditions are given.
Presentation not uploaded.
Informative contribution. Claimed that there is still reason to reduce throughput in CABAC
JCTVC-G568 HM 4.0 entropy coding complexity study and software improvements [M.
Viitanen, J. Vanne, T. D. Hämäläinen (TUT), J. Lainema, K. Ugur (Nokia)]
This contribution presents a software complexity analysis of CAVLC and CABAC entropy decoders. The
contribution also identifies some obsolete code as well as possibilities to avoid division operations in HM
software and proposes to clean up those in the next HM release. It is reported that the HM4 CABAC
JCTVC-G569 Single entropy coder for HEVC with a high throughput binarization mode
[J. Lainema, K. Ugur, A. Hallapuro (Nokia)]
This contribution presents a single entropy coding architecture for HEVC. The proposed architecture
contains current HM 4.0 CABAC engine in its entirety. In addition it contains a high throughput
binarization (HTB) mode for transform coefficient coding that can be enabled when maximum throughput
is desirable. The high throughput binarization is identical to the current HM 4.0 CAVLC coefficient
coding where CAVLC codewords are fed to the bypass coding of CABAC. The intention of this approach
is to significantly reduce the worst-case complexity of CABAC, making it more suitable for low
complexity use cases. It is reported that the proposed method reduces the number of context adaptively
coded bins by 61 % under the common test conditions, and by 88 % in the low QP range (QP 2 to 17)
where the complexity of CABAC is reportedly more problematic. It is further reported that the proposed
approach can improve objective coding efficiency of low complexity configurations by -1.1 % (AI_LC), -
2.1 % (RA_LC), -2.7 % (LB_LC) and -3.1 % (LP_LC), while high efficiency results stay unaffected
when utilizing the existing high efficiency CABAC binarization.
Main problem in throughput: Coefficients
Suggest to have two different binarizations: High efficiency mode and high throughput mode
In high throughput mode, it is suggested to use CABAC binarization for motion/mode/cbf and CAVLC
binarization for coefficients
Uses only 30% of previous CAVLC syntax elements, and 15% of code tables.
How much faster did the software run for extreme low QP? Not analysed
Amount of CABAC bypass bins at lower QP?
JCTVC-G904 Cross Check of G381 [Ankur Saxena, Felix Fernandes (Samsung)] [late]
General
The main part of G112 and G381 is the same, relating to the order of syntax elements. The proposal was
asserted to provide a substantial hardware decoder complexity savings.
Decision: Adopted the common part of G112 and G381.
An aspect particular to G381 was noted, regarding the placement of chroma relative to luma in the
minimum transform size case. In this aspect, G381 was suggested as preferable.
Decision: Adopted this aspect of G381.
An aspect particular to G112 was noted, regarding interleaving of chroma and luma in IPCM mode.
Decision: Define MaxIPCMcuSize and require it to always be at most 32x32 (to provide text consistent
with ability to disable IPCM).
It was noted that G118 (avoiding sending other stuff between luma and chroma IPCM) has a relationship
with this, but seems non-conflicting in spirit.
Another aspect particular to G112 was behaviour with MinTrafoSize equal to MaxTrafoSize in regard to
chroma. Two possible solutions were suggested.
Decision: Establish MinChromaTrafoSize = (chroma_format == 4:4:4) ? MinTrafoSize :
( Max( MinTrafoSize − 1, 4x4 ) ).
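The decision above can be read as the following rule, sketched here under the assumption that transform sizes are expressed as log2 values, so that "4x4" corresponds to 2 and "MinTrafoSize − 1" means half the luma size in each dimension. The function name and the string form of the chroma format are hypothetical.

```python
def min_chroma_trafo_size_log2(chroma_format, min_trafo_size_log2):
    """Smallest chroma transform size (log2) for a given luma minimum."""
    if chroma_format == "4:4:4":
        # Chroma has full resolution, so it shares the luma minimum.
        return min_trafo_size_log2
    # Subsampled chroma: one size smaller than luma, but never below 4x4.
    return max(min_trafo_size_log2 - 1, 2)


print(min_chroma_trafo_size_log2("4:2:0", 2))  # 2
print(min_chroma_trafo_size_log2("4:2:0", 5))  # 4
print(min_chroma_trafo_size_log2("4:4:4", 3))  # 3
```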
JCTVC-G366 Non-CE11: Context reduction of significance map coding with CABAC [C.
Auyeung, J. Xu (Sony)]
JCTVC-G644 Multi-level Significant Maps for Large Transform Units [N. Nguyen, T. Ji,
D. He, G. Martin-Cocher, L. Song (RIM)]
The objective is to reduce the number of significant coefficient bins to be decoded for large transforms.
Average bin count benefit was shown (3-4%), especially on larger block sizes (14% AI, 12% RA, 30%
LB).
Part of this contribution depends on G323.
When combined with G323, some coding efficiency benefit was reported (0.1−0.3%) and it was reported
that more use of the larger block sizes occurs.
The technique was also reported to work with other types of scans (although it fits best structurally with
the sub-block scan scheme).
The scheme seems particularly "clean" in conjunction with the 4x4 sub-block scan of G323.
Response is quite positive, at least when used in conjunction with G323.
Decision: Adopt (whichever variant, depending on G323 decision).
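The bin saving from a two-level significance map can be illustrated with a simple count. The sketch assumes one group flag per 4x4 sub-block, with per-coefficient flags sent only for sub-blocks whose group flag is set. It ignores last-position signalling and any inference rules, so the numbers are indicative only and are not those of the contribution.

```python
def significance_bins(sig_map, sub=4):
    """Return (single_level_bins, multi_level_bins) for a square 0/1 map."""
    n = len(sig_map)
    single_level = n * n              # one flag per coefficient position
    multi_level = 0
    for by in range(0, n, sub):
        for bx in range(0, n, sub):
            has_coeff = any(sig_map[y][x]
                            for y in range(by, by + sub)
                            for x in range(bx, bx + sub))
            multi_level += 1          # group flag for this sub-block
            if has_coeff:
                multi_level += sub * sub  # flags inside a non-empty sub-block
    return single_level, multi_level


# 16x16 block whose only significant coefficient is the DC position.
sig = [[0] * 16 for _ in range(16)]
sig[0][0] = 1
print(significance_bins(sig))  # (256, 32)
```

The saving is largest for large transforms with few significant coefficients, which agrees with the observation above that the benefit grows with block size.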
JCTVC-G1001 Cross-check of RIM's multi-level significant maps for large transform units
(JCTVC-G644) J. Sole [late]
JCTVC-G657 Encoding and decoding significant coefficient flags for small Transform
Units using partition sets [G. Korodi, J. Zan, D. He (RIM)]
JCTVC-G768 Reduced contexts for significance map coding of large transform in CABAC
[Y. Piao, J. Min, J. H. Park (Samsung)]
JCTVC-G781 Reduced chroma contexts for significance map coding in CABAC [Y. Piao,
J. Min, E. Alshina, J. H. Park (Samsung)]
JCTVC-G917 On significance map coding for CABAC [V. Sze (TI)] [late]
JCTVC-G986 Fast algorithm and some comments on the significance map coding [H. Zhu]
[late]
Information document (late) – contributor not available – not presented.
Context selection
G917: simplification of context selection for high-frequency positions in order to increase parallelism.
This was asserted to enable up to 5 bins per cycle of parallelism.
Some skepticism was expressed about the need for this; since it was a late document, it was suggested to
defer its consideration to further study.
JCTVC-G239 Non-CE11: Modified method for coding the positions of last significant
coefficients in the CABAC mode [S.-T. Hsiang, S. Lei (MediaTek)]
JCTVC-G947 Crosscheck for Modified method for coding the positions of last significant
coefficients in the CABAC mode (JCTVC-G239) [G. Clare, F. Henry
(Orange FT)] [late]
JCTVC-G900 Cross-check of binarisation of last position for higher throughput (G370) [V.
Drugeon, M. Narroschke (Panasonic)] [late]
JCTVC-G520 Non-CE11: Modified Context Derivation for last coefficient flag [H. Sasai, T.
Nishi (Panasonic)]
JCTVC-G554 Grouping of bypass bins for last position coding of transform coefficients [I.-
K. Kim, V. Seregin, J. H. Park(Samsung)]
JCTVC-G704 Last position coding for CABAC [W.-J. Chien, J. Sole, M. Karczewicz
(Qualcomm)]
JCTVC-G301 Test Results On Context simplification for coefficients entropy coding [X.
Che, W. Ding, Y. Shi (Beijing Univ. Tech.)]
JCTVC-G783 Context number reduction for level coding in CABAC [Y. Piao, J. Min, J.H.
Park (Samsung)]
JCTVC-G125 Cross-check of proposal on context reduction for coefficient levels (G783) [V.
Sze (TI)] [late]
General
5.13.5 Scans
JCTVC-G226 Non-CE11: Extending MDCS to 16x16 and 32x32 TUs [C.-W. Hsu, X. Zhao,
X. Guo, Y.-W. Huang, S. Lei (MediaTek)]
Intended to improve coding efficiency by adding two more scans; measured impact was approximately
−0.1%. This gain seems insufficient.
JCTVC-G285 Non-CE11: Methods for Solving the Parsing Issue of MDCS [X. Zhao, X.
Guo, C.-W. Hsu, T.-D. Chuang, Y.-W. Huang, S. Lei (MediaTek)]
Scans depend on intra mode; the suggestion is to remove that dependency. Similar loss (0.2%) with a CE
proposal on the subject. Asserted to be simpler than the CE proposal, which was not adopted. However,
the group did not think that the design had a significant problem that needed to be solved.
JCTVC-G531 Cross verification of MediaTek’s proposed methods for solving the parsing
issue of MDCS (JCTVC-G285) [Y. Chiu, L. Xu (Intel)] [late]
JCTVC-G958 Non-CE11: Modified context selection for significant coefficient flags with
diagonal sub-block scan [C. Rosewarne, M. Maeda] [late]
Depends on G323.
In G323, there are two locations where the current flag depends on the previous one. G323 addresses this
by excluding the previous flag from the context selection. Here this is dealt with by making a guess about
the probable value of the missing neighbour flag.
JCTVC-G123 Non-CE11: Simplified Coefficient Scans for NSQT [V. Sze (TI)]
JCTVC-G724 Non-CE11: Entropy coding for non-square TU blocks [Vivian Kung, Krit
Panusopone (Motorola Mobility)]
General
Suggestion to harmonize G323 and G1015 – how to do last significant coefficient coding.
Side work to do this in BoG (J. Sole) and revisit.
JCTVC-G271 Sign Data Hiding [G. Clare (Orange Labs), F. Henry (Orange Labs)]
Concept previously proposed in JCTVC-A114 and in JCTVC-E428.
JCTVC-G889 Crosscheck of G271 on Sign Data Hiding [Andrea Gabriellini, Marta Mrak
(BBC)] [late]
JCTVC-G372 Coding order of sign and level minus 3 with CABAC [C. Auyeung, T. Suzuki
(Sony)]
Suggests to send the sign flag after the level_minus3 instead of before it, for improved throughput in a
hardware implementation. This approach was suggested to require less memory and have lower latency,
at least in some implementations.
In hardware it was suggested that the proposed approach may be preferable to enable earlier availability
of some of the transform coefficient values from the decoding process.
Another participant suggested that in another implementation (software-based), the change would not be
beneficial – that since the (16) sign bits can be stored with less memory than the level values, it is
preferable to store the sign bits than to store the level values. Also, the level_minus3 would have many
more bins to decode than the sign.
Support for adoption was only expressed by the proponent, and further study was recommended to
determine whether there is a significant problem with the current approach.
JCTVC-G744 Cross-check report on Coding order of sign and level minus 3 with CABAC
(JCTVC-G372) [I.-K. Kim (Samsung)]
5.13.8 CBF
See also notes for JCTVC-G718.
JCTVC-G444 Proposed fix on cbf flag signaling [A. Minezawa, K. Sugimoto, S. Sekiguchi
(Mitsubishi)]
Proposes to modify condition for inferring cbf flag. Identifies one additional case where cbf for luma can
be inferred.
Tried multiple alternate configurations for min/max CU and TU size settings.
Cross-checked in G760.
Details of proposal were not easily understood. Proponents were asked to discuss the proposal with WD
editors and report back.
Revisit
This was discussed again on Nov. 29. B. Bross confirmed the correctness of the proposal which would
not worsen the quality of text.
Decision: Adopt.
Discussion
G718 proposes to share the CBF contexts for the U and V components in order to reduce the number of
CABAC contexts (saving 5).
The result is a small (less than 1%) improvement in coding efficiency of V component.
Decision: Adopt this aspect of G718.
5.13.9 CAVLC
Not reviewed due to decision to remove CAVLC.
JCTVC-G537 Table reduction and Improvement of last position coding in CAVLC [C.
Kim, Y. Park, J.H. Park(Samsung)]
JCTVC-G685 Selective Run-Level Coding for CAVLC [S.-H. Kim, A. Segall (Sharp)]
JCTVC-G202 Non-CE2: Modified NSQT coefficient scan for CAVLC [C.-W. Hsu, Y.-W.
Huang, S. Lei (MediaTek)]
General
G201, G370, G554, part of G520
Group together the bypass bins as done in some other places
Decision: Adopted.
Remark: MSBs first for last_significant_coeff_x, _y, please
Decision: Adopted.
G239, G520, G704
Change of binarization to reduce bin count, reduction of number of contexts, simplification
of context selection
G239 was reported to do both, but contrib does not report the average (max 5 CABAC
coded bins for something rather than 16, no LUT for CABAC bins, adds 2*4 contexts,
0.1% improvement in coding eff)
G704 similar (drops 2 contexts)
G520 same binarization as somewhere else in the standard, reduction in number of
contexts (loss of efficiency 0.06-0.08%, up to 0.23% loss if the number of contexts is
reduced – asserted to have no loss if number of contexts is not reduced, but that doesn’t
seem to be described).
Decision: Adopt G704 (with adjustments above and below).
Remark: Harmonize unary code convention – should use 1's followed by a 0. Decision:
Agreed.
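The two bin-ordering conventions agreed above (unary codes written as 1's followed by a 0, and fixed-length bins sent MSB first) can be sketched as follows. The function names are hypothetical, and the truncated-unary variant is included only as an assumption for illustration; the notes do not state which binarization each syntax element uses.

```python
def unary_bins(value):
    """Unary code: `value` ones followed by a terminating zero."""
    return [1] * value + [0]


def truncated_unary_bins(value, max_value):
    """Truncated unary: the terminating zero is omitted at max_value."""
    if value < max_value:
        return [1] * value + [0]
    return [1] * max_value


def fixed_length_bins_msb_first(value, num_bits):
    """Fixed-length bins, most significant bit first."""
    return [(value >> shift) & 1 for shift in range(num_bits - 1, -1, -1)]
```

With these conventions, the value 3 becomes the unary string 1, 1, 1, 0, and the 3-bit value 5 becomes 1, 0, 1.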
JCTVC-G173 Cross-channel intra chroma residual prediction [Y. Chiu, Y. Han, L. Xu, W.
Zhang, H. Jiang (Intel)]
JCTVC-G346 Chroma intra prediction based on residual luma samples [K. Kawamura, T.
Yoshino, H. Kato, S. Naito (KDDI)]
JCTVC-G1009 A joint contribution on the coding tools of residual prediction for intra
chroma prediction [Y. Chiu, Y. Han, L. Xu, W. Zhang, H. Jiang (Intel), K.
Kawamura, T. Yoshino, H. Kato, S. Naito (KDDI)] [late]
JCTVC-G1024 Report of combining two coding tools for Chroma intra prediction (G173
and G358) [Xingyu Zhang, Oscar Au, Xing Wen, Yi-Jen Chiu, Yu Han,
Lidong Xu, Wenhao Zhang] [late]
JCTVC-G419 Inconsistency of intra LM mode between HM and WD [J. Lee, S.-C. Lim, H.
Y. Kim, J. S. Choi (ETRI)]
Partly on padding, and partly on prediction of chroma from luma.
JCTVC-G358 New modes for chroma intra prediction [X. Zhang, O. C. Au, J. Dai, F. Zou,
C. Pang, X. Wen (HKUST)]
JCTVC-G273 Crosscheck for JCTVC-G358 new modes for chroma intra prediction [J.
Dong (InterDigital)] [late]
JCTVC-G955 Joint contribution on the integration of several chroma coding tools [Gisquet
Christophe (Canon), Chiu Yi-Jen (Intel), Minezawa Akira (Mitsubishi),
Ichigaya Atsuro (NHK)] [late]
5.14.1.3 SDIP-related
JCTVC-G354 Non-CE6: Improvements for SDIP [J. Xu, E. Maani, A. Tabatabai (Sony)]
Discusses the new part of G558.
In this proposal, two modifications are proposed relative to SDIP for coding efficiency improvement.
First, the MPMs in intra mode coding for non-square CUs are modified, achieving a BD-rate saving of −0.04%
for AI_HE and −0.1% for AI_LC. Second, contexts for the significance map of non-square TUs are
redefined, which provides an additional −0.1% for AI_HE. Combining both algorithms, there is a −0.14%
BD-rate saving for AI_HE.
Within the context of SDIP, this was agreed to be an improvement of the reference design.
JCTVC-G804 Crosscheck for Sony's JCTVC-G354 on Improvements for SDIP [C. Lai, L.
Liu, J. Zheng (HiSilicon)] [late]
JCTVC-G135 Non-CE6: Rectangular (2NxN and Nx2N) Intra Prediction [S. Liu,
X. Zhang, Z. Zhou, S. Lei (MediaTek)]
This contribution proposes to add two new intra prediction modes, i.e. 2NxN and Nx2N Intra prediction
to the existing 2Nx2N and NxN (only in SCU) intra prediction modes in current HM. Experimental
results report an average 1.6% BD rate reduction for AI HE, with encoder run-time increase 31%; or
average 1.52% BD rate reduction for AI HE, with an encoder run-time increase 27% (with HM 4 non-
SDIP branch as the anchor). Decoding run-time is increased by 3% on average. All results are generated
by current software implementation; further coding efficiency improvement and/or implementation
complexity reduction are reported to be expected with further investigation.
In intra, currently, "PU" is the level at which the prediction type is indicated, and "TU" is the level at
which the prediction type is operated.
Although the title of the proposal refers to 2NxN and Nx2N, at the TU level the sizes are 2Nx(N/2) and
2Nx(N/4) and the rotated equivalents. At PU 32x16, the TU is 32x8 and 32x2; at PU 16x8, the TU is
16x4; at PU 8x4, the TU is 8x2.
JCTVC-G754 Non-CE6: Line buffer reduction for CABAC context of SDIP syntax [L.
Guo, X. Wang, M. Karczewicz (Qualcomm)]
SDIP (short-distance-intra-prediction) introduces two extra syntax elements sdip_flag and sdip_direction.
The CABAC context modeling of these two syntax elements involves the corresponding syntax values
from above blocks, and thus introduces line buffer storage. This contribution describes modified CABAC
contexts for sdip_flag and sdip_direction that avoid upper block data and thus eliminate the line buffer
for these two syntax elements. The average B-D rate change is 0.0% and no encoding/decoding time
change is observed.
It was commented that coding the segmentation as mode-level information without adding other context
models is preferable to using the additional context models for sdip_flag and sdip_direction. If that
suggestion is pursued, this modification would not be necessary. This seems desirable to study.
However, without that suggestion incorporated – within the context of SDIP, this was agreed to be an
improvement of the reference design.
General
It was agreed that our primary SDIP reference design is G558 + G754.
Consider at plenary level whether SDIP should be in WD 5.
JCTVC-G145 Non-CE6: Reducing Line Buffers for intra mode [J. Lim, Y. Jeon, S. Park,
B. Jeon (LG)]
JCTVC-G474 Non-CE6b: Crosscheck for LG's intra mode line buffer reduction in
JCTVC-G145 [T.-D. Chuang, Y.-W. Huang (MediaTek)]
JCTVC-G139 Non-CE6.d: Intra Prediction With Selective Secondary Boundary [G. Van
der Auwera, M. Karczewicz (Qualcomm)]
JCTVC-G374 Improving the Intra Prediction Based on a Uniform Probability Model [L.
Liu (HiSilicon and Huawei)]
The prediction values of intra prediction are calculated through the predefined angTable table in the HM.
A performance decrease is observed when there are varied diagonal textures. This contribution presents a
proposed additional table based on a uniform probability model. It is proposed to select either the original
or the newly-proposed angTable value depending on the left and above intra prediction modes.
No benefit was shown for most sequences, but 0.9% benefit was shown for AI HE and 0.8% for AI LC for
BasketballDrill.
The proponent indicated that this method is just one potential solution for this issue, and other approaches
might be possible. The purpose of the proposal was essentially to provide information to point out that the
current table does not seem to fit the characteristics of all video sequences – perhaps because of the
particular effective angles of the current table.
JCTVC-G738 On angular intra prediction main array extension [M. Coban (Qualcomm)]
This contribution presents a proposed simplification to the subsampling process of the main array
extension scheme by using reduced-precision slope computation. The current design uses 12-bit precision
slope tables to compute the subsampling positions given a block size and the prediction angle. Using 8-bit
5.14.1.7 Padding
JCTVC-G791 Non-CE6: Simplified reference samples padding for intra prediction [T. Lee,
J. Chen, J. H. Park (Samsung)]
This contribution removes the two-sided checking of the unavailable pixel range in reference sample padding in
HM4.0 to reduce complexity. The unavailable pixel range is padded with the nearest available pixel
in one fixed direction instead of adapting the direction or averaging the nearest available pixels. It is
reported that no BD-rate loss is observed for all configurations (AI, RA, LB in HE, LC) with the 1500-byte
slice mode setting and/or the constrained intra prediction setting.
Two methods were tested. An additional method was added in a later revision.
The quality of the affected part of the software was discussed.
It was asked whether any testing was done for increased intra refresh; this had not been tested.
Testing was done with and without constrained intra prediction.
It was commented that Method 2 is appealing (simpler than Method 1, and both have no loss).
See notes below regarding G812.
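The general idea of padding in one fixed direction can be sketched as below. This is an assumption-laden illustration rather than Method 1 or Method 2 of G791: the function name, the scan order, the handling of a leading unavailable run, and the default value of 128 (mid-grey for 8-bit video) are all hypothetical.

```python
def pad_reference_samples(samples, available, default=128):
    """Fill unavailable reference samples by scanning in one fixed direction.

    samples:   sample values ordered along the reference array
    available: booleans of the same length, True where a sample is usable
    default:   value used when no sample at all is available (assumed)
    """
    if not any(available):
        return [default] * len(samples)
    out = list(samples)
    first = available.index(True)
    # Leading unavailable run: copy the first available sample backwards.
    for i in range(first):
        out[i] = out[first]
    # Remaining samples: copy the previous (already valid) sample forwards.
    for i in range(first + 1, len(out)):
        if not available[i]:
            out[i] = out[i - 1]
    return out
```

Because each unavailable sample is filled from a single neighbour in a fixed scan order, no search in both directions and no averaging is required.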
JCTVC-G572 AHG16: Reference sample padding harmonization for intra DC mode [V.
Wahadaniah, C. S. Lim (Panasonic)]
This contribution reports the HM-4.0 simulation results of a revised reference sample padding scheme for
intra DC mode initially presented in JCTVC-F414. The revised padding scheme is reported to give
average BD-rate gains of 0.1% for intra-only setting with 1500-byte slices and constrained intra
prediction enabled. It is reported that encoding and decoding runtimes are not affected by the revised
padding scheme. It was proposed that JCT-VC consider the trade-off between design simplicity and
conceptual accuracy of intra DC prediction and subsequently decide whether a revised padding scheme
for intra DC prediction is desirable.
It was remarked that although this proposal may provide some (small) benefit, the current scheme is a
"cleaner" design that avoids undesirable interactions. No action.
JCTVC-G153 Non-CE6: On intra prediction mode coding [C. Yeo, H. L. Tan, Y. H. Tan,
Z. Li (I2R)]
JCTVC-G106 Cross-Check for G-153: On Intra Mode Coding [A. Saxena, F. Fernandes
(Samsung)]
JCTVC-G184 Non-CE6: Unified neighboring positions for intra mode coding [S.
Fukushima, H. Nakamura (JVC Kenwood)]
JCTVC-G359 Non-CE6: Coding of luma intra prediction modes that are not in the MPM
set [R. Cohen, X. Xu, A. Vetro, H. Sun (MERL)]
JCTVC-G456 Non-CE6: Cross check report of MERL’s intra prediction mode coding
(G359) [Atsuro Ichigaya (NHK)]
JCTVC-G418 Simplification of intra prediction mode mapping method [J. Lee, S.-C. Lim,
H. Y. Kim, J. S. Choi (ETRI)]
JCTVC-G423 Non-CE6: Remove potential duplicate modes from the candidate mode list
for chroma intra prediction [H. Yang, J. Zhou, H. Yu (Huawei)]
JCTVC-G707 Using CABAC bypass mode for coding intra prediction mode [K. Misra, A.
Segall (Sharp)]
JCTVC-G767 Non-CE1: Bypass coding of Intra prediction modes in CABAC [T. Lee, J.
Chen, J. H. Park (Samsung)]
5.15 Transforms
JCTVC-G386 Non-CE10: Cross Check Report for JCTVC-G272 Core Transform Design
for HEVC. [X. Zhang, O. C. Au, X. Wen (HKUST)] [late]
Revisit after survey of inputs.
JCTVC-G496 Core transform design for HEVC with 7 bit coefficients [A. Fuldseth, G.
Bjøntegaard (Cisco), M. Budagavi, V. Sze (TI)]
This contribution proposes a set of 7 bit transform matrices for HEVC, covering all transform sizes from
4x4 to 32x32. The proposed transform matrices were asserted to have the same properties as the 8 bit
transform matrices currently used in the HM transforms. The transform matrices and the associated
transform operations described in this contribution are proposed for the core transform design in HEVC.
The proposed transform design was reported to have the following properties: 16 bit data representation
before and after each transform stage (independent of the internal bit depth), 16 bit multipliers for all
internal multiplications, no need for correction of different norms of basis vectors during
quantization/dequantization, all transform sizes above 4x4 can reuse arithmetic operations for smaller
transform sizes, and implementations using either pure matrix multiplication or a combination of matrix
multiplication and butterfly structures are reportedly possible. BD-rate results vary between −0.1% and
0.2% for the average across all sequences and all classes for the low (−1, 5, 9, 13), “normal”
(22, 27, 32, 37) and high (36, 42, 47, 51) QP ranges. The 7 bit transform matrices reportedly offer
between 9% and 23% reduction in hardware costs when compared to the 8 bit versions currently used in
HM.
The contribution indicated that this should be considered an information contribution, pending conclusion
on CE10 work.
A participant commented that in addition to the average results, the worst case is interesting.
Another participant indicated that the RDO does not actually function properly for low QP. It would be
highly desirable to fix these issues. However, others commented that although the RDO is flawed in this
range, the results seem to tend to be consistent with expectations based on knowledge of transform
tradeoff characteristics.
Question: Why impose as a decoder operation rather than requiring the encoder to obey a constraint?
It was suggested that clipping in the decoder is more robust; probably decoders would do this anyway, in
order to cope with badly-designed encoders that don’t obey the specified limits.
Decision: We should specify clipping of the output of the dequant (i.e., the input to the inverse transform)
to a signed 16 b range.
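A minimal sketch of such a decoder-side clip is shown below. Only the clip to the signed 16-bit range reflects the decision above; the dequantizer itself (a uniform scale with rounding shift) and all names are hypothetical placeholders, not the HM scaling process.

```python
INT16_MIN = -(1 << 15)     # -32768
INT16_MAX = (1 << 15) - 1  # 32767


def clip_int16(value):
    """Clip an integer to the signed 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, value))


def dequantize_and_clip(levels, scale, shift):
    """Hypothetical uniform dequantizer with rounding, followed by the
    16-bit clip applied to the input of the inverse transform."""
    offset = (1 << (shift - 1)) if shift > 0 else 0
    return [clip_int16((level * scale + offset) >> shift) for level in levels]
```

With this in place, a bitstream from an encoder that violates the intended limits still yields inverse transform inputs that fit a 16-bit data path.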
JCTVC-G132 Hardware analysis of transform and quantization [M. Budagavi (TI), V. Sze
(TI), M. Sadafale]
Effectively covered in CE 10 report. Describes concept behind assertion of sharing of all multipliers and
some adders between encoder and decoder, involving transform matrix symmetry.
G495 has this property, and in G628 it was asserted that G737 does not.
It was asserted by the proponent of G737 that a (somewhat different) form of sharing of multipliers was
possible with the FF form of that transform.
Describes two hardware implementations based on hard-wired and SIMD architectures (e.g. useful for
multi-standard processors) and asserts that the G495 design is better for this.
The specifics of these implementations were not released for the hand-optimized designs of either G495
or G737.
JCTVC-G265 Core Transform Property for Practical Throughput Hardware Design [M.
Tikekar, C.-T. Huang, C. Juvekar, A. Chandrakasan (MIT)]
Summarized as follows:
Implemented all transform sizes (and DST and non-square and control logic) for G495
Emphasizes number of unique coefficients, for which a substantial decrease (25%) of complexity
for G495 implementation was achievable by taking advantage of this property. (The proponent of
G737 asserted that this could apply also to that proposal.)
JCTVC-G857 SIMD Analysis of Some Core Spatial Transforms [S. Riabtsev (CSR)] [late]
[uploaded 2011-11-24 15:41:16]
Presenter not available.
Some of the results in this had been made available in the CE 10 report prior to availability.
Reports that both G737 and G495 can be implemented in real time and comments that the FF form of
G737 is the preferred form for hardware.
Discussion of this was requested.
JCTVC-G865 New Results for Guaranteeing 16-bit Transform Dynamic Range [K. Misra,
L. Kerofsky, A. Segall (Sharp)] [late]
Requests action already taken as recorded above for G782 and G719 in regard to decoder clipping after
dequant and after the first stage inverse transform (16 b clip in each place).
JCTVC-G977 Non-CE10: new 8-bits core spatial transform with fast algorithm [Hongbo
Zhu] [late]
Not yet presented (and late) – presenter unavailable when presentation requested. May be studied as
information.
Conclusions
Since G495 and G737 are the only candidate technologies that have been well-studied, other new
proposals would not be appropriate if we wish to make a decision at this meeting.
Both G495 and G737 seem like essentially good candidate designs.
Some submitted non-proponent analysis may indicate some preference for G737 in hardware (although
another non-proponent indicated that their internal analysis had led to a different conclusion). No
significant difference was observed between the two for software SIMD architecture.
G737 seems to need some modification as a bug fix, which would involve some loss of precision.
Given that scenario and the desire to try to close the topic and move on, the consensus was to select
G495.
Decision: Adopted G495.
JCTVC-G282 Non-CE7: Mode Dependent DCT/DST for Chroma [X. Zhao, X. Guo, M.
Guo, S. Lei (MediaTek), S. Ma, W. Gao (PKU)]
Application of DST to chroma, with 0.5% benefit for U and V components. G107 tested several cases,
and this is one of them.
JCTVC-G345 Non-CE7: Restricted mode-dependent 8x8 DST for Intra prediction [A.
Ichigaya, Y. Sugito, S. Sakaida, (NHK)]
Introduces MDST 8x8 with restriction of combination of PU and TU size. Overall benefit (relative to HM
anchor) is small (0.2%), but Class A showed more gain (0.6%).
JCTVC-G630 Cross Check of G345: Restricted Mode-Dependent 8x8 DST for Intra
prediction [A. Saxena, F. Fernandes (Samsung)] [late]
JCTVC-G591 Non-CE 7: Supplementary Results for the Rotational Transform [Zhan Ma,
Felix Fernandes, Elena Alshina, Alexander Alshin (Samsung)]
Describes a fast search for use of the syntax-signalled ROT, providing 0.5% rate benefit at 10% encoder time
increase. (Without the fast search, the rate benefit is 0.9% at 29% encoder time increase.)
JCTVC-G328 AHG8: Bit depth of output pictures [Y. Chen, Y. -K. Wang, X. Wang, I. S.
Chong, M. Karczewicz (Qualcomm)]
(Track B.) Skepticism was expressed regarding this as a normative output definition behaviour. For
further study.
JCTVC-G156 CU Depth Pruning for Fast Coding Tree Block Decision [H. L. Tan, C. Yeo,
Y. H. Tan (I2R)]
This contribution presents a CU Depth Pruning algorithm for fast coding tree block (CTB) decision. The
proposed method attempts to terminate the CTB decision process by performing a one-level look-ahead
for the last sub-CU where possible. It is reported that the proposed method reduces encoding time by
about 8% with 0.1% Luma BD-Rate coding loss for the Random Access and Low Delay configurations.
It was remarked that we have a parameter "ECU" in the software that is also an encoding optimization
trick.
It was remarked that this may introduce some sort of asymmetry in the search. The patch is in the
contribution, which is available for further study.
JCTVC-G543 Early skip detection for HEVC [Jungyoup Yang, Jaehwan Kim, Kwanghyun
Won, Hoyoung Lee, Byeungwoo Jeon (SKKU)]
In this contribution, early detection of "skip" mode is proposed to reduce the encoding complexity of
HEVC. The proposed method is in the same spirit as the early skip detection scheme implemented in
the MPEG-4 Part 10 AVC/H.264 reference SW, but slightly modified to address the different encoding
scheme of HEVC. It is reported that the proposed method reduces the encoding time by about 33% with
a BD bit rate loss of 0.45% compared to the HM4.0 encoder.
JCTVC-G573 Cross-check of Early skip detection for HEVC with Early CU Termination
(JCTVC-G543, JCTVC-F092) [S. Lee, S. Cho, S. Park, N. Eum (ETRI)]
[late]
JCTVC-G794 Cross-check of Early skip detection for HEVC (JCTVC-G543) [Kiho Choi,
Sang-hyo Park, Euee S. Jang] [late]
JCTVC-G687 Fast fractional motion search with adaptive searching point reduction [Wan-
Chi Siu (HKPU), Yan-Ho Kam (HKPU), Wai-Lam Hui (HKPU), Yui-Lam
Chan (HKPU), Yu Liu (ASTRI), Jenny Yan Huo (ASTRI)]
This contribution proposes an additional option of the approach of performing fractional motion search,
aiming at reducing the computational complexity of the encoder of the current HEVC reference model.
According to the simulation results, the proposed technique achieves 6% reduction in computational
complexity with a 0.44% loss in luma BD-rate.
Half pixel search with horizontal/vertical neighbors in first step.
The change would require (probably a lot) “more than 10 lines of code”.
Not to be included.
It was commented that the speed-up is not so large, and we have other work to focus on. It may be nice to
use in practice, but we do not consider it a sufficiently high priority to put this in our software at this time.
JCTVC-G114 [withdrawn]
JCTVC-G160 [withdrawn]
JCTVC-G177 [withdrawn]
JCTVC-G356 [withdrawn]
JCTVC-G368 [withdrawn]
JCTVC-G538 [withdrawn]
JCTVC-G562 [withdrawn]
JCTVC-G594 [withdrawn]
JCTVC-G595 [withdrawn]
JCTVC-G602 [withdrawn]
JCTVC-G670 [withdrawn]
JCTVC-G701 [withdrawn]
JCTVC-G866 [withdrawn]
JCTVC-G921 [withdrawn]
JCTVC-G964 [withdrawn]
JCTVC-G969 [withdrawn]
JCTVC-G061 HEVC issues (comments from USNB of WG11) [A. G. Tescher for USNB of
WG11]
TBA
Not reviewed. Some aspects relevant to current phase of work.
entropy coding, affirmative
plan to do a test, affirmative
8 b per sample in the test, affirmative
RA HE and LB HE, affirmative
HM 4 rather than HM 5 if necessary, affirmative (whichever works out)
Results of the test are planned to be available by the San Jose meeting.
JCTVC-G096 Items to be clarified in HEVC design [M. Zhou (TI), W. Wan (Broadcom),
T. Suzuki (Sony), A. Tabatabai (Sony)]
Items to be clarified:
JCTVC-G661 Parallel partition Profile & Level limits [Chad Fogg, Aaron Wells]
Describes the need to define limits related to slices, entropy slices, tiles and wavefront.
To be further discussed in AHG8.
JCTVC-G729 Proposal to start the discussion on HEVC profile/level [T. Suzuki (Sony)]
Requested AHG to discuss profiles and levels. Done.
Should “clean up” level definitions from AVC instead of blindly copying it.
Request to thoroughly test profiles in their definition process (e.g., 4:2:2 10 bit).
JCTVC-G1004 Proposed text for some features not yet integrated into JCTVC-F803 [M
Horowitz (eBrisk)] [late]
Updated version of JCTVC-F803. This text will be incorporated into next WD version released by
editors.
6.2 BoGs
JCTVC-G1002 BoG report on reference picture buffering and list construction [J. Boyce,
R. Sjoberg, Y.-K. Wang]
The BoG recommended adoption of the AHG21 working draft text modification, in JCTVC-G021. It was
also recommended to have an AHG until the next meeting, and that contributions to next meeting be
based upon the editors’ WD version which includes the JCTVC-G021 modifications.
The BoG requested to meet again this meeting to define test conditions with various picture coding
structure patterns for evaluation of long term reference picture contributions, and for bit rate savings
based contribution, based on a draft test conditions document prepared by Miska Hannuksela and
circulated on the reflector by Saturday.
The BoG also recommended to further discuss contribution JCTVC-G198 with a larger group (this was
later done in Track A).
Several contributions initially assigned to this BoG were considered to not be within its scope, so it is
recommended that they be re-assigned: JCTVC-G717, JCTVC-G635, JCTVC-G157, and JCTVC-G549.
Additionally, the BoG proposed postponing consideration of JCTVC-G715 until after the AHG18
discussion.
Questions and comments:
Are long-term reference pictures (LTRP) supported?
o Long-term pictures, in AVC, affect weighted prediction, temporal MV scaling, default
list construction, list modification, sliding window, and explicit reference picture
MMCO.
JCTVC-G1006 BoG report on Non-CE MV Coding Proposals [B. Bross, J. Jung, S. Oudin]
See under 5.8.1.
Annex A contains a review of intra mode coding proposals (blue = exactly the same, yellow =
basically the same as each other). Section 4.1 identifies overlapping ideas.
G145 proposes LCU boundary line buffer compression (0.0) or elimination (0.1 loss)
Decision: Eliminate the line buffer as proposed in G145
G184 proposes to change neighbors for consistency with how inter works
No action.
G423 on chroma coding
For further study
G707=G767 (part of G153), increased use of bypass coding
Decision: Adopt G707=G767.
Remark: Right now the bins for the remainder are coded starting from the LSB; let’s start
from the MSB. (Considering that it is coded in bypass mode, no effect on perf.)
Decision: Agreed.
G418&G109&G144 (part of G119), simplifying the case when neighbour PU size differs
Decision: Adopted.
Additional further aspect of G119 to default to planar if neighbours not intra or unavailable
Question: Does it affect non-intra coding efficiency? Probably not – and verbally reported
(by a non-proponent) not to (or to actually provide a little gain).
Decision: Adopted (part 2 of G119 as described above).
Suggestion: Note that currently at 64x64 level, only 4 modes are available, making the
parsing different than at lower sizes, and the prediction actually operates at the 32x32 level,
so there is no real benefit for the restriction (aside from trivial overhead reduction).
Decision: Agreed to support all 32x32 modes to be signalled at the 64x64 level.
This closes all of intra mode coding (in regard to contribs that only affect the coding of the
modes, not the values of the samples).
Regarding definition of nal_unit_type value for APS? Decision: Use 14 (previously reserved).
JCTVC-G1025 BoG report on tiles and wavefront parallel processing [M. Horowitz]
This report contains the summary of proceedings of the tiles and wavefront parallel processing BoG
meeting. The BoG was created to review contributions related to tiles and wavefront parallel processing.
The meeting was held on Friday evening, 18:00 to 22:15 and Saturday morning, 10:00 to 12:15,
November 25 and 26, respectively. Approximately 30 delegates attended.
The BoG reviewed and summarized input documents: G183, G968, G194, G802, G197, G961, G315,
G317, G318, G453, G454, G618, G199, G627, G722, and G815 and prepared associated
recommendations.
The BoG recommended adoption of G194, AHG4: Non-cross-tiles loop filtering for independent tiles.
G194 proposes a flag to indicate enabling or disabling in-loop filtering across independent tile
boundaries. The current HM design always filters across independent tile boundaries.
JCTVC-G1032 BOG report on intra prediction complexity reduction and filtering [R.
Joshi] [miss]
Break-out group meetings for intra prediction complexity and filtering were held on Saturday, Sunday
and Monday, Nov. 24–26, 2011. Approximately 25 delegates were present.
Regarding prediction modes supported at the 64x64 level, this was resolved as recorded in notes for
G1017.
The BoG recommended that some discussed modifications of DC prediction relating to G567 be studied
in a CE. In Track B, this was discussed, but it did not seem worth pursuing as a CE.
JCTVC-G1035 BoG report on resolving deblocking filter description [A. Norkin et al.]
qq
JCTVC-G1041 BoG report on subjective viewing test for deblocking filter proposals [A.
Norkin, M. Narroschke, K. Andersson, D. Flynn, X. Guo, G. v. d. Auwera]
qq
JCTVC-G777 The Art of Writing Standards: Some Shalls and Shoulds for Better Quality
Interop Specs [G. J. Sullivan (Microsoft)]
The document upload deadline for the next meeting was planned to be 20 Jan. 2012.
As general guidance, it was suggested to avoid usage of company names in document titles, software
modules etc., and not to describe a technology by using a company name. Also, core experiment
responsibility descriptions should name individuals, not companies. AHG reports and CE
descriptions/summaries are considered to be the contributions of individuals, not companies.
JCTVC-F634 HEVC Reference Software Manual [F. Bossen, D. Flynn, K. Sühring (AHG
chairs)]
The intention is to provide this as part of the software package in the future.
JCTVC-F688 Revised HEVC Software Guidelines [K. Sühring, D. Flynn, F. Bossen
(software coordinators)]
The version approved at the 6th meeting was confirmed to remain valid.
JCTVC-G1100 Meeting Report of 7th JCT-VC Meeting [G. J. Sullivan, J.-R. Ohm]
JCTVC-G1102 High Efficiency Video Coding (HEVC) Test Model 5 (HM 5) Encoder
Description [K. McCann (primary), B. Bross, W.-J. Han, S. Sekiguchi,
G. J. Sullivan] (WG 11 N 12345)
JCTVC-G1103 High Efficiency Video Coding (HEVC) text specification Working Draft
5 [B. Bross (primary), W.-J. Han, G. J. Sullivan, J.-R. Ohm, T. Wiegand]
(WG 11 N 12346)
JCTVC-G1210 Core Experiment 10: Deblocking filter [A. Norkin (primary), X. Guo,
B. Jeon, M. Narroschke]
JCTVC-G1211 Core Experiment 11: Coefficient scanning and coding [V. Sze (primary),
J. Chen, T. Nguyen, K. Panusopone, J. Sole]
JCTVC-F910 Core Experiment 10: Core Transforms [P. Topiwala (primary), M. Budagavi, A.
Fuldseth, R. Joshi, E. Alshina]
JCTVC-F911 Core Experiment 11: Coefficient Scanning and Coding [V. Sze (primary), J. Chen,
T. Nguyen, K. Panusopone, J. Sole]
JCTVC-F912 Core Experiment 12: Deblocking Filter [A. Norkin (primary), X. Guo, B. Jeon,
M. Narroschke]
continue
JCTVC-F913 Core Experiment 13: Motion data parsing robustness and throughput [J. Jung
(primary), B. Bross, J. Chen, P. Onno, M. Zhou]
combine with CE 9
New CE: Transform skipping (Marta Mrak)
Continuing CE with same chair(s) as before
CE descriptions to be reviewed Tues afternoon and/or Wed morning
74. Michael Horowitz (eBrisk Video, Inc.)
76. Chih-Wei Hsu (MediaTek Inc.)
77. Yu-Wen Huang (MediaTek)
78. Wai Lam Hui (The Hong Kong Polytechnic University)
79. Atsuro Ichigaya (NHK (Japan Broadcasting Corporation))
80. Kwon Jae Cheol (KT Corporation)
82. Byeong Moon Jeon (LG Electronics)
84. Yongjoon Jeon (LG Electronics)
85. Rajan Joshi (Qualcomm)
86. Joel Jung (Orange Labs)
87. Jiwook Jung (LG Electronics)
88. Sungwook Jung (Korean Standards Association)
89. Jewon Kang (Nokia)
90. Jung Won Kang (ETRI (Electronics and Telecommunications Research Institute))
107. Jungsun Kim (LG electronics)
109. Kenji Kondo (Sony Corporation)
110. Faouzi Kossentini (eBrisk Video Inc.)
111. Anand Kotra (Panasonic R&D Center Germany GmbH)
112. Jumpei Koyama (FUJITSU LABORATORIES LTD.)
113. Thomas Kunlin (STMicroelectronics)
115. Polin Lai (Samsung)
117. Guillaume Laroche (Canon Research Centre France S.A.S)
118. Fabrice Le Léannec (Canon Research Centre France S.A.S)
119. Ju Ock Lee (Sejong University)
120. Jae Yung Lee (Sejong University)
121. Snagyoun Lee (Yonsei University)
122. Wonjae Lee (Samsung electronics)
123. Jinho Lee (ETRI)
137. Jaehyun Lim (LG Electronics)
138. Sung-Chang Lim (ETRI)
139. Chun-Lung Lin (ITRI International)
140. Peter List (Deutsche Telekom)
141. Lingzhi Liu (Huawei Technologies USA)
142. Shan Liu (MediaTek)
143. Zhong Luo (Huawei Technologies)
144. Ajay Luthra (Motorola Mobility Incs)
146. Detlev Marpe (Fraunhofer HHI)
147. Gaelle Martin-Cocher (Research in Motion)
148. Ikeda Masaru (Sony Corporation)
149. Masaaki Matsumura (NTT Corporation)
150. Shohei Matsuo (NTT Corporation)
151. Ken McCann (ZetaCast/Samsung)
152. Holger Meuel (Leibniz University Hannover)
153. Akira Minezawa (Mitsubishi Electric Corporation)
154. Koohyar Minoo (Motorola Mobility Inc.)
157. Seiji Mochizuki (Renesas Electronics Corporation)
158. Dhawal Moghe (Cable Television Labs)
159. Fulvio Moschetti (European Patent Office)
161. Tokumichi Murakami (Mitsubishi Electric Corporation)
168. Matthias Narroschke (Panasonic R&D Center Germany)
169. Tung Nguyen (Fraunhofer Gesellschaft)
170. Takahiro Nishi (Panasonic)
171. Andrey Norkin (Ericsson)
172. Jens-Rainer Ohm (RWTH Aachen University)
173. Patrice Onno (Canon Research Centre France S.A.S)
175. Krit Panusopone (Motorola Mobility)
176. Joonyoung Park (LG electronics)
177. Seungwook Park (LG Electronics)
178. Youngo Park (SAMSUNG ELECTRONICS Co., Ltd.)
179. Dongjin Park (Chips&Media)
180. Sang-Hyo Park (Hanyang University)
181. Jiho Park (KETI)
182. Jeonghoon Park (Samsung Electronics Co., Ltd.)
183. Nicolas Pellerin (ST microelectronics)
186. Yinji Piao (Samsung Electronics)
187. Matthias Preiß (Fraunhofer Heinrich Hertz Institut (HHI))
188. Mohamad Raad (RaadTech Consulting)
190. Justin Ridge (Nokia Oyj)
191. Arturo Rodriguez (Cisco Systems)
192. Francisco Javier Roncero (Cavium, Inc.)
205. Heiko Schwarz (Fraunhofer Gesellschaft zur Förderung der angewandten Forschung (FGFF))
206. Andrew Segall (Sharp Corporation)
207. Shun-Ichi Sekiguchi (Mitsubishi Electric Corporation)
208. Stian Selnes (Cisco Systems)
209. Vadim Seregin (Qualcomm Incorporated)
210. Osman Gokhan Sezer (Texas Instruments Inc.)
211. Karl Sharman (Broadcast & Professional Research Labs, Sony Europe Ltd.)
213. Masato Shima (Canon Inc.)
236. Ikai Tomohiro (Sharp Corporation)
237. Pankaj Topiwala (FastVDO LLC)
238. Matsunobu Toru (Panasonic)
239. Cong-Thang Truong (University of Aizu)
240. Yi-Shin Tung (MStar Semiconductor, Inc / ITRI International Inc.)
241. Kemal Ugur (Nokia)
242. Geert Van Der Auwera (Qualcomm Inc.)
243. Glenn Van Wallendael (Ghent University - IBBT - Multimedia Lab)
244. Jérôme Vieron (ATEME)
246. Wade Wan (Broadcom Corporation)