JCTVC-G Notes d9


Joint Collaborative Team on Video Coding (JCT-VC)
of ITU-T SG16 WP3 and ISO/IEC JTC1/SC29/WG11
7th Meeting: Geneva, CH, 21-30 Nov 2011

Document: JCTVC-G_Notes_d97

Title: Meeting report of the seventh meeting of the Joint Collaborative Team on Video
Coding (JCT-VC), Geneva, CH, 21-30 Nov. 2011
Status: Report Document from Chairs of JCT-VC
Purpose: Report
Author(s) or Contact(s):
Gary Sullivan
Microsoft Corp.
1 Microsoft Way
Redmond, WA 98052 USA
Tel: +1 425 703 5308
Email: garysull@microsoft.com

Jens-Rainer Ohm
Institute of Communications Engineering
RWTH Aachen University
Melatener Straße 23
D-52074 Aachen
Tel: +49 241 80 27671
Email: ohm@ient.rwth-aachen.de
Source: Chairs
_____________________________

Summary
[qq J. Boyce to coordinate BoG on AHG21]
The Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T WP3/16 and ISO/IEC
JTC 1/SC 29/WG 11 held its seventh meeting during 21-30 Nov 2011 at the ITU-T premises in Geneva,
CH. During the first two days of the meeting, rooms at the WMO headquarters were also used. The JCT-
VC meeting was held under the chairmanship of Dr. Gary Sullivan (Microsoft/USA) and Dr. Jens-Rainer
Ohm (RWTH Aachen/Germany). For rapid access to particular topics in this report, a subject
categorization is found in section 1.13 of this document.
The JCT-VC meeting sessions began at approximately 1100 hours on Monday 21 Nov 2011. Meeting
sessions were held on all days (including weekend days) until the meeting was closed at approximately
XXXX hours on Wednesday 30 Nov. Approximately 284 people attended the JCT-VC meeting,
and approximately 1000 input documents were discussed. The meeting took place in a co-located
fashion with a meeting of ITU-T SG16 – one of the two parent bodies of the JCT-VC. The subject matter
of the JCT-VC meeting activities consisted of work on the new next-generation video coding
standardization project now referred to as High Efficiency Video Coding (HEVC).
The primary goals of the meeting were to review the work that was performed in the interim period since
the sixth JCT-VC meeting in implementing the 4th HEVC Test Model (HM4) and editing the 4th HEVC
specification Working Draft (WD4), review the results from interim Core Experiments (CEs), review
technical input documents, further develop the Working Draft and HEVC Test Model (HM), and plan a new
set of Core Experiments (CEs) for further investigation of proposed technology.
The JCT-VC produced three particularly important output documents from the meeting: the HEVC Test
Model 5 (HM5), the HEVC specification Working Draft 5 (WD5), and a document specifying common
conditions and software reference configurations for HEVC coding experiments. Moreover, XX
documents describing the planning of future CEs were drafted.
For the organization and planning of its future work, the JCT-VC established XX "Ad Hoc Groups"
(AHGs) to progress the work on particular subject areas. The next four JCT-VC meetings are planned for
1–10 February 2012 under WG 11 auspices in San José, USA, 30 April 1–89 May 2012 under ITU-T

Page: 1 Date Saved: 2011-12-04


auspices in Geneva, CH, 11–20 July 2012 under WG 11 auspices in Stockholm, SE, and 10–19 Oct 2012
under WG 11 auspices in Suzhou, CN.
The document distribution site http://phenix.it-sudparis.eu/jct/ was used for distribution of all documents.
The reflector to be used for discussions by the JCT-VC and all of its AHGs is the JCT-VC reflector:
jct-vc@lists.rwth-aachen.de. For subscription to this list, see
http://mailman.rwth-aachen.de/mailman/listinfo/jct-vc.

1 Administrative topics
1.1 Organization
The ITU-T/ISO/IEC Joint Collaborative Team on Video Coding (JCT-VC) is a group of video coding
experts from the ITU-T Study Group 16 Visual Coding Experts Group (VCEG) and the ISO/IEC JTC 1/
SC 29/ WG 11 Moving Picture Experts Group (MPEG). The parent bodies of the JCT-VC are ITU-T
WP3/16 and ISO/IEC JTC 1/SC 29/WG 11.
The Joint Collaborative Team on Video Coding (JCT-VC) of ITU-T WP3/16 and ISO/IEC JTC 1/ SC 29/
WG 11 held its seventh meeting during 21-30 Nov 2011 at the ITU-T premises in Geneva, CH. The JCT-
VC meeting was held under the chairmanship of Dr. Gary Sullivan (Microsoft/USA) and Dr. Jens-Rainer
Ohm (RWTH Aachen/Germany).

1.2 Meeting logistics


The JCT-VC meeting sessions began at approximately 1100 hours on Monday 21 Nov 2011. Meeting
sessions were held on all days (including weekend days) until the meeting was closed at approximately
XXXX hours on Wednesday 30 Nov. Approximately 284 people attended the JCT-VC meeting, and
approximately 1000 input documents were discussed. The meeting took place in a co-located fashion
with a meeting of ITU-T SG16 – one of the two parent bodies of the JCT-VC. The subject matter of
the JCT-VC meeting activities consisted of work on the new next-generation video coding standardization
project now referred to as High Efficiency Video Coding (HEVC).
Some statistics for historical reference purposes:
 1st meeting (Dresden): 188 people, 40 input documents
 2nd meeting (Geneva): 221 people, 120 input documents
 3rd meeting (Guangzhou): 244 people, 300 input documents
 4th meeting (Daegu): 248 people, 400 input documents
 5th meeting (Geneva): 226 people, 500 input documents
 6th meeting (Torino): 254 people, 700 input documents
 7th meeting (Geneva): 284 people, 1000 input documents

Information regarding logistics arrangements for the meeting had been provided at
http://wftp3.itu.int/av-arch/jctvc-site/2011_11_G_Geneva/JCTVC-G_Logistics.doc.

1.3 Primary goals


The primary goals of the meeting were to review the work that was performed in the interim period since
the sixth JCT-VC meeting in producing the 4th HEVC Test Model (HM) software and editing the 4th
HEVC specification Working Draft (WD4), review the results from interim Core Experiments (CEs),
review technical input documents, and establish fifth versions of the Working Draft (WD5) and HEVC
Test Model (HM5).

1.4 Documents and document handling considerations

1.4.1 General
The documents of the JCT-VC meeting are listed in Annex A of this report. The documents can be found
at http://phenix.it-sudparis.eu/jct/.
Registration timestamps, initial upload timestamps, and final upload timestamps are listed in Annex A of
this report.
Document registration and upload times and dates listed in Annex A and in headings for documents in
this report are in Paris/Geneva time. Dates mentioned for purposes of describing events at the meeting
(rather than as contribution registration and upload times) follow the local time at the meeting facility.
Decisions made by the group that affect the normative content of the draft standard are identified in this
report by prefixing the description of the decision with the string "Decision:". Decisions that affect the
reference software but have no normative effect on the text are marked by the string "Decision (SW):".
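Because the two marker strings are fixed, decisions can be pulled out of the report text mechanically. The following is a minimal sketch, assuming each decision starts on the same line as its marker; the function name `find_decisions` and the "normative"/"software" labels are illustrative choices, not part of the report conventions.

```python
import re

# The marker strings "Decision:" and "Decision (SW):" come from the report.
# Everything else here (names, labels, return shape) is a hypothetical choice.
DECISION_RE = re.compile(r"Decision( \(SW\))?:\s*(.*)")


def find_decisions(report_text):
    """Return (kind, text) pairs for each line carrying a decision marker.

    kind is "software" for "Decision (SW):" and "normative" for "Decision:".
    Only the remainder of the marker's own line is captured, so a decision
    that wraps onto following lines is truncated.
    """
    found = []
    for line in report_text.splitlines():
        match = DECISION_RE.search(line)
        if match:
            kind = "software" if match.group(1) else "normative"
            found.append((kind, match.group(2).strip()))
    return found
```

Note that such a scan would also match the sentence that defines the markers, since it quotes both strings.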
This meeting report is based primarily on notes taken by the chairs and projected for real-time review by
the participants during the meeting discussions. The preliminary notes were also circulated publicly by ftp
during the meeting on a daily basis. Considering the high workload of this meeting and the large number
of contributions, it should be understood by the reader that 1) some notes may appear in abbreviated form,
2) summaries of the content of contributions are often based on abstracts provided by contributing
proponents without an intent to imply endorsement of the views expressed therein, and 3) the depth of
discussion of the content of the various contributions in this report is not uniform. Generally, the report is
written to include as much discussion of the contributions and discussions as is feasible in the interest of
aiding study, although this approach may not result in the most polished output report.

1.4.2 Late and incomplete document considerations


The formal deadline for registering and uploading non-administrative contributions had been announced
as Tuesday, 8 Nov 2011.
Non-administrative documents uploaded after 2359 hours in Paris/Geneva time Wednesday Nov 9 were
considered "officially late".
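The lateness rule above can be expressed as a simple timestamp comparison. This is a sketch under stated assumptions: "after 2359 hours" on Wednesday 9 Nov is read as "from 00:00 on Thursday 10 Nov 2011", Paris/Geneva time is modelled as a fixed UTC+1 offset (CET applies in November), and the names `LATE_FROM` and `is_officially_late` are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Paris/Geneva time in November is CET (UTC+1). A fixed offset is used so
# that the sketch does not depend on a time zone database.
CET = timezone(timedelta(hours=1))

# Assumption: uploads from 00:00 on 10 Nov 2011 (Paris/Geneva time) onward
# count as "after 2359 hours" on 9 Nov.
LATE_FROM = datetime(2011, 11, 10, 0, 0, tzinfo=CET)


def is_officially_late(upload_time):
    """Return True if a timezone-aware upload timestamp is past the cutoff.

    upload_time must carry a tzinfo; comparing a naive datetime against the
    aware cutoff raises TypeError.
    """
    return upload_time >= LATE_FROM
```

Timestamps recorded in other time zones compare correctly as long as they are timezone-aware, since the comparison is made on absolute time.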
Most documents in this category were CE reports or cross-verification reports, which are somewhat less
problematic than late proposals for new action (and especially for new normative standardization action).
At this meeting, we again had a substantial amount of late document activity, but in general the early
document deadline gave us a significantly better chance for thorough study of documents that were
delivered in a timely fashion. The group strived to be conservative when discussing and considering the
content of late documents, although no objections were raised regarding allowing some discussion in such
cases.
The following documents that did not arrive until the last two days of the meeting were not presented or
discussed due to time constraints: JCTVC-Gxxx, … .
All contribution documents with registration numbers JCTVC-G867 to JCTVC-Gxxx were registered
after the "officially late" deadline (and therefore were also uploaded late). Some documents in this range
include break-out activity reports that were generated during the meeting and are therefore considered
report documents rather than late contributions.
In many cases, contributions were also revised after the initial version was uploaded. The contribution
document archive website retains publicly-accessible prior versions in such cases. The timing of late
document availability for contributions is generally noted in the section discussing each contribution in
this report.

The following other technical proposal contributions were registered in time but were uploaded late:
 JCTVC-Gxxx (a technical proposal) [uploaded xx-xx]
 …
The following other document not proposing normative technical content was registered in time but
uploaded late:
 JCTVC-Gxxx (a study on screen content test sequences)
The following cross-verification reports were uploaded late: JCTVC-Gxxx, … .
The following document registrations were later cancelled or otherwise never discussed or provided:
JCTVC-Gxxx, … .
Ad hoc group interim activity reports, CE summary results reports, break-out activity reports, and
information documents containing the results of experiments requested during the meeting are not
included in the above list, as these are considered administrative report documents to which the uploading
deadline is not applied.
As a general policy, missing documents were not to be presented, and late documents (and substantial
revisions) could only be presented when sufficient time for studying was given after the upload. Again, an
exception is applied for AHG reports, CE summaries, and other such reports which can only be produced
after the availability of other input documents. There were no objections raised by the group regarding
presentation of late contributions.
"Placeholder" contribution documents that were basically empty of content, with perhaps only a brief
abstract and some expression of an intent to provide a more complete submission as a revision, were
considered unacceptable and were rejected in the document management system, as has been agreed since
the third meeting.
The initial uploads of the following contribution documents were rejected as "placeholders" and were not
corrected until after the upload deadline:
 JCTVC-Gxxx (a cross-verification report, corrected 2011-xx-xx)
 …
A few contributions had some problems relating to IPR declarations in the initial uploaded versions
(missing declarations, declarations saying they were from the wrong companies, etc.). These issues were
corrected by later uploaded versions in all cases (to the extent of the awareness of the chairs).

1.4.3 Measures to facilitate the consideration of contributions


It was agreed that, due to the increasingly high workload for this meeting, the group would try to rely
more extensively on summary CE reports. For other contributions, it was agreed that generally
presentations should not exceed 5 minutes to achieve a basic understanding of a proposal – with further
review only if requested by the group. For cross-verification contributions, it was agreed that the group
would ordinarily only review cross-checks for proposals that appear promising.
When considering cross-check contributions, it was agreed that, to the extent feasible, the following data
should be collected:
 Subject (including document number).
 Whether common conditions were followed.
 Whether the results are complete.
 Whether the results match those reported by the contributor (within reasonable limits, such as
minor compiler/platform differences).
 Whether the contributor studied the algorithm and software closely and has demonstrated
adequate knowledge of the technology.
 Whether the contributor independently implemented the proposed technology feature, or at least
compiled the software themselves.
 Any special comments and observations made by the cross-check contributor.
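The items in the list above could be recorded per cross-check in a small structure such as the following. This is only a sketch: the class name, field names, and the `open_issues` helper are assumptions made for illustration, and the group did not prescribe any particular format.

```python
from dataclasses import dataclass


@dataclass
class CrossCheckRecord:
    """One cross-check contribution, following the agreed list of data items."""

    subject: str                        # including the document number
    common_conditions_followed: bool
    results_complete: bool
    results_match: bool                 # within reasonable limits
    studied_closely: bool               # algorithm and software studied
    independent_implementation: bool    # cross-checker reimplemented the feature
    compiled_software: bool             # cross-checker at least compiled it
    comments: str = ""                  # special comments and observations

    def open_issues(self):
        """List the criteria that this cross-check does not satisfy."""
        issues = []
        if not self.common_conditions_followed:
            issues.append("common conditions not followed")
        if not self.results_complete:
            issues.append("results incomplete")
        if not self.results_match:
            issues.append("results do not match")
        if not self.studied_closely:
            issues.append("algorithm and software not studied closely")
        if not (self.independent_implementation or self.compiled_software):
            issues.append("software neither reimplemented nor compiled")
        return issues
```

An empty list from `open_issues()` would indicate that all of the listed criteria were met; the free-text comments are kept separately.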

1.4.4 Outputs of the preceding meeting


The report documents of the previous meeting, particularly the meeting report JCTVC-F800, the HEVC
Test Model (HM) JCTVC-F802, and the Working Draft (WD) JCTVC-F803, were approved. The HM
reference software produced by the AHG on software development and HM software technical evaluation
was also approved.
Versions of the WD, the HM document, the HM software and the CE descriptions had been made
available in a reasonably timely fashion.
The chair asked if there were any issues regarding potential mismatches between perceived technical
content prior to adoption and later integration efforts. It was also asked whether there was adequate clarity
of precise description of the technology in the associated proposal contributions.
Some such issues had been brought up on the reflector for group clarification on how to proceed.
It was remarked that in some cases (none specifically mentioned) the software implementation of adopted
proposals revealed that the description that had been the basis of the adoption apparently was not precise
enough, so that the software unveiled details that were not known before (except possibly for CE
participants who had studied the software). Also, there should be time to study combinations of different
adopted tools with more detail prior to adoption.
CE descriptions need to be fully precise – this is intended as a method of enabling full study and testing
of a specific technology.
Greater discipline in terms of what can be established as a CE may be an approach to helping with such
issues. CEs should be more focused on testing just a few specific things, and the description should
precisely define what is intended to be tested (available by the end of the meeting when the CE plan is
approved).
Software study can be a useful and important element of adequate study; however, software availability is
not a proper substitute for document clarity.
The activities in some CEs may have diverged from the original plans by bringing in somewhat different
technology that may not have been fully understood even by the cross-checking participants.
Software shared for CE purposes needs to be available with adequate time for study. Software of CEs
should be available early, to enable close study by cross-checkers (not just provided shortly before the
document upload deadline).
CE9 was suggested as a CE where there has been a need for greater discipline and where the situation
became confusing.
Issues of combinations between different features (e.g., different adopted features) also tend to sometimes
arise in the work.

1.5 Attendance
The list of participants in the JCT-VC meeting can be found in Annex B of this report.
The meeting was open to those qualified to participate either in ITU-T WP3/16 or ISO/IEC JTC 1/ SC 29/
WG 11 (including experts who had been personally invited by the Chairs as permitted by ITU-T or
ISO/IEC policies).

Participants had been reminded of the need to be properly qualified to attend. Those seeking further
information regarding qualifications to attend future meetings may contact the Chairs.

1.6 Agenda
The agenda for the meeting was as follows:
 IPR policy reminder and declarations
 Contribution document allocation
 Reports of ad hoc group activities
 Reports of Core Experiment activities
 Review of results of previous meeting
 Consideration of contributions and communications on HEVC project guidance
 Consideration of HEVC technology proposal contributions
 Consideration of information contributions
 Coordination activities
 Future planning: Determination of next steps, discussion of working methods, communication
practices, establishment of coordinated experiments, establishment of AHGs, meeting planning,
refinement of expected standardization timeline, other planning issues
 Other business as appropriate for consideration

1.7 IPR policy reminder


Participants were reminded of the IPR policy established by the parent organizations of the JCT-VC and
were referred to the parent body websites for further information. The IPR policy was summarized for the
participants.
The ITU-T/ITU-R/ISO/IEC common patent policy shall apply. Participants were particularly reminded
that contributions proposing normative technical content shall contain a non-binding informal notice of
whether the submitter may have patent rights that would be necessary for implementation of the resulting
standard. The notice shall indicate the category of anticipated licensing terms according to the
ITU-T/ITU-R/ISO/IEC patent statement and licensing declaration form.
This obligation is supplemental to, and does not replace, any existing obligations of parties to submit
formal IPR declarations to ITU-T/ITU-R/ISO/IEC.
Participants were also reminded of the need to formally report patent rights to the top-level parent bodies
(using the common reporting form found on the database listed below) and to make verbal and/or
document IPR reports within the JCT-VC as necessary in the event that they are aware of unreported
patents that are essential to implementation of a standard or of a draft standard under development.
Some relevant links for organizational and IPR policy information are provided below:
 http://www.itu.int/ITU-T/ipr/index.html (common patent policy for ITU-T, ITU-R, ISO, and IEC,
and guidelines and forms for formal reporting to the parent bodies)
 http://ftp3.itu.int/av-arch/jctvc-site (JCT-VC contribution templates)
 http://www.itu.int/ITU-T/studygroups/com16/jct-vc/index.html (JCT-VC general information and
founding charter)
 http://www.itu.int/ITU-T/dbase/patent/index.html (ITU-T IPR database)
 http://www.itscj.ipsj.or.jp/sc29/29w7proc.htm (JTC 1/ SC 29 Procedures)

It is noted that the ITU TSB director's AHG on IPR had issued a clarification of the IPR reporting process
for ITU-T standards, as follows, per SG 16 TD 327 (GEN/16):
“TSB has reported to the TSB Director’s IPR Ad Hoc Group that they are receiving Patent Statement
and Licensing Declaration forms regarding technology submitted in Contributions that may not yet be
incorporated in a draft new or revised Recommendation. The IPR Ad Hoc Group observes that, while
disclosure of patent information is strongly encouraged as early as possible, the premature submission
of Patent Statement and Licensing Declaration forms is not an appropriate tool for such purpose.
In cases where a contributor wishes to disclose patents related to technology in Contributions, this can
be done in the Contributions themselves, or informed verbally or otherwise in written form to the
technical group (e.g. a Rapporteur’s group), disclosure which should then be duly noted in the
meeting report for future reference and record keeping.
It should be noted that the TSB may not be able to meaningfully classify Patent Statement and
Licensing Declaration forms for technology in Contributions, since sometimes there are no means to
identify the exact work item to which the disclosure applies, or there is no way to ascertain whether
the proposal in a Contribution would be adopted into a draft Recommendation.
Therefore, patent holders should submit the Patent Statement and Licensing Declaration form at the
time the patent holder believes that the patent is essential to the implementation of a draft or approved
Recommendation.”
The chairs invited participants to make any necessary verbal reports of previously-unreported IPR in draft
standards under preparation, and opened the floor for such reports: No such verbal reports were made.

1.8 Software copyright disclaimer header reminder


It was noted that, as had been agreed at the 5th meeting of the JCT-VC and approved by both parent
bodies at their collocated meetings at that time, the HEVC reference software copyright license header
language is the BSD license with a preceding sentence declaring that contributor or third party rights are
not granted, as recorded in N10791 of the 89th meeting of ISO/IEC JTC 1/ SC 29/ WG 11. Both ITU and
ISO/IEC will be identified in the <OWNER> and <ORGANIZATION> tags in the header. This software
is used in the process of designing the new HEVC standard and for evaluating proposals for technology to
be included in this design. Additionally, after development of the coding technology, the software will be
published by ITU-T and ISO/IEC as an example implementation of the HEVC standard and for use as the
basis of products to promote adoption of the technology.
Different copyright statements shall not be committed to the committee software repository (in the
absence of subsequent review and approval of any such actions). As noted previously, it must be further
understood that any initially-adopted such copyright header statement language could further change in
response to new information and guidance on the subject in the future.

1.9 Communication practices


The documents for the meeting can be found at http://phenix.it-sudparis.eu/jct/. For the first two JCT-VC
meetings, the JCT-VC documents had been made available at http://ftp3.itu.int/av-arch/jctvc-site, and
documents for the first two JCT-VC meetings remain archived there. That site was also used for
distribution of the contribution document template and circulation of drafts of this meeting report.
JCT-VC email lists are managed through the site http://mailman.rwth-aachen.de/mailman/options/jct-vc,
and to send email to the reflector, the email address is jct-vc@lists.rwth-aachen.de. Only members of the
reflector can send email to the list. However, membership of the reflector is not limited to qualified JCT-
VC participants.

It was emphasized that those subscribing to the reflector and sending email to it must use their real
names when subscribing and sending messages and must respond to inquiries regarding their type of
interest in the work.
It was emphasized that usually discussions concerning CEs and AHGs should be performed using the
reflector. CE internal discussions should primarily be concerned with organizational issues. Substantial
technical issues that are not reflected by the original CE plan should be openly discussed on the reflector.
Any new developments that are the result of private communication cannot be considered a result of the CE.
For the case of CE documents and AHG reports, email addresses of participants and contributors may be
obscured or absent (and will be on request), although these will be available (in human readable format –
possibly with some "obscurification") for primary CE coordinators and AHG chairs.

1.10 Terminology
Some terminology used in this report is explained below:
 AHG: Ad hoc group.
 AI: All-intra.
 AIF: Adaptive interpolation filtering.
 AIS: Adaptive intra smoothing.
 ALF: Adaptive loop filter.
 AMP: Asymmetric motion partitioning.
 APS: Adaptation parameter set.
 AMVR: Adaptive motion vector resolution.
 AVC: Advanced video coding – the video coding standard formally published as ITU-T
Recommendation H.264 and ISO/IEC 14496-10.
 BA: Block adaptive.
 BD: Bjøntegaard-delta – a method for measuring percentage bit rate savings at equal PSNR or
decibels of PSNR benefit at equal bit rate (e.g., as described in document VCEG-M33 of April
2001).
 BoG: Break-out group.
 BR: Bit rate.
 BUDI: Bidirectional UDI.
 CABAC: Context-adaptive binary arithmetic coding.
 CBF: Coded block flag(s).
 CE: Core experiment – a coordinated experiment conducted after the 3rd or 4th meeting.
 DCT: Discrete cosine transform (sometimes used loosely to refer to other transforms with
conceptually similar characteristics).
 DCTIF: DCT-derived interpolation filter.
 DIF: Directional interpolation filter.
 DF: Deblocking filter.
 DT: Decoding time.
 EPB: Emulation prevention byte (as in the emulation_prevention_byte syntax element).
 ET: Encoding time.
 GPB: Generalized P/B – a not-particularly-well-chosen name for B pictures in which the two
reference picture lists are identical.
 HE: High efficiency – a set of coding capabilities designed for enhanced compression
performance (contrast with LC). Often loosely associated with RA.
 HEVC: High Efficiency Video Coding – the video coding standardization initiative under way in
the JCT-VC.
 HM: HEVC Test Model – a video coding design containing selected coding tools that constitutes
our draft standard design – now also used especially in reference to the (non-normative) encoder
algorithms (see WD and TM).
 IBDI: Internal bit-depth increase – a technique by which lower bit depth (8 bits per sample)
source video is encoded using higher bit depth signal processing, ordinarily including higher bit
depth reference picture storage (ordinarily 12 bits per sample).
 JM: Joint model – the primary software codebase developed for the AVC standard.
 LB or LDB: Low-delay B – the variant of the LD conditions that uses B frames.
 LC: Low complexity – a set of coding capabilities designed for reduced implementation
complexity (contrast with HE). Often loosely associated with LD.
 LCEC: Low-complexity entropy coding.
 LD: Low delay – one of two sets of coding conditions designed to enable interactive real-time
communication, with less emphasis on ease of random access (contrast with RA). Often loosely
associated with LC. Typically refers to LB, although also applies to LP.
 LM: Linear model.
 LP or LDP: Low-delay P – the variant of the LD conditions that uses P frames.
 LUT: Look-up table.
 MC: Motion compensation.
 MDDT: Mode-dependent directional transform.
 MPEG: Moving picture experts group (WG 11, the parent body working group in ISO/IEC
JTC 1/ SC 29, one of the two parent bodies of the JCT-VC).
 MRG: block merging mode for CUs.
 MV: Motion vector.
 NAL: Network abstraction layer (as in AVC).
 NB: National body (usually used in reference to NBs of the WG 11 parent body).
 NSQT: Non-square quadtree.
 NUT: NAL unit type (as in AVC).
 OBMC: Overlapped block motion compensation.
 PCP: Parallelization of context processing.
 PIPE: Probability interval partitioning entropy coding (roughly synonymous with V2V for most
discussion purposes, although the term PIPE tends to be more closely associated with proposals
from Fraunhofer HHI while the term V2V tends to be more closely associated with proposals
from RIM).
 POC: Picture order count.
 PPS: Picture parameter set (as in AVC).
 QP: Quantization parameter.
 QT: Quadtree.
 RA: Random access – a set of coding conditions designed to enable relatively-frequent random
access points in the coded video data, with less emphasis on minimization of delay (contrast with
LD). Often loosely associated with HE.
 R-D: Rate-distortion.
 RDO: Rate-distortion optimization.
 RDOQ: Rate-distortion optimized quantization.
 RPLM: Reference picture list modification.
 ROT: Rotation operation for low-frequency transform coefficients.
 RQT: Residual quadtree.
 RVM: Rate variation measure.
 SAO: Sample-adaptive offset.
 SDIP: Short-distance intra prediction.
 SEI: Supplemental enhancement information (as in AVC).
 SPS: Sequence parameter set (as in AVC).
 TE: Tool Experiment – a coordinated experiment conducted after the 1st or 2nd JCT-VC
meeting.
 TM: Test Model – a video coding design containing selected coding tools; as contrasted with the
TMuC, see HM.
 TMuC: Test Model under Consideration – a video coding design containing selected proposed
coding tools that are under study by the JCT-VC for potential inclusion in the HEVC standard.
 TPE: Transform precision extension.
 UDI: Unified directional intra.
 Unit types:
o CU: coding unit.
o LCU: (formerly LCTU) largest coding unit (synonymous with TB).
o PU: prediction unit, with four shape possibilities.
 2Nx2N: having the full width and height of the CU.
 2NxN: having two areas that each have the full width and half the height of the
CU.
 Nx2N: having two areas that each have half the width and the full height of the
CU.
 NxN: having four areas that each have half the width and half the height of the
CU.
o TB: tree block (synonymous with LCU – LCU seems preferred).
o TU: transform unit.
 V2V: variable-length to variable-length prefix coding (roughly synonymous with PIPE for most
discussion purposes, although the term PIPE tends to be more closely associated with proposals
from Fraunhofer HHI while the term V2V tends to be more closely associated with proposals
from RIM).
 VCEG: Visual coding experts group (ITU-T Q.6/16, the relevant rapporteur group in ITU-T
WP3/16, which is one of the two parent bodies of the JCT-VC).
 WD: Working draft – the draft HEVC standard corresponding to the HM.
 WG: Working group (usually used in reference to WG 11, a.k.a. MPEG).
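The four PU shapes defined in the unit-type entries above can be illustrated geometrically. The sketch below assumes a square CU of even size, coordinates relative to the top-left corner of the CU, and a top-to-bottom, left-to-right ordering of the areas; the function name `pu_partitions` and these conventions are illustrative and are not taken from the WD or the HM software.

```python
def pu_partitions(cu_size, shape):
    """Return the PU areas of a square CU as (x, y, width, height) tuples.

    cu_size is the CU width and height (2N) in samples; shape is one of
    "2Nx2N", "2NxN", "Nx2N", or "NxN" as defined in the terminology list.
    """
    if cu_size <= 0 or cu_size % 2:
        raise ValueError("cu_size must be a positive even number")
    half = cu_size // 2  # this is N
    if shape == "2Nx2N":
        # One area with the full width and height of the CU.
        return [(0, 0, cu_size, cu_size)]
    if shape == "2NxN":
        # Two areas, each with the full width and half the height.
        return [(0, 0, cu_size, half), (0, half, cu_size, half)]
    if shape == "Nx2N":
        # Two areas, each with half the width and the full height.
        return [(0, 0, half, cu_size), (half, 0, half, cu_size)]
    if shape == "NxN":
        # Four areas, each with half the width and half the height.
        return [(x, y, half, half) for y in (0, half) for x in (0, half)]
    raise ValueError(f"unknown PU shape: {shape}")
```

For example, `pu_partitions(16, "2NxN")` yields an upper and a lower 16x8 area. In every case the areas tile the CU exactly.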

1.11 Liaison activity


The JCT-VC did not send or receive formal liaison communications at this meeting.

1.12 Opening remarks


No particular non-routine opening remarks were recorded.
qq S. Wenger volunteered to help with document archive coordination.
qq Discussion of desirability of reflector discussions.
qq Discussion of CE internal discussions.

1.13 Contribution topic overview


The approximate subject categories and quantity of contributions per category for the meeting were
summarized and categorized into "tracks" (A, B, or P) for "parallel session A", "parallel session B", or
"Plenary" review, as follows. Discussions on topics categorized as "Track A" were primarily chaired by
Jens-Rainer Ohm, and discussions on topics categorized as "Track B" were primarily chaired by Gary
Sullivan.
Note: Counts may not be 100% precise.
 AHG reports (22) Track P (section 2) [BoG on chroma format]
 Project development, status, and guidance (0) Track P (section 3)
 CE summary reports (13) – Reviewed in plenary or tracks
 CE1: Entropy coding investigation (26) Track A (section 4.1)
 CE2: Motion partitioning and OBMC (6) Track A (section 4.2)
 CE3: Motion compensation (33) Track A (section 4.3)
 CE4: Quantization (45) Track B (section 4.4) [BoG, further CE planned]
 CE5: CAVLC entropy coding improvement (13) Track B (section 4.5)
 CE6: Intra prediction improvement (48) Track B (section 4.6)
 CE7: Additional transforms (14) Track B (section 4.7)
 CE8: Non-deblocking loop filtering (36) Track A (section 4.8)
 CE9: MV coding and skip/merge operation (30) Track A (section 4.9)
 CE10: Core transform design (8) Track B (section 4.10) [CE suggested]
 CE11: Coefficient scanning and coding (15) Track B (section 4.11)
 CE12: Deblocking filter (38) Track A (section 4.12)
 CE13: Motion data parsing robustness and throughput (17) Track A (section 4.13)
 Clarifications and bug fix issues (1) Track P (section 5.1)
 HM settings and common test conditions (8) Track P (section 5.2)
 Source video test material (1) Track P (section 5.3)
 Functionalities (10) Track A (section 5.4)
 Loop filtering (85) Track A (section 5.5)
 Block structures and partitioning (22) Track A (section 5.6)
 Motion compensation operation and interpolation filters (31) Track A (section 5.7)
 Motion and mode coding (81) Track A (section 5.8) incl. parsing robustness (previous 5.17)
 High-level syntax and slice structure (54) Track B (section 5.9) [BoG on RPS]
 Quantization (35) Track B (section 5.10)
 Alternative coding modes (22) Track A (section 5.11)
 Entropy coding (38) Track A (section 5.12)
 Transform coefficient coding (60) Track B (section 5.13) [BoG J. Sole on NSQT Harm., plan
CE11]
 Intra prediction and mode coding (72) Track B (section 5.14)
 Transforms (31) Track B (section 5.15)
 IBDI and memory compression (2) Track A (section 5.16)
 Complexity assessment (3) Track A (section 5.17)
 Encoder optimization (6) Track P (section 5.18)
 Category not clear (1 to be resolved) (section 5.20)
Overall Track P: 59; Track A: 428; Track B: 439

2 AHG reports
The activities of ad hoc groups that had been established at the prior meeting are discussed in this section.

JCTVC-G001 JCT-VC AHG Report: Project Management (AHG 1) [G. J. Sullivan, J.-R.
Ohm (AHG chairs)]
This document reports on the work of the JCT-VC ad hoc group on Project Management.
The work of the JCT-VC overall has proceeded well in the interim period. A large amount of discussion
was carried out on the group email reflector. All report documents from the preceding meeting have been
made available at the ITU-based JCT-VC site (http://ftp3.itu.int/av-arch/jctvc-site/2011_07_F_Torino) or
the new "Phenix" site (http://phenix.it-sudparis.eu/jct/), particularly including the following:
 The meeting report (JCTVC-F800)
 The HM 4 encoder description (JCTVC-F802)
 The HEVC Working Draft (JCTVC-F803)
 Common HM test conditions and software reference configurations (JCTVC-F900)
 Finalized core experiment descriptions (JCTVC-F901 through JCTVC-F913)
Additional important current JCT-VC documents are noted as follows:
 HEVC software guidelines (JCTVC-F688)
 HEVC Reference Software Manual (JCTVC-F634)
The various ad hoc groups and tool experiments have made progress, and various reports from those
activities have been submitted.
Since the approval of software copyright header language at the March 2011 parent-body meetings, this
topic seems to be resolved.
No major news has been received regarding future meeting plans, etc.
No particular problems were noted with the produced outputs in this discussion.

JCTVC-G002 JCT-VC AHG report: HEVC Draft and Test Model editing (AHG 2) [B.
Bross, K. McCann, W.-J. Han, J.-R. Ohm, S. Sekiguchi, G. J. Sullivan, T.
Wiegand]
One draft of JCTVC-F802 and six drafts of JCTVC-F803 were published by the Editing AHG between
the 6th JCT-VC meeting in Torino (14-22 July, 2011) and the 7th Meeting in Geneva (21-30 November,
2011). JCTVC-F802 still needs significant further improvement, whilst the final draft of JCTVC-F803 is
reasonably complete.
The main changes in JCTVC-F803, relative to the previous JCTVC-E603, were listed. Some specific
open issues remaining for JCTVC-F803 were also noted.
NSQT integration was particularly difficult. A mismatch was reported between the software and the text
submitted by the proponents. It was noted that there are relevant input contributions to address this.
In the discussion, the importance of confirming the correctness of text when performing cross-checking of
proposals was emphasized.
CAVLC proposals not yet integrated into WD.
Tiles, wavefronts, and weighted prediction had not yet been integrated, not necessarily due to problems
with those aspects, but rather due to the scheduling of other activities that preceded it in integration order.
The work was prioritized to first integrate aspects likely to affect coding efficiency behaviour.
The general list of key HEVC issues that need to be addressed was identified to be:
 Entropy coding architecture (see AHG9)
 Transform and dynamic range (see AHG5 and AHG7)
 Picture buffering and high-level syntax (see AHG21)
 Picture resolution adaptation (see AHG18)
 Non-4:2:0 colour formats (see AHG20)
 10 bit vs. 8 bit decoding capability
 Simplification of MV coding
 In-loop filtering clean-up
 Profiles and Levels
 Parallel processing clean-up

A particular recommendation was to ensure that, when considering the addition of new tools to HEVC,
properly drafted text for addition to both the HEVC Working Draft and the HM Test Model (if
appropriate) is made available in a timely manner.

NSQT: Deviation between text and software, more difficult to integrate than anticipated
CAVLC proposals not integrated yet
Tools from H4.1 are not yet included (not due to technical problems):
- Tiles and wavefront
- Weighted prediction
Encoder description should become mandatory

JCTVC-G003 JCT-VC AHG report: Software development and HM software technical
evaluation (AHG 3) [F. Bossen, D. Flynn, K. Suehring] [miss]
Reported verbally prior to upload
First version 4.0 (primarily items affecting coding efficiency) – was integrated quickly, and individual
tools were re-tested to confirm that the results were as expected. This generally worked well.
As noted above, NSQT integration was particularly difficult.
The encoder behaviour for AMP was noted to be different than what may have been expected, as further
discussed in CE9.
The next version (4.1) was developed more slowly than expected, particularly due to difficulty of
integrating the tiles and wavefronts features. Weighted prediction was also included in the 4.1 version.
The re-testing to confirm the intended behaviour was reported to not yet have been completed. Some drop
in coding efficiency was reported for the 4.1 version – not so much – perhaps 0.5%, but this should be
investigated to determine why.
The software coordinator indicated concerns about the quality of some of the software submissions.
The coordinator emphasized the importance of filing bug reports (and providing the associated fixes) for
tracking any observed problems with the software.
It was noted that in addition to testing for the behaviour when using the common conditions, it is also
important to test with other conditions (e.g. higher bit rates).
In the discussion, it was noted that slice boundary padding behaviour for ALF had changed, and that the
previous behaviour may be better to use. Another participant remarked that the change had been made in
order to make the software match the text. See below for action on this.
The software coordinator remarked that the ALF software was particularly poor quality code, and needs
lots of work. He also said that the deblocking filter code has substantial problems. This was
the subject of some BoG work and also put into AHG plans.

Cross-checkers of software should confirm that the WD text matches software (NSQT case). One issue
related to AMP (see CE9), where encoder behaviour was different than expected.
4.0 developed as planned, but 4.1 (particularly tiles and wavefront) more difficult to implement than
expected. (also loses approx. 0.5% due to overhead)
Concerns about quality of some delivered submissions
Any bug reports should be filed by ticket, not just verbally. Experts are encouraged/urged not only to
report bugs, but also to contribute fixing them.

Slice boundary padding order was changed from 4.0 to 4.1 (mismatch between text and software). Some
concern was raised regarding whether this change was appropriate. This was discussed again on Nov. 29.
Decision (SW): Revert software and WD to 4.0 version of the padding.
In general, ALF and de-blocking not in good shape.

JCTVC-G004 JCT-VC AHG report: Picture Partitioning and LCU scan order (AHG4) [R.
Sjöberg (AHG chair), Y. Chen, F. Henry, M. Horowitz, K. Kazui, A. Segall
(vice chairs)]
Main focus: Combination wavefront and tiles. No conclusion in reflector discussions. Could be a
profiling issue which combinations are allowed
Investigation on slice overhead (in HM 4.1) unveiled that nothing changed.

JCTVC-G005 JCT-VC AHG Report: Spatial Transforms (AHG 5) [P. Topiwala (AHG
Chair), M. Budagavi, R. Cohen, R. Joshi (vice chairs)]
The report should not list a “membership” of the AHG
Related CE7 / CE10
Problem with precision at low QP? To be further clarified

JCTVC-G006 JCT-VC AHG report: In-loop and post-processing filtering (AHG 6) [T.
Yamakage, K. Chono, Y. J. Chiu, I. S. Chong, M. Narroschke]
Recommends to study line buffer reduction jointly for all loop filters in a BoG
Recommends to discuss some CE related contributions (G211, G212, G656 and G691) in the context of
CE8 (also 499?)
Recommends to work on clean up of software and text

JCTVC-G007 JCT-VC AHG report: Transform dynamic range (AHG 7) [A. Segall
(Sharp), E. Alshina (Samsung)]
Results of email discussion:
 Dynamic range restriction should be defined for the dequantized coefficients.
 Dynamic range restriction should be defined after first inverse transform
 Dynamic range restriction should not be defined after second inverse transform, if input to second
inverse transform is restricted to 16-bits
 Dynamic range following the first transform can exceed 16-bit in the worst case
 Clipping is preferred to restrict dynamic range for the dequantized coefficients and after first
inverse transform.
Recommendations:
 Review input contributions at the 7th JCT-VC meeting
 Include dynamic range restrictions for dequantized coefficients and first inverse transform in
HEVC design
 Do not include dynamic range restriction following second inverse transform in HEVC design
Two experts comment that limiting at the output of the second transform could also be considered, but
according to previous discussions might not have relevant benefit.
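
For illustration only, the two clipping points recommended above could be sketched as follows. This is a minimal sketch under stated assumptions: the function names, the 16-bit default and the `first_inverse_transform` callable are hypothetical and are not taken from the WD text or the HM software.

```python
def clip_to_bit_depth(value, bits=16):
    """Clip a signed integer to the range representable in `bits` bits."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


def restrict_dynamic_range(dequantized, first_inverse_transform):
    """Clip dequantized coefficients, then clip the first-stage output.

    `first_inverse_transform` is a hypothetical callable standing in for
    the first stage of the separable inverse transform. No clipping is
    applied after the second stage, in line with the AHG recommendation.
    """
    clipped_coeffs = [clip_to_bit_depth(c) for c in dequantized]
    intermediate = first_inverse_transform(clipped_coeffs)
    return [clip_to_bit_depth(v) for v in intermediate]
```

With a toy two-point butterfly as the first stage, an out-of-range input is clipped once before and once after that stage.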

JCTVC-G008 JCT-VC AHG report: Reference pictures memory compression (AHG 8) [K.
Chono (chair), T. Chujoh, D. Hoang, C. S. Lim, A. Tabatabai, M. Zhou (vice-
chairs)]
HM 4 slightly increased memory access bandwidth mostly in LD settings (most likely due to other
reference picture structure).
10 bit vs. 8 bit gives approx. 2.5-2.8% luma bit rate reduction, whereas memory bandwidth increases by 35% on
average and 50% maximum (probably due to the relatively small access block sizes investigated and byte
alignment).

JCTVC-G009 JCT-VC AHG report: Entropy Coding Architecture (AHG 9) [K. McCann
(chair), A. Fuldseth, D. Marpe, A. Segall, K. Sugimoto, V. Sze, W. Wan, X.
Wang (vice chairs)]
Switchable (2 operating points) vs. scalable (multiple operating points).
Recommendations:
 HEVC should include only a single entropy coding technology with a single operating point
unless adding a second option provides a significantly different performance/complexity trade-off
which substantially facilitates the use of HEVC in a class of applications for which it would
not otherwise provide an appropriate solution
 JCT-VC should analyse input contributions relating to entropy coding with the aim of making a
decision on the HEVC entropy coding architecture during the 7th JCT-VC meeting
Complexity (both hardware and software) is difficult to quantify.

JCTVC-G010 JCT-VC AHG report: Quantization (AHG 10) [M. Budagavi, M.
Karczewicz, G. Martin-Cocher, K. Sato]
Quantization-related contributions are categorized as follows:
- CE4 (general)
- CE4 subtest1: QP coding
- CE4 subtest2: De-quantization Offset
- CE4 subtest3: Quantization Matrices
- Non-CE4: QP Coding
- Non-CE4: De-quantization Offset
- Non-CE4: Quantization Matrices
The AHG recommends reviewing all relevant input contributions.

JCTVC-G011 JCT-VC AHG Report: Video test material selection (AHG 11) [T. Suzuki]
Under-represented: High bit depth, 4:4:4.
Sufficient variety of noise conditions?
Compressed material in class E
No new test material

JCTVC-G012 JCT-VC AHG report: Complexity assessment (AHG 12) [D. Alfonso (AHG
chair), J. Ridge, X. Wen (vice-chairs)]
Considering that complexity assessment experiments are almost totally executed in the scope of other Ad-
Hoc Groups and Core Experiments, rather than in this group, the chairs propose to discontinue the
activity of the Complexity Assessment AHG at the current meeting.
Pankaj T. expresses thanks to Daniele for contributions in CE10.

JCTVC-G013 JCT-VC AHG report: Screen Content Coding (AHG 13) [O. Au, J. Xu, H.
Yu (AHG chairs)]
Results with new screen content sequences with transform skipping (various input docs)
Is this AHG still needed?

JCTVC-G014 JCT-VC AHG Report (AHG14) [Stephan Wenger]


Error burst patterns with 3, 5, 7, 10% packet loss rate (see G150)
The decoder no longer crashes when a NAL packet is lost
Discussion: Adopt conditions of burst patterns? Even if they are not perfect, it would at least be more
useful than having each proponent of error resilience tools use their own. Discuss in context of G150.

JCTVC-G015 JCT-VC AHG report: High-level syntax (AHG 15) [Y. -K. Wang (chair), J.
Boyce, Y. Chen, M. M. Hannuksela, K. Kazui, T. Schierl, R. Sjöberg, T. K.
Tan, W. Wan (vice chairs)]
Include abstract & recommendations (no discussion)

JCTVC-G016 JCT-VC AHG report: Padding process (AHG 16) [V. Wahadaniah, K.
Chono, Y. Lin]
4 input docs (related to various aspects of padding in context of intra prediction)

JCTVC-G017 JCT-VC AHG Report: Scalable coding investigation (AHG 17) [J. Boyce, J.
Kang, K. Minoo, W. Wan, Y.-K. Wang]
Include abstract & recommendations (no discussion)

JCTVC-G018 JCT-VC AHG report: Resolution adaption (AHG 18) [T. Davies (AHG
chair), P. Topiwala, P. Wu (Vice-chairs)]
qq Potential synergy with scalability.
qq Usage for computational load management
Relation with scalability? Possibly
How to measure? PSNR is not useful across resolutions
Resolution adaptation is not only about better subjective quality, but also complexity adjustment.

JCTVC-G019 JCT-VC AHG Report: Transform Skipping (AHG19) [M. Mrak (AHG
chair), J. Sole, I.-K. Kim, J. Xu, H. Yu (vice chairs)]
The recommendations of the AHG are
 To study feasibility and effectiveness of integrated transform skipping - related proposals

 To discuss transform skipping options as default parameters for HM
 To study further transform skipping harmonization with NSQT tools
Some experts express interest, better understanding (e.g. complexity of entropy coding) – see docs 575,
577, 663.

JCTVC-G020 JCT-VC AHG report: Chroma format support (AHG 20) [David Flynn,
Dzung Hoang, Ken McCann]
Not much activity. Input doc (G967, G862) on the topic - discuss in breakout (D. Flynn).

JCTVC-G021 JCT-VC AHG report: Reference picture buffering and list construction
(AHG21) [D. Flynn, R. Sjöberg (AHG chairs), Y. Chen, T.K. Tan, W. Wan,
Y.-K. Wang (vice chairs)]
Open issues: long-term pictures, filling of reference picture lists, CRA issue.
qq Question re PPS versus APS usage – intent is to avoid sending the RPS in the slice header – wasn't
sure of what APS would ultimately be.
qq Question re detection of lost pictures
BoG [YKW & RS] -> Later, J. Boyce.
WD text and software were developed and agreed (via reflector). Only loss of complete pictures is
supported.
Addressing of long-term pictures, construction of list and pictures following a CRA (can they reference
pictures before the CRA) are open issues.
Discussion: should the RPL information be in PPS? Or rather APS?
Lambda was adjusted – this could be of concern when used in CEs.
Source code not fully aligned with WD text
15 related input contributions (most build on top of the AHG WD text)
AHG recommendations:
 JCT-VC to review the candidate WD text on picture buffer management and consider it for adoption
 To use the HM-4.0-dev-ahg21-picbuffer source code for comparisons in picture buffer management
proposals
 JCT-VC to review all picture buffer management and list construction related input documents.
BO: R. Sjoberg, YK Wang -> Later, J. Boyce

JCTVC-G022 JCT-VC AHG report: Lossless Coding (AHG22) [W. Gao (chair), K. Chono,
J. Xu, M. Zhou (vice chairs)]
4 contributions on lossless coding (092, 093, 268, 664) – CE? Harmonization?
Locally lossless mode should also be considered (LCU level)

3 Project development, status, and guidance


See the section discussing functionalities for relevant contributions.

4 Core experiments
4.1 CE1: Entropy coding investigation (Track A)

4.1.1 Summary

JCTVC-G031 CE1: Summary report of core experiment on entropy coding [R. Joshi, E.
Alshina, H. Sasai, H. Kirchhoffer, J. Lainema (CE coordinators)]
Subtest A
a) Delayed probability update (576, 349)
Proponent | Description | BD-rate (Y) | BD-rate (U) | BD-rate (V)
Qualcomm JCTVC-G576 | Delay 1 bin (all syntax elements) | 0.1% | 0.1% | 0.2%
Qualcomm JCTVC-G576 | Delay 2 bins (all syntax elements) | 0.6% | 0.4% | 0.5%
Qualcomm JCTVC-G576 | Delay 3 bins (all syntax elements) | 1.0% | 0.4% | 0.7%
Panasonic JCTVC-G349 | Delay all coefficient coding parameters until end of block | 0.2% | 0.1% | 0.1%
Panasonic JCTVC-G349 | Delay all coefficient coding parameters except for "significant_coeff_flag" parameters until end of block | 0.1% | -0.1% | -0.1%
Panasonic JCTVC-G349 | Delay "last_significant_coeff_x" and "last_significant_coeff_y" parameters until end of block | 0.0% | -0.1% | -0.1%
Panasonic JCTVC-G349 | Delay "coeff_abs_level_greater1_flag" and "coeff_abs_level_greater2_flag" parameters until end of block | 0.1% | 0.0% | 0.0%
Panasonic JCTVC-G349 | Delay "significant_coeff_flag" parameters until end of block | 0.2% | 0.1% | 0.2%
Comment by one expert: It should be observed if the delayed update affects the probability
model and estimation (may be implementation specific and not be critical for the 1 bin delay
case).
Another comment: In hardware, delaying the update may not help to increase the throughput.
Particularly, delaying more than one bin produces unacceptable losses.
Revisited after presentation of other contributions that target increase of CABAC throughput.
For the one bin delay case, some doubt is expressed by other experts that it would help
increase the throughput. G349 (updating probabilities of some syntax elements at the end of
TC block) is an even less systematic approach and decreases regularity. No action.
b) Line buffer reduction (200, 769)
Proponent           | Description    | BD-rate, Y | BD-rate, U | BD-rate, V
MediaTek JCTVC-G200 | Split LCU      | 0.0%       | 0.0%       | -0.1%
MediaTek JCTVC-G200 | Skip LCU       | 0.0%       | 0.0%       | 0.0%
MediaTek JCTVC-G200 | Split&Skip LCU | 0.0%       | 0.0%       | -0.1%
Samsung JCTVC-G769  | Split CU       | 0.1%       | 0.0%       | 0.0%
Note: G200 uses 3 additional context models (6 instead of 3). In original contribution F060
Side activity to suggest common solution (proponents of G200, G769 and V. Sze and T. Nguyen)

JCTVC-G1022 On line buffer removal for CU split flag and skip flag context model [T.
Lee, J. Chen, J. Park (Samsung), T.-D. Chuang, C.-Y. Chen, Y.-W. Huang, S.
Lei (MediaTek), W.-J. Chien, M. Karczewicz (Qualcomm)] [late]
This document reports the results of line buffer removal in context models. By disabling dependency on
upper blocks on split flag and skip flag, the line buffer is totally removed in the current HM model. In this
contribution which is based on G769, G200 and G829, it is proposed to use the depth information of the
current block and the left block for the context formation of split_coding_unit_flag and to use the data of
the left block for the context formation of skip_flag when the upper block belongs to the upper LCU. It is
reported that the proposed context modeling for split flag and skip flag causes 0.01%, 0.05%, 0.12% BD
rate loss for, respectively, all intra, random access and low delay configuration.
Some concern was expressed whether the loss in LD B is too high. It is mentioned that the loss in
LD B is mainly from class E (0.31%) and class B (0.11%).
The loss might be higher when a smaller LCU size is used.
The benefit by saving the line buffer is small
The context models are re-designed and were trained (which is not the case for the current HM).
It is reported that this training gave approx. 0.02% gain
Several experts express concerns whether this is a good trade-off between complexity reduction
and loss. No action.

Subtest B (764)
Summary of tests results (single parameter probability update model):
            | All Intra HE          | Random Access HE      | Low delay B HE
            | Y     | U     | V     | Y     | U     | V     | Y     | U     | V
Class A     | -0.8% |  0.0% |  0.0% | -0.8% |  0.8% |  1.5% |       |       |
Class B     | -0.6% | -0.7% | -0.6% | -0.5% | -0.2% |  0.0% | -0.4% |  0.2% |  0.0%
Class C     | -0.6% | -0.9% | -0.8% | -0.4% | -0.6% | -0.3% | -0.4% | -0.3% | -0.5%
Class D     | -0.7% | -1.5% | -1.5% | -0.4% | -0.6% | -1.1% | -0.4% | -0.4% | -0.6%
Class E     | -0.6% | -0.8% | -0.9% |       |       |       | -0.1% | -1.5% | -0.6%
Overall     | -0.7% | -0.8% | -0.8% | -0.6% | -0.2% |  0.0% | -0.3% | -0.4% | -0.4%
Enc Time[%] | 101%                  | 101%                  | 100%
Dec Time[%] | 101%                  | 100%                  | 100%

Summary of tests results (multi-parameter probability update model):


            | All Intra HE          | Random Access HE      | Low delay B HE
            | Y     | U     | V     | Y     | U     | V     | Y     | U     | V
Class A     | -1.1% | -0.7% | -0.7% | -1.2% | -0.2% | -0.3% |       |       |
Class B     | -0.9% | -1.1% | -1.2% | -1.0% | -0.7% | -0.7% | -0.8% | -0.8% | -1.1%
Class C     | -0.9% | -1.2% | -1.2% | -0.9% | -0.9% | -0.9% | -0.8% | -0.7% | -1.2%
Class D     | -0.9% | -1.7% | -1.7% | -0.9% | -1.1% | -1.3% | -0.8% | -1.0% | -1.0%
Class E     | -0.9% | -1.8% | -1.8% |       |       |       | -0.9% | -2.0% | -0.4%
Overall     | -0.9% | -1.3% | -1.3% | -1.0% | -0.7% | -0.8% | -0.8% | -1.1% | -0.9%
Enc Time[%] | 103%                  | 103%                  | 102%
Dec Time[%] | 101%                  | 101%                  | 102%

Note: G326, G413, G547 propose similar approaches but most likely in better implementation.
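
As background to the two result tables above, the difference between a single-parameter and a multi-parameter probability update can be sketched as follows. This is a hedged illustration of the general idea (one adaptation rate versus the average of a fast and a slow estimate), not the method of JCTVC-G764 itself: the 15-bit fixed-point format, the shift values and all function names are assumptions made for this sketch.

```python
ONE = 1 << 15  # assumed fixed-point representation of probability 1.0


def single_rate_update(p, bin_value, shift=5):
    """Move the estimate of P(bin == 1) towards the observed bin.

    A larger shift means slower adaptation (longer effective window).
    """
    target = ONE if bin_value else 0
    return p + ((target - p) >> shift)


def multi_rate_update(state, bin_value, fast_shift=4, slow_shift=7):
    """Update a (fast, slow) pair of estimates with the same bin."""
    fast, slow = state
    fast = single_rate_update(fast, bin_value, fast_shift)
    slow = single_rate_update(slow, bin_value, slow_shift)
    return fast, slow


def multi_rate_probability(state):
    """Probability used for coding: the average of the two estimates."""
    fast, slow = state
    return (fast + slow) >> 1
```

The fast estimate tracks local changes in the statistics while the slow one is more precise for stationary sources; averaging them is one way to combine both properties, at the cost of extra state per context.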

Subtest C
Test results summary (BD-rate difference is averaged across AI, RA, and LD, class F is not included):
Test | Proponent | Description | HE BD-rate (Y) | HE BD-rate (U+V)/2 | LC BD-rate (Y) | LC BD-rate (U+V)/2
   |  | BAC, LC configuration |  |  | -5.99% | +4.47%
1  | HHI (JCTVC-G633) | BAC, LCmod, 8-bit init |  |  | -0.45% | +8.78%
2  | HHI (JCTVC-G633) | V2V, LCmod, Multi-bin, 8-bit init, TBC |  |  | -0.45% | +7.80%
3  | HHI (JCTVC-G633) | V2V, LCmod, Multi-bin, Low delay, 8-bit init, TBC |  |  | -0.14% | +8.13%
4  | HHI (JCTVC-G633) | V2V, 8-bit init, TBC | +0.14% | -0.74% | -5.85% | +3.78%
5  | HHI (JCTVC-G633) | V2V, LowDelay, 8-bit init, TBC | 0.32% | -0.58% |  |
6  | HHI (JCTVC-G633) | BAC, 8-bit init | -0.24% | -0.18% | -6.20% | 4.45%
7  | HHI (JCTVC-G633) | BAC, 8-bit init, Alt. PMU, TBC | -0.79% | -0.58% |  |
8  | Mitsubishi (JCTVC-G458) | V2F, 8-bit init |  |  | -3.27% | 7.44%
9  | Mitsubishi (JCTVC-G458) | V2F, LowDelay, 8-bit init |  |  | -3.27% | 7.44%
10 | Mitsubishi (JCTVC-G458) | V2F, MC mod., LowDelay, 8-bit init |  |  | -3.04% | 9.48%
11 | Samsung/HHI (JCTVC-G771) | V2V, 8-bit init, Alt. PMU, TBC |  |  |  |
12 | Cisco (JCTVC-G210) | BAC, RDOQ off, RDO PMU off |  |  | -5.20% | -0.16%
13 | Cisco (JCTVC-G233) | BAC, RDOQ off, PMU off |  |  | 5.99% | 4.63%
14 | HHI (JCTVC-G633) | BAC, LCmod, 8-bit init, RDOQ off |  |  | -0.31% | +2.63
15 | HHI (JCTVC-G633) | V2V, LCmod, 8-bit init, Multi-bin, RDOQ off |  |  | -0.27% | +1.97
16 | withdrawn
17 | Docomo (JCTVC-G763) | BAC, TBC | 0.0% | 0.0% | -5.96% | +4.52%
Note: G837, G155 also suggest new 8 bit initialization – further investigation.
Test 1 is using HM replacing CAVLC vs. CABAC in LC
Not clear: Are the test 13 results relative to test 1?

Test | V2F | V2V | BAC | LC mod. | MC mod. | PMU off | Multi bin | Low delay | 8-bit Init | Alt. PMU | RDOQ off | RDO PMU off | TBC | Config. tested
1 x x n/a x LC
2 x x x x x LC
3 x x x x x x LC
4 x x x HE, LC
5 x x x x HE
6 x n/a x HE, LC
7 x n/a x x x HE
8 x x LC
9 x x x LC
10 x x x x LC
11 x x x x HE
12 x x x LC
13 x x x LC
14 x x n/a x x LC
15 x x x x x x LC
16 x x x x x LC
17 x x HE, LC

Conclusions of subtest C:
- No further consideration on V2V and V2F (see below)
- One entropy coder? (looking at tests 12 and 13): Whereas 12 is encoder only and loses only
slightly compared to the test 1 case, 13 also changes the decoder.
Two experts mention that runtime is not the only issue; throughput should also be considered, which may be a
problem with CABAC (G569 addresses this issue by having “not two but 1.2” entropy coders).

Decision: Only one entropy coder (current CABAC as start, may be further optimized in terms of
throughput, performance etc.) (Reported to plenary for confirmation Tue morning)

4.1.2 Contributions

Subtest A

JCTVC-G200 CE1.A.3: Reducing line buffers for CABAC [T.-D. Chuang, C.-Y. Chen, Y.-
W. Huang, S. Lei (MediaTek)]
This contribution reports results of CE1.A.3. In HM-4.0, split_coding_unit_flag and skip_flag are the
only two CABAC syntax elements that still have dependency on upper LCUs and need line buffers. In
this contribution, it is proposed to use the depth information of the current block and the left block for the
context formation of split_coding_unit_flag and to use the data of the left block for the context formation
of skip_flag when the upper block belongs to the upper LCU. In this way, all CABAC line buffers can be
removed. It is reported that the proposed context modeling causes less than 0.08% bit rate increase.
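
The idea of removing the line buffer can be illustrated with a small context-selection sketch. The baseline rule here (one increment for each neighbour whose depth exceeds the current depth) is modelled on the HM-style derivation; the replacement rule for an above neighbour in the upper LCU (simply not consulting it) is an illustrative assumption and is not the exact rule proposed in JCTVC-G200. All names are hypothetical.

```python
def split_flag_context(cur_depth, left_depth, above_depth, above_in_upper_lcu):
    """Return a context index in {0, 1, 2} for split_coding_unit_flag.

    A neighbour depth of None means that the neighbour is unavailable.
    """
    ctx = 0
    if left_depth is not None and left_depth > cur_depth:
        ctx += 1
    # Assumption for this sketch: an above neighbour that lies in the
    # upper LCU is never read, so no depth values have to be stored
    # across LCU rows (no line buffer).
    if (not above_in_upper_lcu
            and above_depth is not None
            and above_depth > cur_depth):
        ctx += 1
    return ctx
```

Inside an LCU both neighbours still contribute; only blocks on the top row of an LCU lose the above-neighbour term, which is why the reported rate increase is small.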

JCTVC-G298 CE1: Cross-check report for CE1 Subset A MediaTek's proposal on
Reducing Line Buffers for CABAC [H. Sasai, T. Nishi (Panasonic)]

JCTVC-G349 CE1: SubsetA: Parallel context processing for coefficient coding using block-
based context updates [H. Sasai, T. Nishi (Panasonic)]
This contribution is a test report for JCTVC-E226 listed in CE1 Subset A. The proposed technique aims to
improve the throughput of the entropy coder for CABAC. The context updates make it difficult to increase
throughput due to their serial dependencies. The proposed modifications have been implemented in
HMv4 and their coding efficiencies were evaluated for the coefficient coding parameters
"last_significant_coeff_x", "last_significant_coeff_y", "significant_coeff_flag",
"coeff_abs_level_greater1_flag" and "coeff_abs_level_greater2_flag", respectively. The parallelization
capability of the proposal comes at a cost of less than 0.1% performance loss.

JCTVC-G471 CE1.A.2: Crosscheck for Panasonic's parallel context processing for
coefficient coding using block-based context updates in JCTVC-G349 [T.-D.
Chuang, Y.-W. Huang (MediaTek)]

JCTVC-G576 CE1: Delayed state update for CABAC [R. Joshi, J. Sole, M. Karczewicz
(Qualcomm)]
Context update is a known bottleneck in hardware implementation of a CABAC decoder. This is because
the state of the context is updated based on the decoded bin. Previous efforts to mitigate this problem
have included delaying the context update by one during transform coefficient coding and delaying the
update till the end of a TU. In this proposal we extend the state update delay to bins from all syntax
elements. Results are presented for update delays of 1, 2 and 3 bins. For update delays of 1, 2, and 3 bins,
average BD-rates of 0.1%, 0.6% and 0.9%, respectively, are reported for HE configurations.
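
The effect of a delayed state update can be illustrated with a toy model. The sketch below is an assumption-laden simplification: it uses a single floating-point probability estimate with an exponential update instead of the CABAC state machine, and it measures ideal code length rather than actual arithmetic coder output. The function name and parameters are hypothetical.

```python
import math
from collections import deque


def ideal_bits_with_delay(bins, delay, shift=5):
    """Ideal code length when each update is applied `delay` bins late.

    With delay == 0 the estimate is updated right after each bin, as in
    a conventional adaptive coder.
    """
    p_one = 0.5          # current estimate of P(bin == 1)
    pending = deque()    # bins coded but not yet used for an update
    total_bits = 0.0
    for b in bins:
        p_bin = p_one if b else 1.0 - p_one
        total_bits += -math.log2(p_bin)
        pending.append(b)
        if len(pending) > delay:
            oldest = pending.popleft()
            target = 1.0 if oldest else 0.0
            p_one += (target - p_one) / (1 << shift)
            # Keep the estimate strictly inside (0, 1).
            p_one = min(max(p_one, 1e-6), 1.0 - 1e-6)
    return total_bits
```

Because the estimate used for coding lags behind the data, longer delays cost more bits on skewed sources, which is qualitatively consistent with the increasing losses reported for delays of 1, 2 and 3 bins.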

JCTVC-G822 CE1: Cross-check for delayed probability up-date from Qualcomm by
Samsung [E. Alshina, J.H. Park] [late]

JCTVC-G769 CE1 subtest A: Line buffer removal for CU split flag context model [T. Lee,
J. Chen, J. H. Park (Samsung)]
This document reports the results of “Line buffer removal for CU split flag context model” method
proposed in document JCTVC-F497 within the context of CE1. In this proposed method, the context model for
split_coding_unit_flag is selected based on the value of the left block and the coding unit size, to avoid accessing
the context of the above block. Experiments show that 0.01%, 0.03%, 0.11% BD rate loss is observed
for, respectively, all intra, random access and low delay configuration by removing the line buffer for
split_coding_unit_flag.

JCTVC-G472 CE1.A.4: Crosscheck for Samsung's line buffer removal for CU split flag
context model in JCTVC-G769 [T.-D. Chuang, Y.-W. Huang (MediaTek)]

JCTVC-G763 CE1: Table-based bit estimation for CABAC [F. Bossen (DOCOMO
Innovations)]
In the RDO process of the HM, a block may be encoded multiple times using different modes before a
best mode is selected based on a rate-distortion criterion. When using CABAC, each of these encodings
uses the CABAC engine itself to count a number of bits. In this experiment the bit counting procedure is
simplified wherein bit counts are estimated using tables. This simplification does not impact rate-
distortion performance (all recorded luma BD-rate averages are 0.0%) while reducing the encoding time
by 1 to 5%.
Non-normative improvement – Decision (SW): Adopt.
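
The general idea of table-based bit estimation can be sketched as follows. This is not the table used in the HM or in JCTVC-G763: the number of probability states, the fixed-point scale and all names are assumptions chosen for illustration. The point is only that the encoder can sum precomputed costs instead of running the arithmetic coding engine for every RDO candidate.

```python
import math

PROB_BITS = 6        # assumed: 64 quantized probability states
SCALE = 1 << 15      # assumed fixed-point scale: 32768 units == 1 bit


def build_bit_table():
    """Precompute (cost of coding 0, cost of coding 1) for each state."""
    num_states = 1 << PROB_BITS
    table = []
    for state in range(num_states):
        p_one = (state + 0.5) / num_states
        cost_zero = round(-math.log2(1.0 - p_one) * SCALE)
        cost_one = round(-math.log2(p_one) * SCALE)
        table.append((cost_zero, cost_one))
    return table


BIT_TABLE = build_bit_table()


def estimate_bits(bins_with_states):
    """Estimate the bit count for an iterable of (state, bin_value)."""
    total = sum(BIT_TABLE[state][bin_value]
                for state, bin_value in bins_with_states)
    return total / SCALE
```

Since the estimate is only used for encoder mode decisions, the change is non-normative: the bitstream written by the real entropy coder is unaffected.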

Subtest B

JCTVC-G764 CE1 (subset B): Multi-parameter probability up-date for CABAC
[Alexander Alshin, Elena Alshina, JeongHoon Park (Samsung)]

JCTVC-G553 CE1: Cross-check of Samsung’s Multi-parameter probability model up-date
for CABAC (JCTVC-F254) [J. Stegemann (HHI)]

Subtest C

JCTVC-G210 CE1: Subtest 12, Entropy coding comparisons with simplified RDO [T.
Davies (Cisco)]
Various coding conditions are simulated with both CAVLC and CABAC. Under low-complexity (LC)
common conditions it is reported that CABAC provides gain between 5.4% and 6.2% (6.8% including
class F). The performance of the entropy coders is also investigated with two encoder restrictions: no
RDOQ, and no adaption during RDO mode search. These assumptions together reduce the gap between
CABAC and CAVLC by 0.775% (0.875% including class F) averaged across LC settings. When RDOQ
is off and RDO adaption is off the gap is reported to be between 5.0% and 5.5% for LC settings (4.9%
and 5.9% including class F).

JCTVC-G233 CE1: Subtest 13, Disabling CABAC PMU [T. Davies (Cisco)]
This contribution presents the results of Subtest 13 of Core Experiment 1 on disabling PMU in CABAC.
With RDOQ on, BD-Rate differences of 2.8%-6.2% are observed. With RDOQ off, BD-Rate differences
of 4.0%-7.3% are observed. It is noted that BD-Rate differences are very much higher for Class F
sequences, and that this is likely due to unsuitable initialization tables.

JCTVC-G765 CE1: Crosscheck of Subtest C - Tests 12 and 13 (JCTVC-G210 and JCTVC-
G233) [R. Joshi (Qualcomm)]

JCTVC-G458 CE1c: results of PIPE/V2F [K. Sugimoto, R. Hattori, A. Minezawa, S.
Sekiguchi (Mitsubishi)]
This contribution reports verification results on tests 8, 9, and 10 in CE1 Subtest C that evaluate
PIPE/V2F. The verification work was performed using software implementing PIPE/V2F on top of
HM-4.0. Reported performance results were obtained by building and running the provided software on
a 64-bit Linux platform. It is reported that overall BD-rate gain compared to CAVLC is 2.9%, 3.8%, 3.2%
and 3.2% for AI-LC, RA-LC, LB-LC and LP-LC settings respectively with around 5% and 1% encoder
and decoder run-time increase. It is also reported that the low delay tool for PIPE/V2F does not affect coding
performance and run-time. It is reported that the medium complexity configuration for context model
derivation gives around 0.3% bit increase with 1% decoding run-time improvement.
Note: Chroma BD rates are 10-15% worse
Request to replace CAVLC by this and keep CABAC
No support by other companies.

JCTVC-G435 CE1 subtest C: Cross-checking reports of PIPE/V2F for low complexity
entropy coding (JCTVC-F176) [A. Tanizawa, T. Shiodera, T. Yamakage
(Toshiba)]

JCTVC-G459 CE1c: crosscheck of PIPE/V2V, CABAC [K. Sugimoto, R. Hattori, A.
Minezawa, S. Sekiguchi (Mitsubishi)]

JCTVC-G633 CE1: Report of test results related to PIPE-based Unified Entropy Coding
[Heiner Kirchhoffer, Benjamin Bross, Anastasia Henkel, Detlev Marpe,
Tung Nguyen, Matthias Preiß, Mischa Siekmann, Jan Stegemann, Thomas
Wiegand (Fraunhofer HHI)]
This contribution reports results for tests related to the unified PIPE-based entropy coding using v2v
codes in CE1. Various combinations of the tools PIPE/v2v, BAC, LC modeling, low delay, 8-bit init, and
table-based bit counting were analyzed to identify the influence of the tools on BD rate and codec
runtime. Furthermore, hardware implementation aspects are analyzed for PIPE/v2v with low delay
constraint.
A new concept of “chunk interleaving” is presented (not in any previous proposal), which solves some of the
multiplexing and low-delay issues (for a pre-defined number of parallel encoders/decoders).
No analysis was provided of concrete statistics on the bins that can be processed in parallel. The main
limitation may come from the parser.
No support by other companies.

JCTVC-G753 CE1: Crosscheck of CE1 subtest C.6 (JCTVC-G633) proposed by HHI [X.
Zheng (HiSilicon)] [late]

JCTVC-G932 CE1: Cross-check for bug-fix version of test 12 in subset C from HHI
(JCTVC-G633) by Samsung [E. Alshina, J.H. Park (Samsung)] [late]

JCTVC-G072 CE1: Cross-check of Subtest C (Tests 1, 2, 4, 6, 17) [C. Yeo (I2R)]

JCTVC-G300 CE1: Cross-check report for CE1 Subset C Test 7 [H. Sasai, T. Nishi
(Panasonic)]

JCTVC-G612 CE1: Cross-check of test 9: Mitsubishi’s study on PIPE/V2F with LowDelay
buffer control [M. Siekmann (Fraunhofer HHI)]

JCTVC-G641 CE1: Cross-check of Subtest C (test 1 and test 11) and Subtest B - multi-
parameter probability update for CABAC (test 11) [Jinwen Zan, Dake He]
[late]

JCTVC-G766 CE1: Crosscheck of Subtest C - Test 5 (JCTVC-G633) [R. Joshi
(Qualcomm)]

JCTVC-G771 CE1 (subset C, test 11): Multi-parameter probability up-date for PIPE [A.
Alshin, E. Alshina, J.H. Park (Samsung), H. Kirchhoffer (HHI)]
In this contribution, a multi-parameter probability update technique is proposed for the relatively new
entropy coding scheme PIPE. It should be noted that the probability update for PIPE coincides with the
CABAC probability update. The proposed method allows more precise probability estimation for the current
bin, which means that a more accurate distribution between the different bin encoders becomes possible. In
terms of coding efficiency, the presented probability update technique on top of PIPE shows an average
BD-rate gain of about 0.4% in the HE configuration.
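
As a rough illustration of what a multi-parameter update can look like, the sketch below averages two exponentially decaying estimators with different adaptation rates. The two-estimator form, the window sizes (`fast_shift`, `slow_shift`) and all function names are assumptions made for illustration only; the update rule actually used in the contribution is not reproduced in these notes.

```python
def update(p, bin_value, shift):
    """One exponential-decay update of the estimate of P(bin == 1).

    The effective window length is 2**shift bins (assumed form).
    """
    return p + (bin_value - p) / (1 << shift)


def multi_parameter_estimate(bins, fast_shift=4, slow_shift=7, p_init=0.5):
    """Average a fast-adapting and a slow-adapting estimator.

    fast_shift and slow_shift are hypothetical values chosen for illustration.
    """
    p_fast = p_slow = p_init
    for b in bins:
        p_fast = update(p_fast, b, fast_shift)
        p_slow = update(p_slow, b, slow_shift)
    return (p_fast + p_slow) / 2
```

The fast estimator tracks local statistics quickly, while the slow one is less noisy for stationary sources; combining them is one plausible way to obtain a more precise estimate for the current bin.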

4.1.3 Discussion and Conclusions

4.2 CE2: Motion partitioning and OBMC

4.2.1 Summary

JCTVC-G032 CE2: Summary report of core experiment on Motion Partitioning and
OBMC [X. Zheng, I. S Chong, I.-K Kim]
This document summarizes the activities of CE2 related to Motion Partitioning and OBMC [1]. The
description of the experiment can be found in JCTVC-F902. In this CE, six subtests have been studied
and evaluated. Among those subtests, the proponents of subtests B.1 and C.1 released software before
13 September, and the other subtests were withdrawn.

4.2.2 Contributions

JCTVC-G517 CE2 subtest C.1: Harmonization of unified scan and NSQT [X. Zheng
(HiSilicon), Y. Yuan, Y. He (Tsinghua)]
This document provides a harmonization solution for unified scan and non-square quadtree transform
(NSQT). In the proposed solution, the non-square to square reordering process for transform coefficient
coding is removed. The experimental results show no coding loss under common test conditions.
Encoding and decoding times for the proposed solution are almost the same as for the HM4.0 anchor.
Consists of two parts: Harmonization of scans and solution for the divergence between SW and WD.
See also transform coefficient coding section.

JCTVC-G518 CE2 subtest C.1: Non-square quadtree (NSQT) with 2x8 and 8x2 transform
[Y. Yuan (Tsinghua), X. Zheng (HiSilicon), Y. He (Tsinghua)]
This document provides the results of non-square quadtree (NSQT) with 2x8 and 8x2 transforms. In the
proposed method, a 2x2 Hadamard-like transform is added to the HEVC framework. The tests under
common test conditions show that average gains of 0.0% for RA, 0.0% for RA_LC, 0.1% for LD_B,
0.1% for LD_B_LC, 0.1% for LD_P and 0.2% for LD_P_LC can be achieved. Combined with the non-square
Hadamard transform, average gains of 0.1% for RA, 0.2% for RA_LC, 0.3% for LD_B, 0.4% for
LD_B_LC, 0.3% for LD_P and 0.4% for LD_P_LC can be achieved. Compared to the HM4.0 anchor,
encoding and decoding times are almost unchanged.
2x8 and 8x2 were rejected last time, could be problematic in terms of memory access. Gain is relatively
small (0.1% without class F). G521 is an improvement by the proponents.

JCTVC-G571 CE2: Cross-check of Non-Square Quadtree Transform (NSQT) JCTVC-G517
& JCTVC-G518 [S. Oudin, B. Bross (Fraunhofer HHI)]
Criticism by cross-checker: Unclear description of NSQT in general and of unified scan; poor
implementation of software (unused variables).
Clarify: Is this rather a problem of the software implementation or of the WD text concept? Not fully
clear, may be both.
• It may not have been fully clear what was adopted last time.
• This came together with AMP, which also has some problems in SW implementation (see AHG3
report).
• A software bug was reported and a fix was provided that seemed to contain some new elements
(as documented in G517). Another fix is announced by the proponents to be available shortly.
• There are other proposals (by original proponents and other companies) which eventually are
using better concepts for signalling/partitioning the NS blocks. G517 was discussed again
in that context [That’s Track B].

JCTVC-G749 CE2: Overlapped Block Motion Compensation [L. Guo, I.S. Chong, X.
Wang, M. Karczewicz (Qualcomm)]
In this contribution, overlapped block motion compensation (OBMC) has been implemented and tested
on HM 4.0. OBMC is applied to 2NxN, Nx2N and AMP motion partitions. To limit the worst-case
memory bandwidth of OBMC, the fetching of extra pixels is disabled for bi-prediction PUs in 8x8 CUs.
The method achieved BD-rate reductions of 0.6%, 0.9% and 2.0% on average for RA-HE, LD-HE and
LDP-HE, respectively. For the LC tests, the average BD-rate reductions are 0.6%, 0.8% and 2.0% for
RA-LC, LD-LC and LDP-LC, respectively.
Additional memory access may not be an issue (also confirmed by the cross-checkers), but averaging
operation reduces computational throughput and adds complexity at decoder.
One expert mentions that a disabling flag should be implemented.
Three companies express negative opinions. No support by other companies. Consider discontinuation.
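
To make the averaging operation mentioned above concrete, the following sketch blends one row of prediction samples next to a partition boundary. The weights, the two-sample blending depth and the function name are assumptions for illustration; the notes do not record the weights actually used in JCTVC-G749.

```python
def obmc_blend_row(own, neighbour, weights=(0.25, 0.125)):
    """Blend prediction samples next to a partition boundary.

    own:        samples predicted with the partition's own MV; own[0] is the
                sample adjacent to the boundary.
    neighbour:  samples at the same positions, predicted with the MV of the
                neighbouring partition.
    weights[i]: weight given to the neighbour's prediction at distance i from
                the boundary (hypothetical values).
    """
    out = list(own)
    for i, w in enumerate(weights):
        if i >= len(out):
            break
        out[i] = (1 - w) * own[i] + w * neighbour[i]
    return out


# Example: a flat block of 100 next to a neighbour whose MV predicts 60.
print(obmc_blend_row([100, 100, 100, 100], [60, 60, 60, 60]))
```

Each blended sample needs a second motion-compensated prediction plus a weighted average, which is the kind of extra decoder work raised in the discussion.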

JCTVC-G432 CE2: Cross verification of Qualcomm's Overlapped Block Motion
Compensation (JCTVC-G749) [S. Lee, S. Cho, N. Eum (ETRI)]

JCTVC-G939 Non-CE2: Crosscheck for Qualcomm's overlapped block motion
compensation in JCTVC-G749 [C.-W. Hsu, Y.-W. Huang (MediaTek)] [late]

4.2.3 Discussion and conclusions

4.3 CE3: Motion compensation

4.3.1 Summary

JCTVC-G033 CE3: Summary report of Core Experiment on Motion Compensation
[T.Chujoh, E.Alshina (CE coordinators)]

The goal of this Core Experiment (CE) is to further investigate the following aspects of motion
compensation in HM:
• Simplification of interpolation MC and reduction of reference frame memory access bandwidth;
• Improve the trade-off between coding performance and complexity by MC optimization;
• Study complexity in terms of number of computations and memory bandwidth with actual hit-ratio
measurement;
• Study possible reduction of the worst case of memory bandwidth;
• Study accuracy of MV resolution for HM.
Fixed filters (G775, G696, G778, G427, G698, G699, G058, G062, G697): G697 (0.55% gain) gives the best
results; it is an extension of G778. For fixed filters, the problem of ripples (in both the spatial and
frequency domains) shall be solved, which is a problem with the current interpolation filters, particularly
when SAO is off. It is said that other (e.g. shorter) filters investigated also have some problems in this regard.
General agreement from a design point of view that it would be desirable to keep switching between
different filters as low as possible (except for the subpel positions), as it would complicate motion
estimation and compensation.
Adaptive resolution 1/2 | 1/4 | 1/8 (G535, G277, G727) gives mainly gain for LD P case and there only for
some sequences, but there are losses in other cases. G277 is the best proposal in this category.
Performance seems to be rather non-uniform over different sequences. Also no doubt that this increases
complexity (additional 1/8-pel filters, loading of coefficients). No support except by proponents.
AIF (G258, G259, G057, G063) mainly shows benefit when SAO is off (on average not more than
0.2..0.3% when SAO is on). Filter coefficients need to be changed/reloaded eventually per PU – clearly
increased complexity. No support except by proponents.
Memory bandwidth reduction (G390, G391, G392, G770, G780): G780 could be implemented in a
“non-normative” way if it was only applied for luma. G392c is also “non-normative”. The memory bandwidth
is said to be increased (see F487) compared to AVC (as luma uses 8-tap instead of 6-tap filters, which
increases it by about 40%, and chroma uses 4-tap instead of 2-tap filters, which increases it by a factor of 3).
It is agreed that the memory bandwidth of motion compensation is approx. 60+% higher than for AVC (may
need more study).
Note: “Non-normative” refers to decoder, but may need normative restrictions on the bitstream.
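
The tap-length figures quoted above can be sanity-checked with a simple count of reference samples fetched per block. This is a minimal sketch that assumes a separable filter with the same tap length in both directions and ignores memory alignment and cache effects; the block sizes in the example are illustrative choices, not values taken from F487.

```python
def fetch_samples(width, height, taps):
    """Reference samples needed to interpolate a width x height block
    at a 2D fractional position with a separable taps-tap filter."""
    return (width + taps - 1) * (height + taps - 1)


def fetch_ratio(width, height, taps_new, taps_old):
    """Ratio of fetched samples between two filter lengths."""
    return fetch_samples(width, height, taps_new) / fetch_samples(width, height, taps_old)


# Luma, 8x4 block: 8-tap versus 6-tap.
print(round(fetch_ratio(8, 4, 8, 6), 2))
# Chroma, 2x2 block: 4-tap versus 2-tap.
print(round(fetch_ratio(2, 2, 4, 2), 2))
```

Under these assumptions the luma ratio for an 8x4 block is 165/117 and the chroma ratio for a 2x2 block is 25/9, which is in line with the roughly 40% and factor-of-3 increases mentioned in the notes. Larger blocks give smaller ratios, which is why the worst case is driven by small PUs.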
It is agreed that memory bandwidth of motion comp is a problem particularly for 4K and beyond. It
would however be desirable not to modify tools for this purpose but rather use methods of level
restrictions, for example restrict usage of small PUs, restrict usage of 2D interpolation, restrict subpel
accuracy etc. Establish AHG on this issue.

Follow-up discussion in plenary Wed morning:
• It is confirmed that using different tools at different levels would not be desirable.
• How does this relate to the increased capability of parallelism in HEVC? Is it necessary at all?
Several experts argue that this is the case, as multi-core processors also often need to access the
same memory.
• For a similar purpose, 4x4 PUs were removed during the last meeting. Was this justified? (It
was answered that this only produced a bitrate increase of 0.1-0.2% on average, which was acceptable
to keep the consistency of the design over all levels.)
• It is also mentioned by S. Wenger that it may be necessary to reconsider the definition of levels
under the paradigm of increased parallelism.

4.3.2 Contributions

JCTVC-G057 CE3: An Adaptive Interpolation Filtering Technique [F. Kossentini, N.
Mahdi, H. Guermazi, M. Horowitz (eBrisk Video Inc.)]
In this contribution, an Adaptive Interpolation Filtering (AIF) technique is proposed for interpolation
filtering of the luminance samples of video sequences. For each video picture, the encoder first generates
two new one-dimensional 8-tap filters: one ½-pel filter and another (¼-pel, ¾-pel) filter, also called a
¼-pel filter. Then, the encoder decides, for each reference picture, whether to use the HM 4.0 [1] default
½-pel filter or the new ½-pel filter, and whether to use the HM 4.0 [1] default ¼-pel filter or the new ¼-pel
filter. The encoder then sends the resulting two-bit value of the luma_filter_mode field to the decoder.
Compared to HM4.0, the proposed technique yields average BD-rate reductions of 2.5% for LDP/LC,
0.5% for LDP/HE, 0.8% for LD/LC, 0.0% for LD/HE, 0.4% for RA/LC and 0.0% for RA/HE, while
maintaining the same average decoding times. The encoding time, however, increases by an average of
4%.
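
The two per-reference-picture decisions described above map naturally onto a two-bit field. The sketch below shows one possible packing; the bit assignment (high bit for the ½-pel choice, low bit for the ¼-pel choice) and the helper names are assumptions, since the notes only state that luma_filter_mode is a two-bit value.

```python
def pack_luma_filter_mode(use_new_half_pel, use_new_quarter_pel):
    """Pack the two filter choices into a two-bit value (assumed bit order)."""
    return (int(use_new_half_pel) << 1) | int(use_new_quarter_pel)


def unpack_luma_filter_mode(mode):
    """Return (use_new_half_pel, use_new_quarter_pel) for a two-bit value."""
    if not 0 <= mode <= 3:
        raise ValueError("luma_filter_mode must fit in two bits")
    return bool(mode >> 1), bool(mode & 1)
```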

JCTVC-G395 CE3: Cross check for eBrisk's proposal JCTVC-G057 (tool 6) [K. Kondo, T.
Suzuki (Sony)]

JCTVC-G058 CE3: Interpolation using different-length horizontal and vertical filters [F.
Kossentini, N. Mahdi, H. Guermazi, M. Horowitz (eBrisk Video Inc.)]
In this contribution, an interpolation filtering technique is proposed for the motion compensation
interpolation filtering of the luminance and chrominance samples of video sequences. This proposal
consists of using one set of fixed vertical filters (V_F) for the vertical stage of filtering and a second set of
fixed horizontal filters (H_F) for the horizontal stage of filtering. Compared to HM4.0, the proposed
technique yields average BD-rate changes of -0.7% for LDP/LC, -0.2% for LDP/HE, -0.1% for LD/LC,
0.0% for LD/HE, 0.2% for RA/LC and 0.0% for RA/HE, while decreasing the decoding complexity: the
required number of multiplications and additions is reduced by 5% and 6%,
respectively.

JCTVC-G063 CE3: Adaptive interpolation using different-length horizontal and vertical
filters [F. Kossentini, N. Mahdi, H. Guermazi, M. Horowitz (eBrisk Video
Inc.)]
In this contribution, an Adaptive Interpolation Filtering (AIF) technique is proposed for interpolation
filtering of the luminance samples of video sequences. This proposal consists of using one set of vertical
filters (V_F1/2 and V_F1/4) for the vertical stage of filtering and a second set of horizontal filters
(H_F1/2 and H_F1/4) for the horizontal stage of filtering. Moreover, each of the filters used may be a
default fixed filter or a newly generated filter. The encoder first generates four new one-dimensional
filters: one ½-pel (6-tap) filter for the vertical filtering stage, one ½-pel (8-tap) filter for the horizontal
filtering stage, one (¼-pel, ¾-pel) (7-tap) filter for the vertical filtering stage and another (¼-pel, ¾-pel)
(7-tap) filter for the horizontal filtering stage. Then, the encoder decides, for each reference picture and for
each filtering direction (horizontal or vertical), whether to use the corresponding default ½-pel filter or the
new ½-pel filter, and whether to use the corresponding default ¼-pel filter or the new ¼-pel filter. The
encoder then sends the resulting two-bit value of the luma_filter_mode field to the decoder.
Compared to HM4.0, the second proposed technique yields average BD-rate changes of -2.7% for
LDP/LC, -0.4% for LDP/HE, -0.5% for LD/LC, 0.3% for LD/HE, -0.4% for RA/LC and -0.1% for
RA/HE, while decreasing the decoding complexity: the required number of
multiplications and additions is reduced by 5% and 6%, respectively.

JCTVC-G821 CE3: Cross-check for eBrisk proposals on interpolation MC by Samsung.
[E. Alshina, J.H. Park] [late]

JCTVC-G258 CE3: LCU-based adaptive interpolation filter [S. Matsuo, S. Takamura, H.
Jozawa (NTT)]
This document reports the performance of an LCU-based adaptive interpolation filter that adjusts filter
coefficients based on the characteristic of the input video signal. In the proposed method, a frame is
segmented into multiple regions, and filter coefficients for each region are derived. The basic idea is that
filter coefficients are designed for each region in a frame when the frame consists of multiple regions with
different characteristics. The proposal was implemented in HM4.0 software to evaluate its performance.
Compared to the HM4.0 anchor, the overall average coding gain was about 0.52%. The detailed coding
gains for RA-HE, LB-HE, LP-HE, RA-LC, LB-LC and LP-LC were 0.05%, 0.02%, 0.1%, 0.4%, 0.5%
and 2.1%, respectively. The maximum coding gain was about 10.0% for the sequence “Vidyo3” in the
LP-LC case. The computational complexity at the encoder and decoder was 100.93% and 101.69% of the
anchor on average, respectively.

JCTVC-G379 CE3: Cross-check report of LCU-based adaptive interpolation filter
(JCTVC-G258) [T. Yoshino, S. Naito (KDDI)]

JCTVC-G277 CE3: Progressive Motion Vector Resolution [J. An, X. Li, X. Guo, S. Lei
(MediaTek)]
In JCTVC-F125, a progressive MV resolution (PMVR) method was proposed, which uses higher MV
resolution near the MV predictor (MVP) and lower MV resolution far from the MVP. Thresholds for 1/4- and
1/8-pixel resolution were used to indicate the range of the corresponding MV resolutions. This contribution
presents the results of PMVR method on top of HM4.0. For PMVR without 1/8-pixel resolution, it is
reported that by using different thresholds, average BD-Rate reduction of 0.2% can be achieved with
around 9% encoding time decrease for RA and LB cases, and average BD-Rate reduction of 0.1% can be
achieved with around 4% encoding time decrease for LP case. For PMVR with 1/8-pixel resolution, it is
reported that by using different thresholds, average BD-Rate reduction of 0.5% with around 8% encoding
time decrease can be achieved for RA and LB cases, and average BD-Rate reduction of 2.6% with 5%
encoding time increase can be achieved for LP case.
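
The general idea of a threshold-controlled MV resolution can be sketched as a mapping between a coded index and the motion vector difference relative to the MVP. The mapping below (quarter-pel steps within the threshold, half-pel steps beyond it) and both function names are assumptions for illustration; the exact PMVR mapping of JCTVC-F125 is not given in these notes.

```python
def decode_mvd_component(code, threshold):
    """Map a signed coded index to an MVD component in quarter-pel units.

    Indices with |code| <= threshold use quarter-pel steps; larger indices
    use half-pel steps (two quarter-pel units per index).
    """
    mag = abs(code)
    if mag > threshold:
        mag = threshold + 2 * (mag - threshold)
    return mag if code >= 0 else -mag


def encode_mvd_component(mvd, threshold):
    """Inverse mapping; beyond the threshold the MVD is rounded to the
    coarser grid before being coded."""
    mag = abs(mvd)
    if mag > threshold:
        mag = threshold + (mag - threshold + 1) // 2
    return mag if mvd >= 0 else -mag
```

With such a mapping, distant motion vectors are coded with smaller index magnitudes, at the price of coarser resolution far from the predictor.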

JCTVC-G178 CE3: Cross-check report for Tool 10 on Progressive MV Resolution [S.
Park, B. Jeon (LGE)]

JCTVC-G189 CE3: Cross-verification result of Tool 10 [K.Kazui (Fujitsu)]

JCTVC-G534 CE3: Cross-check report of Tool 10 (Progressive MV resolution) by
Panasonic [T. Sugio, T. Nishi (Panasonic)]

JCTVC-G390 CE3: MC boundary filter (tool 7) [K. Kondo, T. Suzuki (Sony)]
This contribution reports results of the MC boundary filter (MBF), which is studied in core experiment (CE)
3. To reduce complexity in terms of both memory bandwidth (b/w) and computation, the MBF uses different
filter coefficients at MC block boundaries. With the proposed method, the memory b/w can be reduced by 44%
and 17% for the worst case and the actual case, respectively, under common test conditions. The number of
multiplications can be reduced by 28% and 9% for the worst and actual cases. The impact on coding
efficiency is 0.3%, 0.2%, 0.3%, 0.1%, 0.3%, and 0.2% for RA_HE, RA_LC, LD_HE, LD_LC, LDP_HE and
LDP_LC. The combination of MBF with a restriction of PU 8x4 and 4x8 bi-prediction is additionally tested.
The worst-case memory b/w can be reduced by 58%. The impact on coding efficiency is 0.4%, 0.4%, 0.6%,
0.4%, 0.1%, and 0.1% for RA_HE, RA_LC, LD_HE, LD_LC, LDP_HE and LDP_LC.

JCTVC-G428 CE3: Cross-check report for Sony's tool 7 (JCTVC-G390) [T.Chujoh
(Toshiba)]

JCTVC-G391 CE3: Tap length reduction for small block (tool 8) [K. Kondo, T. Suzuki
(Sony), K. Ugur (Nokia)]

JCTVC-G250 CE3: Cross-check of JCTVC-G391 on MC memory band-width reduction
(tool8) from Sony [E. François (Canon)] [late]

JCTVC-G427 CE3: Non-uniform tap length filtering (tool 2) [T.Chujoh, T.Yamakage
(Toshiba)]
Experimental results for non-uniform tap length filters are reported. This is one of the proposals of CE3 on
motion compensation. The worst cases of interpolation complexity are the two-dimensional
quarter-pixel positions, and shorter tap length filters are introduced for those cases. In the experimental
results, the coding efficiency gain is 0.12% on average. The complexity analyses of the 8/7-tap non-uniform
tap length filters show that the worst-case computational complexity and the average computational
complexity and memory bandwidth are reduced relative to those of the 8-tap DCT-IF.

JCTVC-G393 CE3: Cross check for Toshiba's proposal JCTVC-G427 (tool 2) [K. Kondo,
T. Suzuki (Sony)]

JCTVC-G535 CE3: Experimental Result on Tool 9 (Picture Adaptive 1/8-pel Motion
Compensation Method) [T. Sugio, T. Nishi (Panasonic)]
This contribution is a report of CE3 experiment Tool 9 in JCTVC-F903. Experimental results reportedly
showed 2.3% BR saving for HE and 2.0% BR saving for LC on average in the LD P scenarios. It also
reportedly showed 0.2% BR saving for HE and 0.3% BR saving for LC in the RA scenarios, and 0.1%
BR saving for LC on average in the LD B scenarios relative to the HM4.0.

JCTVC-G137 CE3: Cross-check of Panasonic’s Picture Adaptive 1/8-pel Motion
Compensation: Test2b (F472) [S. Park (ETRI)]

JCTVC-G672 CE3: Crosscheck for Panasonic's Picture Adaptive 1/8 MC in JCTVC-G535
[J. An, X. Guo (MediaTek)]

JCTVC-G696 CE3: Fixed interpolation filter tests by Motorola Mobility [J. Lou, K. Minoo,
D. Baylon, L. Wang, A. Luthra (Motorola Mobility)]
This document reports the results of Motorola Mobility’s interpolation filters for HEVC. The simulations
were conducted using HM4.0 software with Motorola Mobility’s modifications. Four sets of fixed
interpolation filters are tested. Compared with the current interpolation filter in HM4.0, the proposed 6-
tap half-pel filter with 7-tap quarter-pel filter scheme with 13/64 offset achieves 0.0%, 0.3%, 0.1%, 0.2%,
-0.3% and -0.4% bitrate differences in RAHE, RALC, LBHE, LBLC, LPHE and LPLC settings; the
proposed 6-tap half-pel filter with 7-tap quarter-pel filter scheme with 3/16 offset achieves 0.4%, 0.9%,
0.3%, 0.5%, -0.4% and -1.2% bitrate differences in RAHE, RALC, LBHE, LBLC, LPHE and LPLC
settings; the proposed 6-tap half-pel filter with 7-tap quarter-pel filter scheme with 15/64 offset achieves
-0.1%, 0.0%, 0.1%, 0.2%, 0.3% and -0.3% bitrate differences in RAHE, RALC, LBHE, LBLC, LPHE and
LPLC settings; the proposed 8-tap half-pel filter with 8-tap quarter-pel filter scheme with 3/16 offset
achieves -0.1%, -0.3%, 0.1%, -0.4%, -0.8% and -0.3% bitrate differences in RAHE, RALC, LBHE,
LBLC, LPHE and LPLC settings. Cross-check will be provided by Samsung. The attached spreadsheet
contains detailed data of the results.

JCTVC-G697 CE3: Joint sub-pixel interpolation filter tests for bi-predicted motion
compensation by Motorola Mobility [J. Lou, K. Minoo (Motorola Mobility)]

This document reports the results of Motorola Mobility’s Joint Sub-Pixel Interpolation Filters (JSPIF) for
bi-predicted motion compensation for HEVC. The simulations were conducted using HM4.0 software
with Motorola Mobility’s modifications. Three sets of 6-tap fixed interpolation filters for bi-predicted
motion compensation are used. For uni-prediction, 8H+7Q fixed filters with 3/16 offset are used.
Compared with the current interpolation filter in HM4.0, set0 without bug-fix achieves -0.3%, -0.3%,
-0.8% and -0.8% bitrate differences in RAHE, RALC, LBHE and LBLC settings; set1 without bug-fix
achieves -0.2%, -0.2%, -0.7% and -0.7% bitrate differences in RAHE, RALC, LBHE and LBLC settings;
set2 without bug-fix achieves -0.4%, -0.4%, -0.8% and -0.7% bitrate differences in RAHE, RALC, LBHE
and LBLC settings; set2 with bug-fix achieves -0.4%, -0.3%, -0.8% and -0.7% bitrate differences in
RAHE, RALC, LBHE and LBLC settings. All three sets achieve -0.8% and -1.3% bitrate differences
in LPHE and LPLC settings.
Was presented.
Some concern about additional complexity, particularly for hardware. Depending on implementation, the
cost could be roughly 2x chip size for MC.
No support except by proponents.

JCTVC-G820 CE3: Cross-check for Motorola Mobility proposals on interpolation MC by
Samsung. [E. Alshina] [late]
In total 11 different filters. Estimate that this is 4x more complex. Might not be a problem for software.
Gives most gain for BQSquare.

JCTVC-G727 CE3: Adaptive resolution on motion vector difference [W.-J. Chien, X.
Wang, M. Karczewicz]
This contribution presents a coding method to signal the resolution of the motion vector difference. The
accuracy of the motion vector difference can be adaptively selected between two different accuracies
and signaled via a motion resolution flag. Joint coding of the motion resolution flag and the motion vector
difference is also proposed. Two adaptations, half-pel/quarter pixel and quarter pixel/one-eighth pixel,
are tested. Experimental results reportedly show 0.2% and 0.3% BD-rate savings for the adaptation of
half-pel/quarter pixel and quarter pixel/one-eighth pixel, respectively. A 2.5% BD-rate saving is observed
for the low delay P configuration with the adaptation of quarter pixel/one-eighth pixel.

JCTVC-G671 CE3: Crosscheck for Qualcomm's Adaptive MVD Resolution in JCTVC-G727
[J. An, X. Guo (MediaTek)]

JCTVC-G342 CE3: Cross-verification of Qualcomm’s adaptive motion vector resolution
(JCTVC-G727) by Intel [Y. Chiu, W. Zhang, L. Xu, Y. Han (Intel)]

JCTVC-G775 CE3: 7Q6H taps interpolation filters test by Samsung [E. Alshina, A. Alshin,
J.H. Park (Samsung)]
This is a CE response from Samsung. The following combination was tested: a 7-tap quarter-pel and a 6-tap
half-pel interpolation filter. On average across the 6 test cases RA-HE/LC, LD-HE/LC and LD(P)-HE/LC,
this combination provides a 0.07% (Y), -0.11% (U), -0.19% (V) BD-rate change (drop in luma and gain for
chroma). The computational complexity of MC for this filter approaches that of the AVC interpolation filter,
being 13% and 16% lower in terms of the number of multiplications and additions compared to HM4.0.

JCTVC-G394 CE3: Cross check for Samsung's proposal (tool 1) [K. Kondo, T. Suzuki
(Sony)]

JCTVC-G695 CE3: Cross-check report for Samsung’s interpolation filter 6H+7Q
(JCTVC-G775) [J. Lou, K. Minoo, L. Wang (Motorola Mobility)]

JCTVC-G778 CE3: 7 taps interpolation filters for quarter pel position MC from Samsung
and Motorola Mobility [E. Alshina, A. Alshin, J.-H. Park, (Samsung), J. Lou,
K. Minoo, (Motorola Mobility)]
Two variants of 7-tap interpolation filters for the quarter-pel position are tested here. Both resolve the visual
artifacts problem in the LD(P)-LC, SAO-off test. Both show performance improvement compared to HM4.0.
The average BD-rate for the Y/U/V components across the 6 test cases required in CE3 is -0.1%/-0.1%/-0.2%
for variant A and -0.4%/0.0%/0.0% for variant B. The two variants of the proposed 7-tap filters use different
phase shifts (1/4 for variant A and 3/16 for variant B), which results in the same worst-case computational
complexity and memory access, but a different hit-ratio for fractional positions in the MC process, so
different statistical computational complexity was observed. For variant A the number of computations
according to the CE3 measure is 5-6% smaller compared to HM4.0. Variant B shows 1-2% higher
computational complexity compared to HM4.0.
Was presented. Gain for LD P only: 1/4 HE 0.1 LC 0.5; 3/16 HE 0.8 LC 1.3.
It needs to be clarified whether there is a visual advantage for ALF/SAO off and LD P, and no disadvantage
for other cases. It was suggested to organize a viewing session and discuss the subject further
after that.
[A report about the test was included in JCTVC-G033.]
A test was performed at QP37 (otherwise, with smaller QP, the artifacts would probably not have been
visible).
Decision: Adopt Filter A

JCTVC-G059 CE3: Cross-Verification of Samsung/Motorola’s interpolation filter variant
B (JCTVC-G778) [F. Kossentini, N. Mahdi (eBrisk Video Inc.)]

JCTVC-G636 CE3: Cross-Verification of Samsung/Motorola’s interpolation filter variant
A [K. Ugur, O. Bici (Nokia)] [late]

JCTVC-G780 CE3: The worst case memory band-width reduction by 2D->1D
interpolation replacement (from Samsung). [E. Alshina, A. Alshin, J. Chen,
J.H. Park (Samsung)]
High memory access during motion compensation is the main drawback of both bi-prediction and long-tap
interpolation filters. In this contribution, the motion compensation process in bi-prediction is modified to
reduce memory access: a motion vector which requires high memory access is replaced by the nearest one
with an acceptable level of memory access.
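
One way to picture the 2D->1D replacement is to snap one component of a quarter-pel motion vector to the integer grid whenever both components are fractional. The selection rule below (round the component that is closest to an integer position) and all names are assumptions for illustration; the notes only state that the vector is replaced by the nearest one with acceptable memory access.

```python
def needs_2d_interpolation(mv):
    """True if both components of a quarter-pel MV are fractional."""
    return mv[0] % 4 != 0 and mv[1] % 4 != 0


def dist_to_integer(c):
    """Distance (in quarter-pel units) from c to the nearest integer position."""
    frac = c % 4
    return min(frac, 4 - frac)


def round_to_integer(c):
    """Nearest integer-pel position, expressed in quarter-pel units."""
    return ((c + 2) // 4) * 4


def replace_2d_by_1d(mv):
    """Return an MV that needs at most 1D interpolation (assumed rule)."""
    x, y = mv
    if not needs_2d_interpolation(mv):
        return mv
    if dist_to_integer(x) <= dist_to_integer(y):
        return (round_to_integer(x), y)
    return (x, round_to_integer(y))
```

After the replacement, only one filtering direction is needed, so the reference window grows in one dimension only.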

4.3.3 Discussion and conclusions


Discussion: Would it be desirable to define additional testing conditions where mode decision is made
without result of motion compensation (as this may likely not be used in practice)? There is agreement
that we should stick with the current anchor settings, as it is beyond our purpose to consider any possible
optimization or restriction in the implementation of the standard.

4.4 CE4: Quantization

4.4.1 Summary

JCTVC-G034 CE4: Summary report of Core Experiment on Quantization [K. Sato, M.
Budagavi, M. Coban, H. Aoki, X. Li (CE coordinators)]
This document reports activity related to Core Experiment on Quantization (CE4).
This CE contained 3 subtests as follows:
• Subtest 1: QP Coding
• Subtest 2: De-quantization process modification
• Subtest 3: Quantization matrices
All mandatory results had been verified by cross-checkers.
Review of subtest 1:
G721 proposes QP control below the CU level (at TU level for TU larger than a particular size) –
emphasis on visual quality (BD rate impact was not an improvement in this CE test, relative to the
anchor). No action – for further study.
G773 proposes finer quant step size granularity (seems similar to proposal G850 in subtest 2 below).

The next idea tested concerns prediction of the QP value: HM uses the left neighbour as predictor if it is
available; otherwise it uses the last QP in decoding order.
Anchor is "Step 3 of TM 5" (region characteristic-based QP control for visual quality – no buffer fullness
feedback).
Proposals include taking into account things like modes, motion vectors, and intra prediction directions.
It was suggested to focus on 1.3.c (G067) and 1.4 as the best possibilities (0.4% benefit), but higher
complexity than current method. But it was pointed out that 1.4 has temporal QP prediction, so it would
have loss resilience issues.
It was remarked that this is emphasizing QP control for visual characteristics, while QP control for buffer
fullness purposes is a different kind of usage – in that usage, predicting from the last QP in decoding
order might be better.
It was suggested to make the QP prediction modal, having one mode that is more friendly to buffer
control (changing QP and having that "stick" permanently).
Straw man: two modes (SPS level selection):
• Always using the preceding CU QP in coding order (+0.2%, measured previously).
• The other mode is "1.3.c" (−0.4%) or HM 4 (0.0%), which one TBD.
After discussion, the group wanted to simplify this to just use the preceding QP in scan order. There was
then a submission of G1028, and after further discussion, it was agreed to keep the WD and software
stable for this meeting and to test two schemes in CE work:
• Always use the preceding CU QP in scan order
• HM 4 with reset to preceding CU QP in scan order at LCU boundary
• G1028
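
The QP predictor variants discussed above can be summarized in a short sketch. The mode labels, the use of None for an unavailable left neighbour and the function names are hypothetical, and scan order is treated as identical to decoding order here; the G1028 scheme is not covered because its details are not given in these notes.

```python
def predict_qp(left_qp, prev_qp, mode="hm4", at_lcu_start=False):
    """Return the QP predictor for the current CU.

    left_qp: QP of the left neighbouring CU, or None if it is unavailable.
    prev_qp: QP of the preceding CU in decoding (scan) order.
    mode:    "previous"      always use the preceding CU QP,
             "hm4"           left neighbour, falling back to the preceding CU,
             "hm4_lcu_reset" as "hm4", but use the preceding CU QP for the
                             first CU of an LCU.
    """
    if mode == "previous":
        return prev_qp
    if mode == "hm4_lcu_reset" and at_lcu_start:
        return prev_qp
    if mode in ("hm4", "hm4_lcu_reset"):
        return left_qp if left_qp is not None else prev_qp
    raise ValueError(f"unknown mode: {mode}")


def delta_qp(qp, left_qp, prev_qp, mode="hm4", at_lcu_start=False):
    """Difference that would be signalled for the current CU."""
    return qp - predict_qp(left_qp, prev_qp, mode, at_lcu_start)
```

The "previous" mode makes a QP change persist for following CUs, which is the behaviour described as friendlier to buffer control.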
Review of subtest 2: Three proposals in this subtest:
• AQO (JCTVC-G278), one offset per colour component per frequency (offset is scaled by
quantization weighting matrix entry and by step size as a function of QP)
• ARL (JCTVC-G382)
• Fine-QP (JCTVC-G843): encoder can customize the scaling difference for a QP increment.
These proposals, along with a non-CE proposal, a modified Fine-QP scheme (JCTVC-G850), were tested
in four ways.
Test results were not provided for some of the cases that were more difficult to test.
Comments in cross-check document JCTVC-G403:
• Most of the gain was from non-normative modifications.
• Did not seem to work with low QP. It was remarked that G278 did not exhibit this phenomenon.
In the other two cases, the proponents had worked to address this issue with an encoder-only
modification for G382, and with the modified fine QP proposal G850.
• Effects on luma and chroma somewhat different. Chroma gain is larger than luma gain.
It was suggested to emphasize consideration of JCTVC-G382 and JCTVC-G850.
Gain from normative aspect was estimated at 0.4% for luma and 4% for chroma.
Complexity impact (with quant matrix)? (Any dynamic range impact?)

Page: 36 Date Saved: 2011-12-04


A BoG (chaired by M. Budagavi) was asked to study the complexity impact, and the subject was
discussed again. Revisit to consider which proposal and complexity impact and revisit (BoG M.
Budagavi). Also look atsee notes for G773.
The BoG report commented that an additional Intra/Inter parameter is needed for G382. G382 also uses
sgn() and abs() in the dequant equation. No consensus was reached in the BoG on the complexity impact
of these. This was further discussed, and the effect was not considered to be significant.
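
As an illustration of why sgn() and abs() appear in such a dequantization equation, the following is a minimal sketch of a reconstruction rule with a magnitude offset. It is not the G382 equation: the function names, the floating-point step size and the offset value are hypothetical and chosen only to show the sign-symmetric structure.

```python
def sgn(x: int) -> int:
    """Return -1, 0 or +1 according to the sign of x."""
    return (x > 0) - (x < 0)


def dequant_with_offset(level: int, step: float, offset: float) -> float:
    """Reconstruct a coefficient from its quantized level (hypothetical form).

    The offset is added to the magnitude, so positive and negative levels are
    shifted symmetrically; level 0 always reconstructs to 0 because sgn(0) == 0.
    """
    return sgn(level) * (abs(level) * step + offset)
```

With a step of 10 and an offset of -2, levels 3 and -3 reconstruct to 28 and -28, so the offset pulls both signs toward zero by the same amount.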

Comment: QP scaling is being done at a finer level. Will this be required if quantization matrices are
used? Flat quantization matrices can achieve QP scaling at a finer level.
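
The remark about flat quantization matrices can be made concrete with a small calculation. The sketch assumes AVC-style conventions, in which the quantization step size doubles for every increase of 6 in QP and a flat matrix entry of 16 corresponds to unit weighting; the function name and its defaults are illustrative only.

```python
import math


def equivalent_qp_offset(flat_entry: int, neutral: int = 16, period: int = 6) -> float:
    """QP change giving the same step-size scaling as a flat matrix entry.

    Assumes step size is proportional to 2 ** (QP / period) and that the
    matrix entry scales the step size by flat_entry / neutral.
    """
    return period * math.log2(flat_entry / neutral)
```

Under these assumptions a flat entry of 17 corresponds to roughly half a QP step, and a flat entry of 32 to exactly 6 QP steps.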
Comment: A modified version of JCTVC-G382 was cross-checked in JCTVC-G1045 (with RDOQ on, the BD-rate
matched; the code was studied). A modified version of JCTVC-G850 was cross-checked in JCTVC-G1040
(a BD-rate match was observed; the code was studied).
Comment: The JCTVC-G403 cross-checker commented that the updated version of their document asserts that
all of the gain reported in JCTVC-G382 could be achieved by non-normative modifications only.
Comments: First test: The second bit allocation in cross-check document JCTVC-G403 was asserted not to be
constrained within 2%. CE submissions were reported to be within 2%. Second test: The encoder-only test
reports more gain with a two-pass algorithm.
Comment: Concerns were expressed with regard to complexity at the slice level. In worst-case bitstreams,
there could be many slices.
Suggestions: Study RDOQ off case. What is the complexity impact? On decoder side and on encoder
side. RDO-On case:
For further study in CE.

Comments: Interesting to study period of 8 and period of 16 for quantizer.
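
For reference, the quantizer period determines how much the step size changes per QP increment. The sketch assumes that "period" means the number of QP increments over which the step size doubles (6 in the AVC-style design); that reading and the function name are assumptions.

```python
def step_ratio_per_qp(period: int) -> float:
    """Multiplicative step-size change for a QP increment of 1.

    Assumes the step size doubles every `period` QP increments.
    """
    return 2.0 ** (1.0 / period)
```

Under this reading, periods of 6, 8 and 16 give step-size increases of about 12.2%, 9.1% and 4.4% per QP increment, respectively.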

JCTVC-G773:
Comments: Asserted that dQP rate could increase when QP starts changing.
Comments: Introduces functionality that allows for QP to change at a finer scale.
Comments: Asserted G773 could be implemented using G850.
Comments: G773 was tested with perceptual quantization and not bit-rate control.
Comments: Provides new functionality but functionality is not proven.
Comments: In some implementations, this could lead to doubling of quant matrix tables.
For further study in CE.

Subtest 3: Quantization matrices


Three methods evaluated:
 Same as AVC
 JCTVC-G083
 JCTVC-G434

It was recommended to adopt the (slightly simplified) AVC method (SPS and PPS) as described in G434 as a
starting point. No text was available; T. Suzuki was asked to provide it, and a revised version of G1016
(v4) then provided this. For non-square cases, use the even numbered column entries of the next larger
square matrix.
Decision: Agreed.
Should we do something about big block sizes? For further study.

4.4.2 Contributions

Subtest 1 (delta QP)

JCTVC-G721 CE4 Subtest 1.1.a: QP adaptation at sub-CU level [Xue Fang, Jae Hoon
Kim, Krit Panusopone, Limin Wang (Motorola Mobility)]

JCTVC-G071 CE4 Subset 1.1.a: Cross-verification of delta QP coding at sub_CU level
from Motorola (JCTVC-F577) [R. Sjoberg, J. Sun (Ericsson)]

JCTVC-G113 CE4 Subtest 1.1.a: Cross-verification of Motorola’s QP adaptation at
sub_CU level (JCTVC-F577) [K. Chono, H. Aoki (NEC)]

JCTVC-G462 CE4 subset 1.2.a: results of Max/Min QP signaling [K. Sugimoto, A.
Minezawa, S. Sekiguchi (Mitsubishi)]

JCTVC-G054 CE4 Subtest1.2.a: Study and cross-verification of Mitsubishi's signaling of
Max and Min QP in slice (JCTVC-F174) [K. Chono, H. Aoki (NEC)]

JCTVC-G070 CE4 Subset 1.2.b: Ericsson's table-based delta QP coding method [R.
Sjoberg, J. Sun (Ericsson)]

JCTVC-G055 CE4 Subtest 1.2.b: Cross-verification of Ericsson's table-based delta QP
coding method (JCTVC-G70) [K. Chono, H. Aoki (NEC)]

JCTVC-G773 CE4 Subtest 1.2.c: Higher granularity of quantization parameter scaling [T.
Lee, J. Chen, J. H. Park (Samsung), K. Chono (NEC)]

JCTVC-G508 CE4: cross-check result of subtest 1.2.c [K. Sato (Sony)]

JCTVC-G363 CE4: Improvement of delta-QP coding (1.3.a) [J. Xu, K. Kondo, K. Sato, A.
Tabatabai (Sony)]

JCTVC-G755 CE4: Crosscheck of Sony's improvement of delta-QP coding [X. Zheng
(HiSilicon)] [late]

JCTVC-G066 CE4 Subtest 1: Spatial QP prediction based on intra prediction (test 1.3.b)
[H. Aoki, K. Chono (NEC), M. Kobayashi, M. Shima (Canon)]

JCTVC-G460 CE4 subset 1.3.b: crosscheck of QP prediction based on intra prediction [K.
Sugimoto, A. Minezawa, S. Sekiguchi (Mitsubishi)]

JCTVC-G067 CE4 Subtest1: test 1.3.c [H. Aoki, K. Chono (NEC), M. Kobayashi, M.
Shima (Canon), K. Sato (Sony)]

JCTVC-G313 CE4 Subtest 1.3.c: cross-verification of JCTVC-G067 [Y. Yasugi (Sharp)]

JCTVC-G731 CE4 Subtest 1.3.c: cross-verification of JCTVC-G067 [M. Coban
(Qualcomm)] [late]

JCTVC-G728 CE4 Subtest 1: Spatial QP prediction (test 1.3.e): combination of test 1.3.b
and test 1.3.d [M. Coban, M. Karczewicz (Qualcomm)]

JCTVC-G726 CE4 Subtest 1.3.d: Cross-check report for Qualcomm's JCTVC-G728 by
Motorola Mobility [K. Panusopone (Motorola Mobility)]

JCTVC-G073 CE4: Cross-check of Subtest 1.3.e - Spatial QP prediction [C. Yeo (I2R)]

JCTVC-G068 CE4 Subtest1: QP prediction based on intra/inter prediction (test 1.4) [H.
Aoki, K. Chono (NEC), M. Coban, M. Karczewicz (Qualcomm)]

JCTVC-G461 CE4 subset 1.4: crosscheck of QP prediction based on intra/inter prediction
with QP buffer compression [K. Sugimoto, A. Minezawa, S. Sekiguchi
(Mitsubishi)]

Subtest 2 (de-quantization offset)

JCTVC-G278 CE4 Subtest 2: Adaptive-Dequantization Offset [X. Li, X. Guo, S. Lei
(MediaTek)]

JCTVC-G901 CE4 Subtest 2.1.b: Cross-verification of JCTVC-G278 [M. Yang (Huawei)]
[late]

JCTVC-G074 CE4: Cross-check of Subtest 2 on de-quantization offset (2.1.a, 2.3.d) [C. Yeo
(I2R)]

JCTVC-G140 CE4: Subtest 2.1.c Cross Check Report for Mediatek's AQO CaseC by
HKUST [F. Zou, O.C. Au (HKUST)]

JCTVC-G660 Cross-check of MediaTek's CE4 Subtest2.1 Adaptive Quantization Offset
[X. Yu, D. He (RIM)]

JCTVC-G382 CE4 Subtest-2 Adaptive Reconstruction Levels [X. Yu, J. Wang, D. He, G.
M.Cocher, E. Yang (RIM)]

JCTVC-G141 CE4: Subtest 2.2.d Cross Check Report for RIM's ARL Case D by HKUST
[F. Zou, O.C. Au (HKUST)]

JCTVC-G403 CE4-Subtest-2 Cross-check of Adaptive Reconstruction Levels for
Quantization Design (JCTVC-G382) [B. Li (USTC), J. Xu, G. J. Sullivan
(Microsoft)]

JCTVC-G902 CE4 Subtest 2.1.b: Cross-verification of JCTVC-G382 [M. Yang (Huawei)]
[late]

JCTVC-G557 CE4.2.A: Cross-check of RIM's ARL [X. Li, X. Guo (MediaTek)] [late]

JCTVC-G823 CE4: Cross-check for ARL from RIM by Samsung [E. Alshina, J.H. Park]
[late]

JCTVC-G843 CE4: Fine granularity QP offset [X. Wang, R. Joshi, G. Auwera, M.
Karczewicz (Qualcomm)]

JCTVC-G560 CE4.3.A: Cross-check of Qualcomm's fine-QP [X. Li, X. Guo (MediaTek)]
[late]

JCTVC-G662 Cross-check of QualCom's CE4 Subtest2.3 Fine QP [X. Yu, J. Wang, D. He
(RIM)]

JCTVC-G400 CE4: Equal expected magnitude (EEM) encoder-only adaptive quantization
rule for HEVC [B. Li (USTC), J. Xu, G. J. Sullivan (Microsoft)]
This contribution presents experiment results for a quantization method referred to as equal expected
magnitude (EEM) quantization. The basic idea is from JVT-N011 (January 2005). When RDOQ is not
used by the encoder, the experimental results reportedly show that the EEM quantization scheme can
provide about a 6% bit rate savings for the low QP range of 2~17. As RDOQ substantially slows down
the encoding process, it is suggested that EEM offers a useful alternative to RDOQ.
There is some difference in the described scheme relative to the original EEM proposal of JVT-N011.
The overall gain reported is about 0.55% (AI: 1.4%, RA: 0.7%, LB: 0.0%, LP: 0.1%) relative to RDOQ
turned off for the common conditions. For the low QP range, the gain is substantially larger (AI: 2.2%,
RA: 6.7%, LB: 6.8%, LP: 5.6%).
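
The overall figure can be checked arithmetically. The sketch below assumes that the overall gain is the unweighted mean of the four per-configuration figures; that assumption, and the variable and function names, are not taken from the contribution.

```python
from statistics import mean

# Per-configuration bit rate savings in percent, as quoted in the notes above.
GAINS_COMMON = {"AI": 1.4, "RA": 0.7, "LB": 0.0, "LP": 0.1}
GAINS_LOW_QP = {"AI": 2.2, "RA": 6.7, "LB": 6.8, "LP": 5.6}


def average_gain(gains: dict) -> float:
    """Unweighted mean over configurations (an assumed averaging rule)."""
    return mean(gains.values())
```

Under that assumption the common-condition figures average to 0.55%, which agrees with the overall gain quoted, and the low-QP figures average to about 5.3%.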
RDOQ performs somewhat better (low QP results: AI: 3.7%, RA: 8.6%, LB: 9.3%, LP: 9.6%), but slows
down the encoder substantially (17-38% slower).
It was remarked that the non-normative part of G382, as discussed in G403 (except for the low QP range),
performs better than this and also does not significantly slow down the encoder, so maybe that should be
put into the reference software instead of this.
Additionally, it was remarked that the two techniques can be used together to provide even more gain
(perhaps 0.5% for the common conditions).
It was agreed to adopt the non-normative part of G382 into the reference software.
Question: Should it be in the common conditions (together with RDOQ enabled)? Yes. Agreed.

Subtest 3 (quant matrices)

JCTVC-G083 CE4: Test results on compact representation of quantization matrices [M.
Zhou, V. Sze (TI)]

JCTVC-G306 CE-4 Subset-3: Cross-check for TI’s Proposal (G083) on Quantization
Matrix [Z. Zhou, X. Zhang (MediaTek)]

JCTVC-G503 CE4 Subtest 3.2: Crosscheck report of compact representation of
quantization matrices proposed by TI (JCTVC-G083) [M. Shima (Canon)]

JCTVC-G528 CE4: Cross-Check report for CE4 subset3 of TI compact representation of
quantization matrices (JCTVC-G083) [J. Zheng (HiSilicon)]

JCTVC-G434 CE4 subtest 3: Quantization matrix for HEVC based on JCTVC-F362 and
F475 [Y. Morigami, J. Tanaka, T. Suzuki (Sony)]

JCTVC-G303 CE-4 Subset-3: Cross-check for Sony’s Proposal (G434) on Quantization
Matrix [Z. Zhou, X. Zhang (MediaTek)]

JCTVC-G502 CE4 Subtest 3.1: Crosscheck report of AVC based quantization matrix
support (JCTVC-G434) [M. Shima (Canon)]

JCTVC-G505 CE4 Subtest 3.3: Crosscheck report of Enhancement of quantization matrix
coding of HEVC proposed by Sony (JCTVC-G434) [M. Shima (Canon)]

JCTVC-G527 CE4: Cross-Check report for CE4 subset3 of Sony proposal on Quantization
matrix for HEVC (JCTVC-G434) [J. Zheng (HiSilicon)]

4.4.3 Discussion and Conclusions

4.5 CE5: CAVLC entropy coding improvement

4.5.1 Summary

JCTVC-G035 CE5: Summary report on CAVLC entropy coding improvements [X. Wang,
P. Wu, C. Kim (CE Coordinators)]

4.5.2 Contributions

JCTVC-G310 CE5: CAVLC Adaptation using Difference Counter [T. Yamamoto (Sharp)]

JCTVC-G402 CE5: Cross-check of CAVLC Adaptation using Difference Counter
(JCTVC-G310) [B. Li (USTC), J. Xu (Microsoft)]

JCTVC-G360 CE5: Redundancy removal for Run-mode in CAVLC (JCTVC-F286) [J. Xu,
A. Tabatabai (Sony)]

JCTVC-G389 CE5: CAVLC coding table modification [S. Kim, J. Lee, S. Lee (Yonsei
Univ.), C. Kim, Y. Park, J. Park (Samsung)]

JCTVC-G851 CE5: Crosscheck of Yongsei and Samsung's contribution (JCTVC-G389) on
CAVLC coding table modification [X. Wang (Qualcomm)] [late]

JCTVC-G532 CE5 2.1 : Improvement of CAVLC run- coding by prediction mode [C. Kim,
Y. Park, K.P.Choi (Samsung)]

JCTVC-G367 CE5: cross-check for Samsung’s CAVLC (JCTVC-G532) [J. Xu, M. Haque
(Sony)]

JCTVC-G563 CE5 2.2 : Handling for exception cases longer than 32bit code-word in
CAVLC [C. Kim, Y.Park, K.P. Choi(Samsung), M. Karczewicz, X. Wang,
W.-J. Chien, L. Guo(Qualcomm)]

JCTVC-G841 CE5.2.2: Crosscheck for Samsung and Qualcomm's handling for exception
cases longer than 32bit code-word in CAVLC in JCTVC-G563 [T.-D.
Chuang, Y.-W. Huang (MediaTek)] [late]

JCTVC-G674 CE5: Sub-block coding of transform coefficients with CAVLC [M.
Karczewicz, L. Guo, Y. Zheng, X. Wang (Qualcomm)]

JCTVC-G340 CE5: Cross-check of CAVLC coefficients coding (JCTVC-G674) [T.
Yamamoto (Sharp)]

JCTVC-G677 CE5: Limitation on VLC codeword length [M. Karczewicz, X. Wang, W.J.
Chien, L. Guo (Qualcomm)]

JCTVC-G924 CE5: Cross-verification of Qualcomm’s proposal (JCTVC-G677) [C. Kim
(Samsung)] [late]

4.5.3 Discussion and Conclusions

4.6 CE6: Intra prediction improvements

4.6.1 Summary

JCTVC-G036 CE6: Summary Report on Intra Coding Improvements [A. Tabatabai, E.
Francois, K. Chono, H. Yu, R. Joshi, J. Lainema]
This document provides a summary report of the Core Experiments (CE6) results on “Intra prediction and
improvements”. The experiments are divided into 4 subsets, as indicated below:
• Subset CE6a: Intra Chroma Prediction
o CE6a.1 Modified down-sample filters T0L0I0 vs T0L2I2: some gain but longer filters
o CE6a.2 Alpha and beta calculation complexity reduction
o CE6a.3 Reduction of Storage for Reconstructed Luma Pixels
• Subset CE6b: Intra Mode Coding
o CE6b.1 MPM derivation
o CE6b.2 Remaining mode coding
o CE6b.3 Binarization of remaining mode coding
o CE6b.4 Some different chroma stuff
o CE6b.5 Combination of the above ideas G243 (AI HE 0.5% both in luma and chroma,
most of which is from the first two techniques)
It was suggested for a BoG [V. Sze] to dig into the details and report back on this and
related non-CE proposals (G153, G145, G184, G119, G359, G358, G423)
o CE6b.6 DCIM: Has some gain but substantial encoder complexity (16% runtime
increase) – no action.
• Subset CE6c: Short Distance Intra Prediction (SDIP)
o CE6c.1 HM 4 + SDIP (28% runtime increase)
o CE6c.2 Fast non-square PU selection (encoder-only relative to CE6.2)
o CE6c.3 NSQT related
o CE6c.4 through CE6c.11 Several other SDIP related modifications
• Subset CE6d: Intra Prediction with secondary boundary

CE6a.1 Modified down-sample filters: Best (T0L0I0 vs T0L2I2) has some gain (0.1% for luma and 0.3%
for chroma) but longer filters. Seems like not enough gain.
CE6a.2 Alpha and beta calculation complexity reduction: Almost no loss was observed for 2:1
subsampling 16x16. It was commented that this does not help since it makes a special case out of a case
that is not the worst case – whereas the worst case is 4x4.
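
To make the alpha and beta calculation concrete, the following is a minimal sketch of a least-squares fit of the linear model chroma ≈ alpha · luma + beta over neighbouring samples, with an optional 2:1 subsampling of those neighbours. It uses floating-point arithmetic for clarity, whereas the HM uses integer arithmetic; the function name and the `step` parameter are hypothetical.

```python
def lm_parameters(luma_nb, chroma_nb, step=1):
    """Fit chroma ~ alpha * luma + beta over neighbouring sample pairs.

    step=1 uses every neighbour; step=2 models 2:1 subsampling, which halves
    the number of multiply-accumulate operations in the sums below.
    """
    luma = luma_nb[::step]
    chroma = chroma_nb[::step]
    n = len(luma)
    sum_l = sum(luma)
    sum_c = sum(chroma)
    sum_ll = sum(l * l for l in luma)
    sum_lc = sum(l * c for l, c in zip(luma, chroma))
    denom = n * sum_ll - sum_l * sum_l
    if denom == 0:
        # Flat luma neighbourhood: fall back to predicting the mean chroma.
        return 0.0, sum_c / n
    alpha = (n * sum_lc - sum_l * sum_c) / denom
    beta = (sum_c - alpha * sum_l) / n
    return alpha, beta
```

When the neighbours follow an exact linear relation, the full and the subsampled fits return the same parameters; differences only arise from how well the retained samples represent the neighbourhood.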
It was noted that there is a relevant non-CE contribution by Canon (JCTVC-G244).
CE6a.3 Reduction of Storage for Reconstructed Luma Pixels – noting that the focus is now on the 8 bit
case, which already is using 8 bit storage in this case – leaving it alone sounds reasonable.
Note also that the focus is now on the CABAC case – CAVLC is no longer particularly interesting.
CE6c. SDIP
Case 1: Performance of SDIP on HM AI HE: −1.35% (127% enc RT); −2.14% (with class F)
Case 2: Each rectangular sub-PU uses a different prediction mode: −0.96% (113% enc RT); −1.70%
(with class F). Case 2 is optimized mode selection vs. case 1.
Case 3: All rectangular sub-PUs use the same prediction mode for the entire CU: −0.96% (114% enc RT);
−1.39% (with class F)
Q: How are NSQT concerns on 8x2/2x8 throughput and coefficient scanning addressed as they also
apply to SDIP? An expert commented that throughput for intra prediction may be more of an issue than for
inter prediction.
A: 16 sample block unit read for both 4x4 and 2x8/8x2.
Several experts noted that the gains are desirable, but that discussion with hardware experts on
complexity is required; multiple functions are affected.
SDIP vs NSQT – SDIP has additional TUs, prediction affected
Similarity in spirit of SDIP and NSQT. Suggested separate profile to contain SDIP, NSQT, ALF, & SAO
(?)
It was noted that, at the moment, NSQT does not have a syntax flag to disable its selection.
G556 was suggested as the primary document to review for study of SDIP.
Other SDIP-related proposals add about 0.4% (for luma AI) additional compression benefit (1.7% for
chroma). There is a survey of this in G558.
Non-CE SDIP-related contributions: G135, G354, G598.

CE6d: Intra Prediction with secondary boundary handling
Various particular methods proposed.
Overall gain roughly 0.7% AI for the best-performing several of these (in terms of coding efficiency).
Some loss in some class F cases.
Filtering of one column or one row is the common theme between several of these approaches.
We already have a filtering for the DC mode. It was commented that this filtering makes that mode more
difficult to vectorize.
Combinations 10 (combining 1, 7, and 8) and 13 were the primary group focus suggested during the
discussion of the CE report. Combination 13 is more complex and has very slightly (0.1%) better
compression. Only the proponent seemed to emphasize an interest in combination 13.
There are some related non-CE contributions with some possibilities for further improvement (e.g. for
class F).
After further consideration, there seemed to be no problem with combination 10 relative to combination
13.
It was then suggested by a proponent to consider combination 11 as a less complex alternative to
combination 10. It was then commented that combination 11 is not necessarily actually less complex.
In regard to the vectorization comment recorded above, it was commented that a transpose operation
would be needed, and that the filtering for the planar mode was more of a complexity concern. The
benefit of the filtering proposed to be added for the planar mode (test #8) was reported in the CE report as
0.1%. The filtering proposed for the planar mode is the same as the filtering applied for the DC mode.
The horizontal and vertical filtering gains alone (test #1), at 0.3%, provide roughly half of the total gain
that is reported to be available, based on F172. Decision: Adopt this aspect.
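
As an illustration of boundary filtering for the vertical mode, the sketch below adjusts the first column of a vertically predicted block by half the gradient along the left reference column, in the form familiar from the final HEVC design. Whether this matches the variant tested in the CE exactly is an assumption, as are the function and argument names.

```python
def vertical_pred_with_boundary(top, left, top_left, bit_depth=8):
    """Vertical intra prediction with a gradient adjustment on column 0.

    top      -- reference samples above the block (length N)
    left     -- reference samples to the left of the block (length N)
    top_left -- the corner reference sample
    Returns pred[y][x] as a list of rows.
    """
    n = len(top)
    max_val = (1 << bit_depth) - 1
    # Plain vertical prediction: every row copies the top reference row.
    pred = [list(top) for _ in range(n)]
    for y in range(n):
        # Add half of the left-column gradient to the first sample of each row.
        adjusted = top[0] + ((left[y] - top_left) >> 1)
        pred[y][0] = min(max(adjusted, 0), max_val)
    return pred
```

The horizontal mode would be handled symmetrically, adjusting the first row using the gradient along the top reference row. Only one column or row is touched, in line with the common theme noted above.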
Adding filtering for other various angles, JCTVC-F456 (test #7) has about 0.2% additional gain, and it
applied the same type of filter to all of those other angles. There was very small gain in JCTVC-F456 for
the "negative angles".
The π/4 (45°) diagonal case was also noted to be simpler to generate if treated as a special case in the
decoder. However, it was suggested to avoid having special cases for different modes, such that all
directions from 13° to 32° could be treated the same way by a decoder. If treated as a separate case, the
special case treatment may need to be repeated for each block size.
No further action taken.

SDIP plenary discussion was held on Monday 28th (chaired by J.O.), with notes recorded as follows:
• 1.3% gain with the current version
• SDIP should be harmonized / unified with NSQT, which is not achieved yet
• Problem: So far NSQT and SDIP were discussed in different places
• Set up AHG to harmonize, with intention to have the unified solution in the standard by the next
meeting.
• Basis for AHG is the “SDIP reference” G558+G754. This is not to be further studied in CE. New
SDIP proposals that build on top of this are to be investigated in CE6 (not in AHG)

4.6.2 Contributions

Experiment A

JCTVC-G129 CE6.a: Sub-sampling portion of neighboring pixels in calculation of LM
parameters [M. Budagavi (TI), K. Sato (Sony)]

JCTVC-G297 CE6: Cross-check report for CE6a TI&Sony's proposal on Sub-sampling
portion of neighboring pixels in calculation of LM parameters [H. Sasai, T.
Matsunobu, T. Nishi (Panasonic)]

JCTVC-G172 CE6a: Modified down-sampling filter for LM mode of intra chroma
prediction [Y. Chiu, Y. Han, L. Xu, W. Zhang, H. Jiang (Intel)]

JCTVC-G469 CE6a: Cross-check on LM mode of intra chroma prediction (JCTVC-G172)
[K. Sugimoto, A. Minezawa, S. Sekiguchi (Mitsubishi)] [late]

JCTVC-G455 CE6: results of filtering for LM_mode [A. Minezawa, K. Sugimoto, S.
Sekiguchi (Mitsubishi)]

JCTVC-G168 CE6: Cross-check report for Subtest CE6a on Intra Chroma Prediction [S.
Cho, S. Lee (ETRI)] [late]

JCTVC-G169 CE6: Cross-check report for Subtest CE6a on Intra Chroma Prediction [S.
Cho, S. Lee (ETRI)] [late]

JCTVC-G510 CE6.a: Result of Subtest 4.1.2.2 [K. Sato (Sony)]

JCTVC-G294 CE6: Cross-check report for CE6a Sony's proposal on Improvement to
chroma intra prediction from luma [H. Sasai, T. Matsunobu, T. Nishi
(Panasonic)]

JCTVC-G512 CE6: combination of subtest 4.1.2.1 & 4.1.2.2 [K. Sato (Sony)] [late]

JCTVC-G847 CE6: Cross-check results for combination of subtest 4.1.2.1 & 4.1.2.2
(JCTVC-G512) [S. Cho, S. Lee, N. Eum] [late]

Experiment B

JCTVC-G192 CE6b: Intra remaining mode coding with mode ranking [J. Park, B. Jeon
(LG)]

JCTVC-G203 CE6b: Intra prediction mode coding [T.-D. Chuang, C.-Y. Chen, M. Guo, X.
Guo, Y.-W. Huang, S. Lei (MediaTek), W.-J. Chien, X. Wang, M.
Karczewicz (Qualcomm)]

JCTVC-G186 CE6: Cross-verification of MediaTek and Qualcomm's proposal
(JCTVC-G203) by JVC KENWOOD [T. Kumakura, S. Fukushima (JVC Kenwood)]

JCTVC-G252 CE6b: Cross-check of JCTVC-G203 on intra mode coding from
Qualcomm/MediaTek [E. François (Canon)] [late]

JCTVC-G242 CE6b: Mode ranking for remaining mode coding with 2 or 3 MPMs [E.
François, S. Pautet, C. Gisquet (Canon)]

JCTVC-G243 CE6b: Intra mode coding with 4 MPMs and mode ranking [E. François, S.
Pautet (Canon), Joonyoung Park, Byeongmoon Jeon (LG), Tzu-Der Chuang,
Ching-Yeh Chen, Mei Guo, Xun Guo, Yu-Wen Huang, Shawmin Lei
(MediaTek), Wei-Jung Chien (Qualcomm), Ehsan Maani, Ali Tabatabai
(Sony)]

JCTVC-G311 CE6b: Cross-check of Intra Mode Coding (JCTVC-G242 and JCTVC-G243)
[T. Yamamoto (Sharp)]

JCTVC-G869 CE6: Combinations of MPM derivation and remaining mode coding [Ehsan
Maani, Ali Tabatabai] [late]

JCTVC-G080 CE6: Cross-check report for Subtest CE6b on Intra Mode Coding [H. L.
Tan, C. Yeo, Y. H. Tan (I2R)]

JCTVC-G167 CE6: Cross-check report for Subtest CE6b on Intra Mode Coding [S. Cho,
S. Lee, N. Eum (ETRI)] [late]

JCTVC-G276 CE6: Cross-Check report for CE6b subtests of Intra Mode Coding (F062,
F459, F091, F106, F269) [G. Li, N. Ling (Santa Clara Univ.), L. Liu, J.
Zheng, P. Zhang (Huawei)]

JCTVC-G868 CE6: Test results of DCIM [Ehsan Maani, Ali Tabatabai, Tomoyuki
Yamamoto] [late]

JCTVC-G789 CE6.b Crosscheck for DCIM [C. Lai] [late]

Experiment C

JCTVC-G142 CE6.c: LM mode harmonization on SDIP [J. Lim, B. Jeon (LG)]

JCTVC-G476 CE6.c: Crosscheck for LG's LM mode harmonization on SDIP in
JCTVC-G142 [T.-D. Chuang, Y.-W. Huang (MediaTek)]

JCTVC-G143 CE6.c: VLC improvement for intra partitioning on SDIP [J. Lim, B. Jeon
(LG)]

JCTVC-G478 CE6.c: Crosscheck for LG's VLC improvement for intra partitioning on
SDIP in JCTVC-G143 [T.-D. Chuang, Y.-W. Huang (MediaTek)]

JCTVC-G800 CE6.c Crosscheck report for LG's JCTVC-G142 and JCTVC-G143 [C. Lai]
[late]

JCTVC-G267 CE6.c Report on SDIP chroma extension scheme [J. Song, C. Lai, H. Yang,
H. Yu (Huawei)]

JCTVC-G836 CE6.c: Cross-verification of Huawei’s SDIP chroma extension scheme
(JCTVC-G267) [T. Lee, J. Chen, J. H. Park (Samsung)]

JCTVC-G322 CE6.c: Harmonization of HE residual coding with non-square block
transforms [J. Sole, R. Joshi, X. Wang, M. Karczewicz (Qualcomm)]

JCTVC-G797 CE6.c Crosscheck report for JCTVC-G322 [C. Lai (HiSilicon)] [late]

JCTVC-G890 CE 6: Cross-check of G322 on harmonization of non-square residual coding
[Marta Mrak, Andrea Gabriellini (BBC)] [late]

JCTVC-G556 CE6.c Report on Simplification of Short Distance Intra Prediction (SDIP)
Method [X. Cao, Y. He (Tsinghua), X. Peng (USTC), C. Lai, L. Liu, J. Zheng
(HiSilicon), J. Xu (Microsoft), H. Yang, H. Yu (Huawei)]

JCTVC-G558 CE6.c Report on Combination of SDIP and Its Improvements [X. Cao, Y. He
(Tsinghua), X. Peng (USTC), C. Lai, L. Liu, J. Zheng (HiSilicon), J. Xu
(Microsoft), H. Yang, J. Song, H. Yu (Huawei), J. Lim, B. Jeon (LGE), J.
Sole, R. Joshi, X. Wang, M. Karczewicz (Qualcomm), J. Xu, E. Maani, A.
Tabatabai (Sony)]
Contains multiple topics, some of it was in the CE and some is different.

JCTVC-G895 CE6: Cross-verification of Combination of SDIP and Its Improvements
(JCTVC-G558) [K. Chono, H. Aoki (NEC)] [late]

JCTVC-G260 CE6.c: cross-check of case 1, 2 and 3 (Huawei JCTVC-Gxxx) [J. Jung
(Orange Labs)] [late]

JCTVC-G369 CE6.c: cross-check for SDIP (JCTVC-F532) [J. Xu, A. Tabatabai (Sony)]

Experiment D

JCTVC-G279 CE6 Subtest d: direction-based angular intra prediction [M. Guo, X. Zhao,
X. Guo, S. Lei (MediaTek)] [late]

JCTVC-G280 CE6 Subtest d: Intra Prediction with Secondary Boundary [M. Guo, X. Guo,
X. Zhao, S. Lei (MediaTek), J. Lainema, K. Ugur (Nokia), K. Sugimoto, S.
Sekiguchi, A. Minezawa (Mitsubishi), J. Lee, S.-C. Lim, H.Y. Kim, J.S. Choi
(ETRI)]

JCTVC-G420 CE6.d: Results of experiment 4.4.3 [J. Lee, S.-C. Lim, H. Y. Kim (ETRI)]

JCTVC-G457 CE6: Results of intra vertical and horizontal prediction [A. Minezawa, K.
Sugimoto, S. Sekiguchi (Mitsubishi)]

JCTVC-G565 CE6.d: Nokia report on intra prediction with secondary boundary [J.
Lainema, K. Ugur (Nokia)]

JCTVC-G081 CE6: Cross-check report for Subtest CE6d on Intra prediction with
secondary boundary (Test 7) [H. L. Tan, Y. H. Tan, C. Yeo (I2R)]

JCTVC-G293 CE6.d: Cross Check of Category 4 on Tool Combinations for Intra
Prediction with Secondary Boundary [G. Van der Auwera (Qualcomm)]

JCTVC-G361 CE6.d: Crosscheck of JCTVC-F122 on gradient based intra prediction for
vertical and horizontal set secondary boundary intra prediction from
MediaTek [C. Auyeung (Sony)]

JCTVC-G436 CE6.d category 2: Cross-checking reports of intra prediction with secondary
boundary (JCTVC-F172 and JCTVC-F456) [A. Tanizawa, T. Shiodera
(Toshiba)]

JCTVC-G561 CE6d: Cross-check report on Intra prediction with secondary boundary
(Test 7) (JCTVC-F456) [T Guionnet, L Guillo]

JCTVC-G790 CE6.d crosscheck [C. Lai] [late]

4.6.3 Discussion and conclusions

4.7 CE7: Additional transforms

4.7.1 Summary

JCTVC-G037 CE7: Summary report of Core Experiment on additional transforms [R.
Cohen, C. Yeo, R. Joshi, F. Fernandes] (CE coordinators)
The purpose of Core Experiment 7 (CE7) is to characterize the performance, in terms of both
compression efficiency and complexity, of several transforms other than those currently defined in HM
4.0. Three tools that operate on Intra blocks have been evaluated: A mode-dependent secondary transform
applied for luma after the HM core transform (a unified Tool 1 and 2 from the CE description
JCTVC-F907), a secondary rotational transform applied for luma after the HM core transform (Tool 3), and a
mode-dependent DCT/DST for chroma (Tool 4), which replaces the HM core transform for chroma. This
document provides a summary of activities and results for this core experiment.
Tools 1 and 2 were already similar – now a "unified" G108 secondary transform is proposed
(mode-dependent selection of whether to apply a horizontal or vertical secondary transform, or both).
Tested with various sizes of secondary transform: performance in AI configuration saturates at 6x6
with 0.7% luma benefit; the 4x4 transform (applied to intra block sizes from 8 to 32) has 0.5% luma
benefit. In RA, 4x4 has 0.2% luma benefit. (Can be further extended to inter also, as reported in
G632, and can be combined additively with SDIP as reported in G629.)
It has a significant complexity impact.
It was remarked that the 8x8 case would be roughly as complex (or more complex) as the 32x32 case.
Gain is concentrated on higher-resolution sequences.
It was commented that latency is increased in some variations of this scheme; this is discussed in
G108, which indicated that there is no additional latency in some hardware implementations with the
4x4 secondary transform.
The possibility of applying this only for 16x16 and 32x32 block sizes was mentioned.
It was commented that for the 8x8 case, it would be faster to just apply one transform rather than a
cascade of two of them.
No action due to concern over complexity and gain specific to AI use.
Further study as CE, including combination with inter.
Tool 3 G304 secondary rotational transform (switched with syntax)
More encoder complexity for searching than tools 1 & 2, 0.9% luma benefit.
Tool 4 G107 mode-dependent 4x4 DCT/DST for chroma (like what we have for luma).
In spirit, this is the same thing that we do for luma. Various particular decision-making methods were
tested.
Gains generally in range of 0.5‒0.7% in chroma only, helps in both RA and AI.
Test 5 seems the most straightforward.
Due to the desire for stability and relative unimportance of chroma gains, no action taken.

4.7.2 Contributions

JCTVC-G108 CE 7: On secondary transforms for intra prediction residual [A. Saxena
(Samsung), Y. Shibahara (Panasonic), F. Fernandes (Samsung), T. Nishi
(Panasonic)]

JCTVC-G1018 Cross-check of modified JCTVC-G108 on secondary transforms for intra
prediction residual [V. Seregin, R. Joshi (Qualcomm)] [late]

JCTVC-G075 CE7: Cross-check of Unified Tools 1 & 2 - Mode-Dependent Secondary
Transform [C. Yeo, Y. H. Tan, Z. Li (I2R)]

JCTVC-G425 CE7: Cross-check of the unified design on mode-dependent secondary
transform (JCTVC-G108) by Huawei [H. Yang, H. Yu (Huawei)]

JCTVC-G581 CE7: Crosscheck of combination of tool 1 and tool 2 for mode dependent
secondary transform sizes 3x3 and 4x4 (JCTVC-G108) [R. Joshi
(Qualcomm)]

JCTVC-G814 CE7: Cross verification of JCTVC-G108, On secondary transforms for intra
prediction residual [R. Cohen (MERL)]

JCTVC-G888 CE 7: Crosscheck of G108 on secondary transforms [Marta Mrak, Andrea
Gabriellini (BBC)] [late]

JCTVC-G930 CE7: Cross check report of “On secondary transforms for intra prediction
residual (G108)” [A. Ichigaya, (NHK)] [late]

JCTVC-G304 CE 7: Experimental Results for the ROT [Z. Ma, F. Fernandes, E. Alshina,
A. Alshin (Samsung)]

JCTVC-G375 CE7: Cross Check Report for CE7 Tool 3, Rotational Only Transform [Y.
Shibahara, T. Nishi (Panasonic)]

JCTVC-G580 CE7: Crosscheck of rotational transform (JCTVC-G304) [R. Joshi
(Qualcomm)]

JCTVC-G734 CE 7: Crosscheck of G304 ROT Transform [P. Topiwala (FastVDO)]

JCTVC-G107 CE 7: Mode-Dependent DCT/DST for 4x4 Chroma Blocks [A. Saxena, F.
Fernandes, E. Alshina, J. Chen (Samsung)]

JCTVC-G076 CE7: Cross-check of Tool 4 - DST for Intra Chroma Residual Coding [C.
Yeo, Y. H. Tan, Z. Li (I2R)]

JCTVC-G376 Cross-check report of CE7 Tool 4, Mode-Dependent DCT/DST for 4x4
Chroma Blocks (G107) [Y. Shibahara, T. Nishi (Panasonic)]

4.7.3 Discussion and Conclusions

4.8 CE8: Non-deblocking loop filtering

4.8.1 Summary

JCTVC-G038 CE8: Summary report of Core Experiment on non-deblocking loop filtering
[T. Yamakage, I. S. Chong, M. Narroschke, Y.-W. Huang]
A summary of core experiment 8 (CE8) on non-deblocking loop filtering is reported. There are eight
Subtests in CE8, Subtest a: Block level filter adaptation with directional feature (three proposals), Subtest
b: Sub-frame delay adaptive loop filter (one proposal), Subtest c: Line Memory reduction in ALF and
SAO decoding (seven proposals), Subtest d: Filter shapes and coefficient constraints, for both luma and
chroma (three proposals), Subtest e: Prediction of filter coefficients (three proposals), Subtest f: Filter
switching reduction at LCU boundary for LCU friendly decoding (one proposal), Subtest g: Chroma filter
control (one proposal) and Subtest h: Other in-loop filters (one proposal). These are evaluated based on
the common test conditions in JCTVC-F900 and additional conditions in JCTVC-F908. All mandatory
results are verified by cross-checkers.
Note: G038 v4 includes a powerpoint deck with the results of the informal viewing.
Based on the cross-checkers’ comments and on further study, such as unification and harmonization work
in some subtests (i.e., CE8.a and CE8.c), it is suggested to regard the results of that unification and
harmonization as additional CE contributions. As for CE8.c, CE8 is planning to conduct an informal
subjective viewing (at least for CE8.c.5, Non-CE8.c.6 and Non-CE8.c.7).
Results on “subtest 0” (encoder only) show that 1-pass ALF can be operated with a 0.2% bit rate
increase, and 2-pass with a 0.1% bit rate increase, compared to the current 14-pass in HM (G146 also
refers to that). It was decided to use one of these (or a switchable scheme) for upcoming default
settings (this was reviewed in the BoG on ALF and decided after that). Adopt the method according to
JCTVC-G038 table 2 right column (two-pass method) for the common test conditions (also JCTVC-G1023).
Subtest A:
Adaptation is on level of 4x4 blocks. G609 (-0.1/-0.2/-0.2/-0.4 for AI/RA/LDB/LDP). Has advantage of
reduced line buffer from 5 to 4 due to reduced vertical filter size. The proposal uses 8 points instead of 4
for BA classification. In BA mode classification, it uses diagonal instead of hor/vert analysis for the star-
shaped filter (hor/vert for cross-shaped still). With proper implementation (re-usage of results) this
increases number of additions by 1.5. G316 (0.0/-0.1/-0.1/-0.1) proposes 2D merging instead of 1D
(imposes syntax change); does this increase latency? G691 is a combination of both which shows that the
gains are additive.
G656 is another combination of tools that was considered in this context.
G647 uses 8 classes in BA classification such that 8 filters are used instead of 16. This produces loss of
0.1%. Opinions: For high resolutions, larger number of classes may be needed. Restrictions could
eventually be considered in the context of level definition.

Decision: Adopt 4x4 classification, but not the other elements of G609 and G316 which would increase
complexity. Keep number of classes 16 in BA classification.
General note: In ALF, modifications should mainly target complexity reductions; only proposals which
give a decent gain/complexity benefit (regardless of direction) should be considered; i.e., a complexity
decrease should not penalize performance and a complexity increase should give sufficient gain.

Subtest B:
G498 Simplified ALF design. Only 5x5 diamond shape, no pixel classification, no DC offset. More
coarse quantization and coding.
Current implementation does not process chroma and does not include slice boundary processing.
Looking at the loss of 1%, several experts express the opinion that we should not replace the current ALF
by this. Note: G499 is an improved version with lower loss.

Subtest C:
G212, presented in this context, is a combination of G208/G206 (“option 1”) and something new on SAO
LB reduction (“option 2”). G211 was also considered here.
G208 uses “virtual boundary processing”, i.e. a specific boundary padding method with some irregularity,
which could also be implemented differently, e.g. by adjusting filter coefficients.
Investigate visual quality of G212 option 1 against HM; adopt if it does not produce artifacts.
A “score based method” with 20 viewers was used: experts gave 1 point each to anchor and proposal when
they were equal, and otherwise 2/0 or 0/2 if one was better.
(anchor/proposal) BQM 18/22, Cactus 22/18, Vidyo3 15/25, Vidyo4 20/20
Conclusion: No visual difference. Even 15/25 means that still 75% of the subjects thought both are equal.
Decision: Adopt G212 option 1
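As a rough check on how much the point totals can say, the scoring rule can be inverted: for a given anchor/proposal total, the sketch below enumerates how many of the 20 viewers could have rated the pair as equal. The function name and the enumeration are illustrative assumptions, not part of the viewing protocol.

```python
def equal_vote_bounds(anchor_points, proposal_points, viewers=20):
    """Return (min, max) number of viewers who may have rated the pair as equal.

    Scoring rule: an "equal" vote gives 1 point to each side; a preference
    gives 2 points to the preferred side and 0 to the other.
    """
    if anchor_points + proposal_points != 2 * viewers:
        raise ValueError("each viewer must contribute exactly 2 points")
    feasible = []
    for equal in range(viewers + 1):
        # anchor_points = equal + 2 * prefer_anchor
        twice_prefer_anchor = anchor_points - equal
        if twice_prefer_anchor < 0 or twice_prefer_anchor % 2:
            continue
        prefer_anchor = twice_prefer_anchor // 2
        prefer_proposal = viewers - equal - prefer_anchor
        if prefer_proposal < 0:
            continue
        if equal + 2 * prefer_proposal == proposal_points:
            feasible.append(equal)
    return min(feasible), max(feasible)


for name, a, p in [("BQM", 18, 22), ("Cactus", 22, 18),
                   ("Vidyo3", 15, 25), ("Vidyo4", 20, 20)]:
    lo, hi = equal_vote_bounds(a, p)
    print(f"{name}: between {lo} and {hi} of 20 viewers rated both as equal")
```

Under this rule, a 15/25 total is consistent with at most 15 of 20 viewers (75%) rating both as equal, which is the case where no viewer preferred the anchor.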
G207 (c.5) proposes a method to perform padding for SAO at slice boundaries (which would obviously
increase the complexity). This shall also be investigated in the subjective test to identify whether there is
a problem with SAO at slice boundaries (in case where across-boundary processing is disabled), but we
would not adopt it at this meeting as it appears inconsistent to have something for SAO but not for de-
blocking where the same problem occurs. There may also be other solutions such as post-processing of
boundaries. (Note G194 is also related). Result of subjective viewing: It looks better than anchor.
Investigate a combined solution for ALF and SAO for slice and tile boundaries in a CE. For de-blocking
there is currently no proposal on the table, but a solution which also includes de-blocking would be desirable.

Subtest D:
G208 (9x9 cross shape) performs best and has better performance than G648 (7x11). G208 will be tested
together with G206 such that reducing line buffers by decreasing vertical filter size is not relevant
anymore (same applies to G130 which reduces vertical filter size only for chroma and produces losses).

Subtest E:
(G665) Prediction of filter coefficients from other coefficients of the same filter (instead of from one filter
to the next). Gain of 0.1% observed in LD B and P cases. Note: G610 is a similar idea that provides more
gain.

Does not add or remove complexity (or stability). Several experts support it as the design of prediction
from the same filter seems to be more straightforward than from a different filter. The gain would increase
when APS is sent more frequently for error resilience. Revisited in context of G610. Decision: Adopt.

Subtest G:
Adds a third mode where within a slice the ALF applied to chroma is invoked whenever luma is filtered
(currently it is either entirely on or off). Only marginal gain (0.2% for chroma only), at the cost of one more
encoder decision.
No support by other companies. No action.
Note: Question was raised what would be the performance when chroma always follows luma, and some
experts expressed that they would like such a solution, however other experts expressed it might be
dangerous to do this, as there may be good reasons to switch it entirely on/off.

Subtest H (G235): NLM filter.


Benefit around 0.1-0.3% for different test cases; encoder/decoder runtime increase up to 12% (enc) / 5% (dec).
NLM operated alternatively with ALF. In the worst case, the runtime increase could be dramatically higher (if
it were used everywhere).
No support by other experts – no action.

4.8.2 Contributions

JCTVC-G1023 CE8 subset 0: Improved ALF N pass encoding [I. S. Chong, M. Karczewicz,
T. Yamakage, T. Watanabe, T. Chujoh, C.-Y. Chen, C.-M. Fu, C.-Y. Tsai,
Y.-W. Huang, S. Lei] [late]
This evaluates a modified ALF N pass encoding algorithm. This includes improvement and bugfix of
ALF N pass encoding. Coding efficiency gain for luma is 0.0 %, 0.0 %, 0.1% and 0.2 % in HE-AI, RA,
LB, and LP without encoding/decoding time increase on average.
(non-normative)

Subtest A

JCTVC-G316 CE8.a.1: 2-D mergeable syntax [T. Ikai (Sharp), I. S. Chong, M. Karczewicz
(Qualcomm), T. Yamakage, T. Watanabe, T. Chujoh (Toshiba), C.-Y. Chen,
C.-M. Fu, C.-Y. Tsai, Y.-W. Huang, S. Lei (MediaTek)]
This contribution reports evaluation results of 2-D mergeable syntax technique [JCTVC-F384]. This
technique aims to free the restriction on block classification in the HM-4.0 ALF. The coding gains of 0.0
%, 0.1 %, 0.1 % and 0.1 % in HE-AI, RA, LB, and LP were reported. The decoding time ratio was 100 %
to 101 % and the encoding time ratio was 99 % to 100 % in HE case. The proposal was cross-checked by
Samsung (JCTVC-G649). Harmonization with the other proposal “CE8.a.2: Directional feature calculation
on subset of pixels” (JCTVC-G609) is also tested in JCTVC-G649.

JCTVC-G649 CE8 Subtest a: Cross-check of Tool 1 (JCTVC-G316) 2-D mergeable syntax
[P. Lai, F. C. A. Fernandes (Samsung)]

JCTVC-G609 CE8.a.2: Directional feature calculation on subset of pixels [T. Yamakage, T.
Watanabe, T. Chujoh (Toshiba), C.-Y. Chen, C.-M. Fu, C.-Y. Tsai, Y.-W.
Huang, S. Lei (MediaTek), M. Karczewicz, I. S. Chong (Qualcomm), T. Ikai,
A. Segall, T. Yamamoto (Sharp)]
This evaluates a directional feature calculation on subset of pixels to improve coding efficiency. This
technique uses the internal and the surrounding pixels (window pixels) of the 4x4 block unit which have
different complexity and performance trade off. It also evaluates the changes in computational direction
to diagonal directions from horizontal/vertical directions. Coding efficiency gain for luma is 0.1 %, 0.2
%, 0.2% and 0.4 % in HE-AI, RA, LB, and LP without encoding/decoding time increase on average.
When merged with CE8.a.1 the coding efficiency is improved by 0.2 %, 0.3 %, 0.3% and 0.4 % in
HE-AI, RA, LB, and LP.

JCTVC-G464 CE8 subtest a tool 2: crosscheck of Directional feature calculation on subset
of pixels [K. Sugimoto, K. Miyazawa, A. Minezawa, S. Sekiguchi
(Mitsubishi)]

JCTVC-G650 CE8 Subtest a: Cross-check of Tool 2 (JCTVC-G609) Directional feature
calculation on subset of pixels [P. Lai, F. C. A. Fernandes (Samsung)]

JCTVC-G647 CE8 Subtest a, Tool 3: Block-based filter adaptation with up to 8 filters
(HV8) [P. Lai, F. C. A. Fernandes, I.-K. Kim (Samsung)]
This contribution presents filter adaptation design with unified and reduced number of initial filter classes
for both block-based and region-based filter adaptations (BA and RA). For BA, 2 directional classes are
produced by directly comparing directional features, each directional class has 4 magnitude levels,
leading to 8 initial filter classes; while HM4.0 has 3 directional classes with 5 magnitude levels (15 initial
classes). RA is modified to partition a frame into 8 regions corresponding to 8 initial filter classes. Such
design reduces the number of tests in the filter-class merging process at the encoder, and unifies the syntax of
merging flags under BA and RA.
The proposed method reports 0.1% BD-rate loss for AI, RA, LDB structures; and 0.2% BD-rate loss for
LDP structure. The enc / dec time have no measurable changes as compared to HM4.0.

JCTVC-G611 CE8.a.3: Cross check of Samsung's directional feature calculation on subset
of pixels [I. S. Chong, M. Karczewicz (Qualcomm)]

Subtest B

JCTVC-G498 CE8: ALF with low latency and reduced complexity [A. Fuldseth, G.
Bjøntegaard (Cisco)]
The document describes a low complexity ALF technique suitable for low latency applications. A
single set of ALF filter coefficients is computed, quantized and transmitted sequentially for each block
using a single pass technique. The proposed ALF also has low complexity by using only a 5x5 diamond
shape and no decoder-side variance calculations. When applied to low complexity configurations, BD-
rate gains between 1.4 % and 3.7% are reported. When applied to high efficiency configurations, BD-rate
losses between 0.5% and 1.1% are reported.

JCTVC-G371 CE8 Subset b: cross-check for Cisco’s ALF (JCTVC-G498) [J. Xu, A.
Tabatabai (Sony)]

JCTVC-G864 CE8 Subset b.1: Cross check of Cisco's ALF with low latency and reduced
complexity [I. S. Chong, M. Karczewicz]

Subtest C

JCTVC-G564 CE8 subtest c tool 1: Line memory reduction for ALF and SAO decoding [S.
Esenlik, M. Narroschke, T. Wedi (Panasonic)]
This contribution is a part of CE8 on in-loop filtering. Proposed is a method to reduce the line memory
which is required by consecutive filtering operations in the decoder. In the current HM 4.0, Deblocking
Filter (DF), Sample Adaptive Offset (SAO) and Adaptive Loop Filter (ALF) all pose difficulties related
to storage requirements in the block-based decoding procedure. Namely, for the purpose of filtering
across the boundaries of LCUs (Largest Coding Units), horizontal and vertical line memory need to be
employed, which increases the implementation complexity of decoder chips. This contribution focuses
on the reduction of the line memory for LCU-based decoding. The main focus is the reduction in the
so-called horizontal line memory, whose size is directly proportional to the width of the decoded picture.
With the help of the proposed technique the horizontal line memory that needs to be employed is reduced
from 9 lines to 5 lines for the luminance component and from 7 lines to 4 lines for the chrominance
components.

JCTVC-G051 CE8 Subset 3: Cross-check of Panasonic’s line memory reduction for in-loop
filtering (JCTVC-F272) [S. Park, S. Lee, N. Eum (ETRI)]

JCTVC-G479 CE8.c.1: Crosscheck for Panasonic's line memory reduction for ALF and
SAO decoding in JCTVC-G564 [C.-Y. Chen, Y.-W. Huang (MediaTek)]

JCTVC-G204 CE8.c.2: Single-source SAO and ALF virtual boundary processing [C.-M.
Fu, C.-Y. Chen, C.-Y. Tsai, Y.-W. Huang, S. Lei (MediaTek)]
This contribution reports results of CE8.c.2. In HM-4.0, SAO requires 0.2 luma pixel line buffers (PLB)
and 0.2 chroma PLB, and ALF requires 4.1 luma PLBs and four chroma PLBs for practical real-time
decoders. In JCTVC-F054 and JCTVC-F055, virtual boundary (VB) processing was proposed in order to
achieve zero line buffer and good visual quality for SAO and ALF. Due to the DF in HM-4.0, luma VBs
and chroma VBs are set as four and two pixels above horizontal LCU boundaries, respectively. For a to-
be-processed pixel on one side of a VB, any pixel on the other side of the VB is avoided by modifying
pixel classification for SAO and filter shapes for ALF. When compared with the JCTVC-F900 anchor, the
proposed method reportedly causes 0.1%, 0.2%, 0.3%, and 0.4% coding efficiency losses for HE-AI, HE-
RA, HE-LDB, and HE-LDP, respectively, and is claimed to have similar visual quality as the anchor. VB
artifacts can only be seen in few pictures.

JCTVC-G559 CE8 Subtest c: Cross-Check report for JCTVC-G204 [S. Esenlik, A. Kotra,
M. Narroschke(Panasonic)]

JCTVC-G614 CE8 Subset c.2: Cross-Verification of MediaTek’s Line memory reduction in
ALF and SAO [I. S. Chong, M. Karczewicz (Qualcomm)]

JCTVC-G205 CE8.c.3: Multi-source SAO and ALF virtual boundary processing [C.-Y.
Chen, C.-M. Fu, C.-Y. Tsai, Y.-W. Huang, S. Lei (MediaTek), S. Esenlik, M.
Narroschke, T. Wedi (Panasonic)]
This contribution reports results of CE8.c.3. In HM-4.0, SAO requires 0.2 luma pixel line buffers (PLBs)
and 0.2 chroma PLBs, and ALF requires 4.1 luma PLBs and four chroma PLBs for practical real-time
decoders. In JCTVC-F054 and JCTVC-F055, virtual boundary (VB) processing was proposed to remove
all these line buffers. In JCTVC-F272, partial use of pre-DF pixels as SAO and ALF inputs was proposed
to reduce line buffers. In order to achieve zero line buffer and good visual quality for SAO and ALF, the
two methods are combined as follows. Due to the DF in HM-4.0, luma VBs and chroma VBs are set as
four and two pixels above horizontal LCU boundaries, respectively. For to-be-processed pixels above the
VB, any required pixel below the VB is replaced by a pre-DF pixel. For to-be-processed pixels below the
VB, any pixel above the VB is avoided by modifying pixel classification for SAO and filter shapes for
ALF. When compared with the JCTVC-F900 anchor, the proposed method reportedly causes 0.1%, 0.1%,
0.1%, and 0.2% coding efficiency losses for HE-AI, HE-RA, HE-LDB, and HE-LDP, respectively, and is
claimed to have similar visual quality as the anchor. Very minor VB artifacts can only be seen in very few
pictures.
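The difference between the single-source (CE8.c.2) and multi-source (CE8.c.3) virtual boundary handling described above can be sketched per vertical filter tap. This is a minimal illustration, assuming a 64-line LCU, a luma VB four lines above each horizontal LCU boundary, and hypothetical names (`vb_region`, `tap_source`); picture boundaries and the actual SAO/ALF classification and shape modifications are not modelled.

```python
def vb_region(row, lcu_size=64, vb_offset=4):
    """Index of the region between two consecutive virtual boundaries.

    The VB sits vb_offset lines above each horizontal LCU boundary, so the
    last vb_offset lines of an LCU row belong to the next region.
    """
    return (row + vb_offset) // lcu_size


def tap_source(row, dy, multi_source=True, lcu_size=64, vb_offset=4):
    """Say where the tap at row + dy is taken from when filtering `row`."""
    cur = vb_region(row, lcu_size, vb_offset)
    ref = vb_region(row + dy, lcu_size, vb_offset)
    if ref == cur:
        return "deblocked"   # same side of the VB: normal filter input
    if ref > cur and multi_source:
        return "pre-DF"      # pixel above the VB needs a pixel below it
    return "avoid"           # classification / filter shape must be modified


# Row 58 lies above the VB at row 60; a tap two lines down crosses it.
print(tap_source(58, 2, multi_source=True))
print(tap_source(58, 2, multi_source=False))
# Row 60 lies below the VB; a tap one line up crosses it in both schemes.
print(tap_source(60, -1))
```

In this sketch the single-source scheme never reads across the VB, while the multi-source scheme substitutes a pre-deblocking sample only in the downward direction, which matches the asymmetry described in the text.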

JCTVC-G234 CE8.c.3: Cross-verification of MediaTek's and Panasonic's proposal on
multi-source SAO and ALF virtual boundary processing (JCTVC-G205) [M.
Matsumura, S. Takamura, H. Jozawa (NTT)]

JCTVC-G206 CE8.c.4: SAO and ALF virtual boundary processing with cross9x9 [C.-Y.
Chen, C.-M. Fu, C.-Y. Tsai, Y.-W. Huang, S. Lei (MediaTek), S. Esenlik, M.
Narroschke, T. Wedi (Panasonic)]
This contribution reports results of CE8.c.4-1 and CE8.c.4-2. CE8.c.4-1 is a combination of CE8.c.2
(single-source SAO and ALF virtual boundary (VB) processing) and CE8.d.1 (cross9x9 and
snowflake5x5); CE8.c.4-2 is a combination of CE8.c.3 (multi-source SAO and ALF VB processing) and
CE8.d.1. When compared with the JCTVC-F900 anchor, CE8.c.4-1 reportedly achieves
0.0%, -0.2%, -0.3%, and 0.1% BD-rates for HE-AI, HE-RA, HE-LDB, and HE-LDP, respectively, and
CE8.c.4-2 reportedly achieves 0.0%, -0.2%, -0.4%, -0.1% BD-rates for the four conditions respectively,
where negative numbers mean gains and positive numbers mean losses. The gain of using cross9x9 is
roughly unchanged when CE8.c.2 and CE8.c.3 are considered. It is also reported that both CE8.c.4-1 and
CE8.c.4-2 have subjective qualities close to HM-4.0, and CE8.c.4-2 is better than CE8.c.4-1.

JCTVC-G433 CE8.c.4: Cross-check of MediaTek/Panasonic's SAO and ALF virtual
boundary processing with cross9x9 JCTVC-G206 [T. Yamakage, T.
Watanabe (Toshiba)]

JCTVC-G207 CE8.c.5: Non-cross-slices SAO [C.-M. Fu, C.-Y. Tsai, C.-Y. Chen, Y.-W.
Huang, S. Lei (MediaTek), M. Budagavi (TI)]
This contribution reports results of CE8.c.5. In HM-4.0, non-cross-slices SAO skips each to-be-processed
pixel requiring any pixel from any other slice. However, the skipping technique may potentially cause
visual quality problems. In JCTVC-F093 and JCTVC-F232, any pixel from any other slice is avoided by
using a padding technique and by modifying pixel classification patterns, respectively. In this proposal, it
is found that the padding technique and modification of pixel classification patterns are conceptually the
same and can be combined into one single solution. Simulation results reportedly show that the proposed
method achieves no coding efficiency loss and better subjective quality in comparison with the anchor
when non-cross-slices SAO and multiple slices per picture are used.

JCTVC-G414 CE8.c.5: Cross-check of MediaTek/TI's non-cross-slices SAO JCTVC-G207
[T. Yamakage, T. Watanabe (Toshiba)]

Subtest D

JCTVC-G208 CE8.d.1: Snowflake5x5 and cross9x9 for luma and chroma ALF shapes [C.-
Y. Tsai, C.-Y. Chen, C.-M. Fu, Y.-W. Huang, S. Lei (MediaTek), I. S. Chong,
M. Karczewicz (Qualcomm), T. Yamakage, T. Watanabe, T. Chujoh
(Toshiba)]
This contribution reports results of CE8.d.1. In HM-4.0, snowflake5x5 and cross11x5 filter shapes have
nine and eight multiplications, respectively, and are used for both luma and chroma in ALF. In this
proposal, snowflake5x5 and cross9x9 are used, and they both have nine multiplications to better utilize
multipliers without increasing the number of multipliers in hardware. Simulation results reportedly show
0.1%, 0.4%, 0.6% and 0.3% coding efficiency gains for HE-AI, HE-RA, HE-LDB, and HE-LDP,
respectively, with roughly the same encoding time and 1%-2% decoding time increase. Apparently
cross9x9 needs more line buffers than cross11x5, so it is suggested to combine this proposal with ALF
line buffer removal techniques.
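The multiplication counts and the line buffer remark above can be reproduced with a small tap-counting sketch. It assumes point-symmetric filters, so that taps at (dx, dy) and (-dx, -dy) share one coefficient and one multiplication, and it assumes that snowflake5x5 consists of a centre tap plus eight arms of length two; neither assumption is spelled out in this section, and all function names are hypothetical.

```python
def cross_taps(width, height):
    """Tap offsets (dx, dy) of a cross with one horizontal and one vertical arm."""
    horizontal = {(dx, 0) for dx in range(-(width // 2), width // 2 + 1)}
    vertical = {(0, dy) for dy in range(-(height // 2), height // 2 + 1)}
    return horizontal | vertical


def snowflake5x5_taps():
    """Assumed geometry: centre plus eight arms of length 2 (17 taps)."""
    taps = {(0, 0)}
    for dx, dy in [(1, 0), (0, 1), (1, 1), (1, -1)]:
        for k in (1, 2):
            taps |= {(k * dx, k * dy), (-k * dx, -k * dy)}
    return taps


def unique_coefficients(taps):
    """Count coefficients when (dx, dy) and (-dx, -dy) share one value."""
    return len({frozenset({(dx, dy), (-dx, -dy)}) for dx, dy in taps})


def vertical_reach(taps):
    """Number of lines above (or below) the current line that are read."""
    return max(abs(dy) for _, dy in taps)


for name, taps in [("snowflake5x5", snowflake5x5_taps()),
                   ("cross11x5", cross_taps(11, 5)),
                   ("cross9x9", cross_taps(9, 9))]:
    print(name, unique_coefficients(taps), vertical_reach(taps))
```

With these assumptions, cross11x5 needs eight multiplications and cross9x9 nine, while the vertical reach grows from two lines to four, which is consistent with the suggestion to combine the proposal with line buffer removal.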

JCTVC-G133 CE8: Crosscheck of JCTVC-G208 - Snowflake5x5 and cross9x9 for luma
and chroma ALF shapes [M. Budagavi (TI)] [late]

JCTVC-G648 CE8 Subtest d, Tool 2: ALF filters with 9 coefficients and up to vertical-size
7 [P. Lai, F. C. A. Fernandes (Samsung), H. Guermazi, F. Kossentini,
M. Horowitz (eBrisk)]
This contribution presents an ALF method using two filter shapes, both having 9 coefficients: star-5x5 and
cross-11x7. Compared to the previously adopted proposal JCTVC-F303, one coefficient has been added
to construct cross-11x7, extending the vertical size to 7. The proposed method reports BD-rate gains of
0.1/0.2/0.3/0.2% for AI/RA/LDB/LDP structures as compared to HM4.0, with average enc / dec time
increases of 2% / 1% on a Linux cluster server.

JCTVC-G613 CE8 Subset d.2: Cross-Verification of Samsung’s Filter Shapes and
Coefficients constraints [I. S. Chong, M. Karczewicz (Qualcomm)]

JCTVC-G130 CE8 Subtest d - Chroma ALF with reduced vertical filter size [M. Budagavi,
V. Sze, M. Zhou (TI)]
This contribution presents results for Nx3 chroma ALF filters when integrated into HM 4.0. Nx3 chroma
ALF filters have a vertical size of 3 when compared to filters in HM 4.0 which have vertical size of 5.
The filter set of 7x3 diamond + 11x3 cross with 3x3 center is reported to have the following BD-Rate results
(Y, U, V): AI-HE: 0.0%, 0.4%, 0.5%; RA-HE: 0.0%, 0.8%, 0.5%; LB-HE: 0.0%, 0.7%, 0.5%; LP-HE:
0.0%, 0.7%, 0.4%. The worst case number of multiplications for this filter set is reported to be the same
as for the HM-4.0 filter set. Results for other Nx3 filter sets that have vertical size of 3 are also presented.
Filters with reduced vertical size are asserted to be useful from the point of view of reducing number of
line buffers required and/or external memory bandwidth. Nx3 filters are claimed to result in a 50%
reduction in line buffer/external memory bandwidth requirements compared to HM-4.0 chroma ALF
filters.

JCTVC-G652 CE8 Subtest d: Cross-check of Tool 3 (JCTVC-G130): Chroma ALF with
reduced vertical filter size, proposed by TI [P. Lai, F. C. A. Fernandes
(Samsung)]

Subtest E

JCTVC-G665 CE8.e.3: ALF coefficient prediction [K. Andersson (Ericsson)]
This contribution is a core experiment report on JCTVC-F076 which exploits that adaptive loop filter
coefficients typically are largest towards the middle of the filter. The BDR can be improved by 0.1%
compared to HM-4.1 for random access and low delay for the common conditions at similar encoding and
decoding time.

JCTVC-G928 CE8.e.3: Cross-check of Ericsson's ALF coefficient prediction JCTVC-G665
[T. Yamakage, T. Watanabe (Toshiba)] [late]

Subtest F

(original proposal withdrawn)

JCTVC-G060 CE8 Subtest F: Verification results of Panasonic's Proposal [F. Kossentini,
H. Guermazi (eBrisk Video Inc.)]

Subtest G

JCTVC-G056 CE8 Subtest G: Adaptive Loop Filtering of Chrominance Samples Using
Luma Map [F. Kossentini, H. Guermazi, N. Mahdi, M. Horowitz (eBrisk
Video Inc.)]
This contribution presents a technique for the Adaptive Loop Filtering (ALF) of chrominance samples
that employs the map used in ALF of the luminance samples. The proposed technique reduces the average
decoding time (~1%) while also achieving better U/V BD-Rate performance (0.21%-0.19%) than
HM4.0.

JCTVC-G480 CE8.g.1: Crosscheck for eBrisk's adaptive loop filtering of chrominance
samples using luma map in JCTVC-G056 [C.-Y. Tsai, Y.-W. Huang
(MediaTek)]

Subtest H

JCTVC-G235 CE8.h: CU-based ALF with non-local means filter [M. Matsumura, S.
Takamura, H. Jozawa (NTT)]
This contribution reports the performance of a technique that utilizes a denoising filter as the in-loop filter
of the HM codec. In the proposed method, a denoising filter called the non-local means filter is integrated into the
CU-based adaptive loop filter of HM4.0.
Compared to the anchor of HM4.0, the average BD-rate gains were –0.1, –0.2, –0.3, and –0.5% for Intra,
Random Access, Low Delay B, and Low Delay P, respectively. The average decoding time increased 2 to
5%. The maximum gain was –1.4% in Low Delay P for the sequence “BasketballDrive”.

JCTVC-G299 CE8 Subtest h: Cross verification of NTT’s CU-based ALF with NLM filter
(JCTVC-G235) by Intel [Y. Chiu, L. Xu (Intel)]

JCTVC-G482 CE8.h.1: Crosscheck for NTT's CU-based ALF with non-local means filter
in JCTVC-G235 [C.-Y. Tsai, Y.-W. Huang (MediaTek)]

4.8.3 Discussion and Conclusions

4.9 CE9: MV coding and skip/merge operation

4.9.1 Summary

JCTVC-G039 CE9: Summary Report of Core Experiment on MV Coding and Skip/Merge
Operations [B. Bross, J. Jung, W.-J. Chien, I.-K. Kim, M. Zhou (CE
coordinators)]
In this CE report test results for the five different categories Simplification, Modification of the AMVP
candidate selection, Parallel merge/skip, Additional merge mode and Merge/Skip operations for identical
motion are reported.
SP01..04 no need to discuss (according to proponents)
Tentative recommendations (as from report):
• Discuss in BoG what is simplified by AMVP_SEL01/02 compared to HM4.0 and what causes the
high losses in AMVP_SEL03/04 for some sequences. (During discussion in track A, SEL03/04 was
said to include a bug which should be further investigated. There was doubt that SEL01/02 is really a
simplification.)
• Recommend MRG_PAR pending a clarification. The clarification was not confirmed at the time
this report was uploaded. (See discussion on G085 below.)
• Recommended to further study the proposed additional merge mode technique in MRG_MVD
tests. MRG_MVD01 is the simplest version (encoder complexity increased by checking the
additional mode), whereas the other extensions add some more encoder complexity with some
additional gain. No support by other companies.

Discussion on G085 (MRG_PAR): Some concern was expressed about what the correct reference for
comparison would be. The comparison is made versus a modified HM anchor where parallel motion
estimation is enabled such that the RD opt decision cannot be correctly performed. Compared to the actual
HM anchor, the bitrate increase is roughly 0.7% for LCU 16x16 and 1.5% for LCU 32x32, even more for 64x64.
Note: G164, G387, G416 are related but are said to provide less penalty by the parallel approach. See
conclusion under JCTVC-G164.

4.9.2 Contributions

JCTVC-G052 CE9: Results of MRG_MVD series [S. Fukushima, M. Ueda, K. Arakage, S.
Sakazume (JVC Kenwood)]
CE9 was established at the 6th JCT-VC meeting to evaluate the performance of motion vector coding and
skip/merge operations. Merge base MVD transmission was proposed at that meeting. This contribution
reports the results of merge base MVD transmission under HM4.0 common test conditions.
The simulation results report that merge base MVD transmission provides 0.5% BD-rate gain for random
access settings and 0.8% gain for low delay B settings under HM4.0 common test conditions.

JCTVC-G179 CE9: Cross-check report of MRG_PAR03 and 04 on Parallel ME [S. Park,
B. Jeon (LGE)]

JCTVC-G190 CE9: Cross-verification result of MRG_MVD01 [K. Kazui (Fujitsu)]

JCTVC-G536 CE9: Cross-check report of MRG_MVD02 by Panasonic [T. Sugio, T.
Nishi (Panasonic)]

JCTVC-G688 CE09: Crosscheck report of JCTVC-G52 test MRG_MVD03 [Y. Zheng, X.
Wang (Qualcomm)]

JCTVC-G053 CE9: Results of AMVP_SEL01 and AMVP_SEL02 [H. Nakamura, S.
Fukushima, M. Nishitani, H. Takehara (JVC Kenwood)]
This contribution reports the test results of CE9 AMVP_SEL01 and AMVP_SEL02. The simulation
results report that AMVP_SEL01 and AMVP_SEL02 provide no coding loss for both random access
settings and low delay B settings.
Was presented in the plenary Sat morning.
The claim is that it simplifies the memory access and allows parallel processing.
Several experts expressed the opinion that there is no advantage compared to the current design of HM4.0,
where neither memory access nor parallel processing is an issue in this specific context.

JCTVC-G084 CE9: Test results on SP01, SP02, SP03 and SP04 [M. Zhou (TI)]
This document reports CE9 test results on SP01, SP02, SP03 and SP04 which are related to temporal
MVP. Test results reveal that disabling the centre TMVP position from both the merge/skip and AMVP
MVP list derivation process leads to a loss of 0.1% in all configurations (SP01); removing TMVP from
the AMVP list derivation process causes an average loss of 0.3% in RA-HE, 0.4% (0.3%) in RA-LC,
0.6% in LB-HE and 0.7% in LB-LC (SP03, SP04); and disabling TMVP from both the merge/skip and
AMVP MVP list derivation process leads to an average loss of 2.4% in RA-HE, 2.6% in RA-LC, 3.4% in
LB-HE and 4.0% in LB-LC (SP04).

JCTVC-G261 CE9: cross-check of experiment SP01 – Remove TMVP center position
(JCTVC-G084) [J. Jung (Orange Labs)]

JCTVC-G706 CE9: Cross-check report for TI's JCTVC-F083 by Motorola Mobility [Y.
Yu, K. Panusopone, L. Wang (Motorola Mobility)]

JCTVC-G689 CE09: Crosscheck report of JCTVC-G084 test SP04 [Y. Zheng, X. Wang
(Qualcomm)]

JCTVC-G085 CE9: Test results on parallelized merge/skip mode [M. Zhou (TI)]
This document reports CE9 test results on parallel merge/skip mode. The current HEVC merge/skip mode
design is highly sequential and introduces dependency among neighboring PUs, which can lead to
significant quality loss if motion estimation (ME) is performed in parallel for throughput or
implementation cost reasons. For typical parallel ME level of 32x32, the measured average loss is 5.0% in
RA-HE, 5.3% in RA-LC, 6.7% in LB-HE and 7.8% in LB-LC. The loss is caused by the fact that the
merge/skip mode cannot be tested for those PUs inside the 32x32 block whose neighboring motion data
are still unavailable during parallel processing. It is proposed to add a high-level syntax
element to signal the parallel level of merge/skip mode, divide an LCU into parallel motion estimation
regions (MERs) and allow only those neighboring PUs which belong to different MERs from the current
PU to be included in the merge/skip MVP list construction process. Simulation results reveal that an
average gain of 3.4% in RA-HE, 3.5% in RA-LC, 4.4% in LB-HE and 5.0% in LB-LC can be achieved
for 32x32 block level parallel ME when compared to the current HM4.0 design, and for parallel level
16x16 that is used today, the average gain is 1.9% in RA-HE, 1.8% in RA-LC, 2.6% in LB-HE and 2.5%
in LB-LC. The proposed design is backward compatible to the current design but offers flexibility for
high throughput and high quality encoder designs.
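The MER rule described above (only neighbours lying in a different MER from the current PU may enter the merge/skip candidate list) can be sketched as a coordinate comparison. This is a minimal illustration, assuming square MERs aligned to a power-of-two grid and luma sample coordinates; the names `same_mer`, `usable_merge_neighbours` and `log2_mer_size` are hypothetical, and ordinary availability checks (picture or slice boundaries, intra-coded neighbours) are left out.

```python
def same_mer(x_cur, y_cur, x_nb, y_nb, log2_mer_size):
    """True if both positions fall into the same motion estimation region."""
    return ((x_cur >> log2_mer_size) == (x_nb >> log2_mer_size) and
            (y_cur >> log2_mer_size) == (y_nb >> log2_mer_size))


def usable_merge_neighbours(cur, neighbours, log2_mer_size):
    """Keep only neighbour positions that lie outside the current PU's MER."""
    x_cur, y_cur = cur
    return [(x_nb, y_nb) for x_nb, y_nb in neighbours
            if not same_mer(x_cur, y_cur, x_nb, y_nb, log2_mer_size)]


# Current PU at (40, 40) with 32x32 MERs (log2 size 5).
neighbours = [(39, 47), (31, 40), (40, 31)]
print(usable_merge_neighbours((40, 40), neighbours, 5))
```

In this example the neighbour at (39, 47) lies in the same 32x32 region as the current PU and is dropped, so all PUs inside one MER can be processed without waiting for each other's motion data.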

JCTVC-G404 CE9: Cross-check of parallelized merge/skip mode (JCTVC-G085) [B. Li
(USTC), J. Xu (Microsoft)]

JCTVC-G439 CE9 subtest 2.6: Results of MRG_ID01 in Merge/Skip operations for
identical motion [A. Tanizawa, T. Shiodera, T. Chujoh, T. Yamakage
(Toshiba)]
This contribution reports results of MRG_ID01 described in JCTVC-F325 in CE9 subtest 2.6
(Merge/Skip operations for identical motion). MRG_ID01 modifies the derivation process of the temporal
merge/skip mode in B slices and has two variations. In variation A, the collocated
picture for the L1 motion vector is changed from RefPicList1[0] to RefPicList1[1] if the two reference
frames for the collocated MV candidates are identical. In variation B, the same derivation
process is performed if both the reference frames and the MVs of the collocated merge candidates
are identical.
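One possible reading of the two variations is sketched below as a selection function for the L1 collocated picture. The function name, the argument layout, and the fallback when RefPicList1 has a single entry are assumptions made for illustration; the actual derivation is specified in JCTVC-F325 and may differ in detail.

```python
def l1_collocated_picture(ref_pic_list1, col_ref_l0, col_ref_l1,
                          col_mv_l0=None, col_mv_l1=None, variation="A"):
    """Pick the collocated picture used for the L1 temporal merge candidate.

    col_ref_l0 / col_ref_l1: reference pictures of the two collocated MV
    candidates; col_mv_l0 / col_mv_l1: their motion vectors as (x, y).
    """
    same_ref = col_ref_l0 == col_ref_l1
    if variation == "A":
        switch = same_ref
    elif variation == "B":
        switch = same_ref and col_mv_l0 == col_mv_l1
    else:
        raise ValueError("variation must be 'A' or 'B'")
    # Assumed fallback: keep RefPicList1[0] if there is no second entry.
    if switch and len(ref_pic_list1) > 1:
        return ref_pic_list1[1]
    return ref_pic_list1[0]


# Identical reference pictures but different MVs:
# variation A switches, variation B does not.
print(l1_collocated_picture(["P8", "P4"], "P0", "P0", (1, 0), (2, 0), "A"))
print(l1_collocated_picture(["P8", "P4"], "P0", "P0", (1, 0), (2, 0), "B"))
```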

Experimental results show that the BD-bitrate of variation A is 0.0%, 0.0%, 0.3% and 0.3% for RA-HE,
RA-LC, LB-HE and LB-LC, respectively. A positive value means loss. Average decoding times are
100%, 100%, 99% and 99% for RA-HE, RA-LC, LB-HE and LB-LC, respectively. BD-bitrate of the
variation B is 0.0%, 0.0%, 0.1% and 0.1% for RA-HE, RA-LC, LB-HE and LB-LC, respectively.
Average decoding times are 99%, 100%, 99% and 100% for RA-HE, RA-LC, LB-HE and LB-LC,
respectively.

JCTVC-G422 CE9: Cross-check for MRG_ID01 (JCTVC-G439) [H. Y. Kim (ETRI), K. Y.
Kim, S. M. Kim, G. H. Park (KHU), S.-C. Lim, J. Lee (ETRI)]

JCTVC-G421 CE9: Results of Experiments MRG_ID02, MRG_ID03, and MRG_ID04 [H.
Y. Kim (ETRI), K. Y. Kim, S. M. Kim, G. H. Park (KHU), S.-C. Lim, J. Lee
(ETRI)]
This contribution reports the results of CE9 experiments MRG_ID02, MRG_ID03, and MRG_ID04.
Experimental results show an average loss of 0.1% to 0.2% for the Low Delay test conditions.

JCTVC-G745 CE9: Cross-check report on MRG_ID02 A and B (JCTVC-G421) [I.-K. Kim
(Samsung)]

JCTVC-G161 CE9: cross-verification of ETRI proposal on merge/skip operations for
identical motion (MRG_ID03 A, MRG_ID03 B) [Y. Jeon, B. Jeon (LG)]

JCTVC-G437 CE9 subtest 2.6: Cross-checking reports of MRG_ID04 (ETRI/KHU’s
proposal) [A. Tanizawa, T. Shiodera (Toshiba)]

JCTVC-G702 CE9: Simplification of MVP Design for HEVC [Y. Yu, K. Panusopone, L.
Wang (Motorola Mobility)]
This document reports the results of Motorola Mobility’s simplification of MVP design for HEVC. They
are the AMVP_SEL03 and AMVP_SEL04 tests specified in CE9. Simulation results show that there is
no loss for low delay and 0.2% loss for random access conditions compared to the original AMVP while the
complexity of the proposed method is reduced by half as compared to the MVP selection procedure of
AMVP.

JCTVC-G513 CE9: Cross-check result of Sel01&02 [K. Sato (Sony)]

JCTVC-G101 CE9 subset 2.3: cross-verification of MMI’s proposal on AMVP
simplification Sel03 & Sel04 [M. Zhou (TI)]

4.9.3 Discussion and Conclusions

4.10 CE10: Core transforms

4.10.1 Summary

JCTVC-G040 CE10: Summary Report of Core Experiment on Core Spatial Transforms
[P. Topiwala, M. Budagavi, R. Joshi, E. Alshina (CE coordinators)]
This CE investigates the design of core spatial transforms, to study the tradeoff between coding efficiency
and complexity and to return evidence and a recommendation for design selection. The anchors for
comparison are the transforms in the HM.
Current proposals:
• FastVDO/Samsung (G266): FF only, invertibility for 4, 8, 16 length (same as G737 for 32
length).
• Qualcomm (G579): FF only
• Cisco/TI (G495): MM or PB, some asserted property for sharing forward and inverse processing
• Samsung/FastVDO (G737): MM, PB or FF
All but G266 are the same for 4x4.
Power consumption was noted to be especially important for hardware.
Pruning properties can be important (see G628).
In PB operation, G495 uses 8 b coeffs, G737 uses 14 b coeffs.
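To make the implementation styles concrete, the sketch below computes the same 4-point forward transform once as a direct matrix multiplication and once as a partial butterfly. Reading "MM" and "PB" as matrix multiplication and partial butterfly is an assumption here, as are the 4-point size and the illustrative 8-bit coefficient values 64, 83 and 36; none of these are taken from the proposals, and scaling, rounding and clipping stages are omitted.

```python
# Illustrative 4-point integer transform matrix (assumed values).
C4 = [
    [64,  64,  64,  64],
    [83,  36, -36, -83],
    [64, -64, -64,  64],
    [36, -83,  83, -36],
]


def forward_mm(x):
    """Direct matrix multiplication: 16 multiplications for 4 outputs."""
    return [sum(c * v for c, v in zip(row, x)) for row in C4]


def forward_pb(x):
    """Partial butterfly: fold symmetric inputs first, then 6 multiplications."""
    e0, e1 = x[0] + x[3], x[1] + x[2]   # even part (sums)
    o0, o1 = x[0] - x[3], x[1] - x[2]   # odd part (differences)
    return [
        64 * (e0 + e1),
        83 * o0 + 36 * o1,
        64 * (e0 - e1),
        36 * o0 - 83 * o1,
    ]


sample = [1, 2, 3, 4]
print(forward_mm(sample))
print(forward_pb(sample))
```

Both functions return identical outputs; the butterfly form only exploits the symmetry of the rows to reduce the number of multiplications, which is one reason why coefficient bit width and structure matter for the complexity comparison.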
For reasons of desiring flexibility, we can remove G266 and G579 from consideration.
In the CE, software and hardware implementation complexity were studied.
Regarding reported analysis from Altera, it was noted that the 4 point comparison in Tables 1 and 2 of the
reported analysis does not seem to make objective sense, because identical transforms seem to not have
identical reported complexity. It was also remarked that an FPGA implementation would likely use MAC
operations that were assumed not to be used in the reported results. The person who performed the
analysis said this was because the number of available MACs on some devices is limited, so this analysis
was done in a way that avoided them.
A hardware analysis was reported from ST Micro (G887), based on an automatic RTL generator. A
proponent suggested emphasizing the lower-area implementations, and noted that the data actually showed
lower latencies with lower areas, so that it would not make sense to increase latency if that increases the
area.
Additional hardware information was provided from TI.
The submitted hardware implementation information seemed to favor G737. However, there was some
(non-proponent) questioning of the validity of the results – e.g. due to dependency on the particular
implementation of what was tested, and it was acknowledged that there is some dependency on this.
G628 (from Cisco/TI) describes some aspects that are asserted to favour G495 over G737:
• 8 b versus 14 b coefficients
• a symmetry property relative to the main diagonal that makes sharing of processing elements
feasible for the forward and inverse transform
• pruning advantage for software
• comments about a reduced number of unique coefficient values
• saying that G737 would sometimes exceed a 32-bit range (encoder and decoder)
It was remarked that there is an input document from MIT, discussing the implications of having a
reduced number of unique coefficients – estimating a 25% reduction in codec complexity being feasible
from that. This property was not used in some of the hardware implementation analysis.
A proponent of G737 questioned some of this reported information – e.g. regarding the number of unique
coefficient values. The proponent of G495 indicated that this could be addressed by clipping. However,
this would affect the ability to represent some valid input values for the 32x32 transform.
It seemed to be indicated that for hardware implementation, the FF approach is the only one that really
makes the most sense for G737 – the PB approach would not make sense. In the FF case, the number of
unique coefficient values was suggested to be similar.
The proponent of G737 said that with the FF approach, the sharing of processing elements should also be
feasible. However, the information available about this seemed limited.
Some SIMD information was presented by the proponent of G737. However, for the PB part, it was
remarked that some additional information is in G757 describing a real-time decoder that seemed to
indicate better performance for the PB approach for G495.
Revisit.
Alternative notes (to be merged):
At low QP (1) differences between the proposals are up to 1% on average. In the usual common settings,
no remarkable differences (0.1%).
G266 not invertible for 32x32.
None of the transforms for the case of matrix multiply has the transpose as exact inverse transform (i.e.
giving the identity matrix when multiplied by its transpose), but the deviations differ in size (which is
equivalent to being more or less orthogonal).
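As an illustration of the remark on transposes, the following minimal sketch builds an integer approximation of a 4-point DCT by rounding scaled cosines and measures how far the product of the matrix with its transpose is from a scaled identity. The function names, the scaling and the rounding rule are assumptions made for this example; they are not taken from any of the CE10 proposals.

```python
import math

def rounded_dct_matrix(n, bits):
    """Integer DCT-II approximation: round(2^bits * sqrt(n) * orthonormal DCT).

    This construction is an assumption for illustration only.
    """
    scale = (1 << bits) * math.sqrt(n)
    matrix = []
    for k in range(n):
        norm = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        row = [round(scale * norm * math.cos(math.pi * (2 * j + 1) * k / (2 * n)))
               for j in range(n)]
        matrix.append(row)
    return matrix

def gram(matrix):
    """Product of the matrix with its own transpose."""
    n = len(matrix)
    return [[sum(matrix[r][j] * matrix[c][j] for j in range(n)) for c in range(n)]
            for r in range(n)]

def orthogonality_deviation(matrix):
    """Largest absolute difference between M * M^T and (first row norm) * I."""
    g = gram(matrix)
    target = g[0][0]
    n = len(g)
    return max(abs(g[r][c] - (target if r == c else 0))
               for r in range(n) for c in range(n))

if __name__ == "__main__":
    m = rounded_dct_matrix(4, 6)
    print(m)
    print(orthogonality_deviation(m))
```

In this toy case the rows remain mutually orthogonal after rounding, but their norms are no longer equal, so the transpose is only an approximate inverse.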
Visual quality was not checked but is assumed to be irrelevant
Difference between full factorization and partial butterfly of inverse transform becomes low when many
zero coefficients are present.
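The difference between the matrix multiply and partial butterfly styles discussed in these notes can be sketched for a 4-point forward transform. The coefficient values 64, 83 and 36 are an assumed 8-bit integer approximation chosen for illustration, not values quoted from G495 or G737, and the function names are hypothetical.

```python
# Illustrative 4-point matrix; the coefficients are an assumption for this sketch.
M4 = [
    [64,  64,  64,  64],
    [83,  36, -36, -83],
    [64, -64, -64,  64],
    [36, -83,  83, -36],
]

def forward_matrix_multiply(x):
    """Direct form: every output is a full dot product (16 multiplications)."""
    return [sum(c * v for c, v in zip(row, x)) for row in M4]

def forward_partial_butterfly(x):
    """One butterfly stage splits the input into even and odd parts,
    then each half is handled by a smaller matrix (8 multiplications)."""
    e0, e1 = x[0] + x[3], x[1] + x[2]   # even part
    o0, o1 = x[0] - x[3], x[1] - x[2]   # odd part
    return [
        64 * e0 + 64 * e1,
        83 * o0 + 36 * o1,
        64 * e0 - 64 * e1,
        36 * o0 - 83 * o1,
    ]

if __name__ == "__main__":
    sample = [1, 2, 3, 4]
    print(forward_matrix_multiply(sample))
    print(forward_partial_butterfly(sample))
```

Both functions give identical results; only the operation count and data flow differ, which is why the choice matters for SIMD and hardware rather than for coding efficiency.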
Full factorization only is not desirable? Matrix multiplication is good for SIMD and should also be
supported. Remove G266 and G579 as they do not have this flexibility.
G495 uses 8 bit coefficients in partial butterfly, G737 uses 14 bit coefficients. 8 bit gives advantage both
for hardware and software.
Table 1/2 of CE10 report (FPGA implementation by Altera): For length-4, the HM transform (G495) and
G737 should be the same, but they get different numbers. Were mult/add engines that are usually on
FPGAs used? No, only add/shift (the particular FPGA used had only 56 SP blocks).
Table 3/4 of CE10 report (gate count based on automatic generation from C code to RTL): Is it useful to
look at different latency? Usually a latency would be selected that gives smallest design, i.e. smallest gate
count (it is confirmed by STM who provided this analysis that in this case looking at higher latency
values is irrelevant). The numbers show an advantage of G737 in gate count. There is however some
debate about the amount of optimization that could still be done in the translation of C to RTL. It is
confirmed by D Alfonso that usually further manual optimization is necessary to get optimum results.
G495 has “symmetry property” with same coefficients in different basis functions and smaller transform
such as 4x4 being subset of 8x8 (also in 8bit/14bit integer approx. of matrix) which is also advantageous
for partial butterfly. Number of unique multiplier coefficients is 29/171 for 8bit/14bit cases, respectively.
G737 partial butterfly would not be useful for hardware (whereas full factorization can be implemented in
8bit) – question: can same transform then be used for inverse?

32 bit precision? Theoretically, overflow could happen if the input of inverse transform does not come
from forward transform.
G737 has number of unique multipliers in 32x32 being 30, but that relates to full factorization and is
higher in matrix multiply.
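The concern about exceeding a 32-bit range can be made concrete with a crude worst-case bound. The sketch below assumes a single matrix-multiply stage with no intermediate right shift, signed inputs of a given bit width, and every coefficient at the maximum magnitude its bit width allows. These are deliberately pessimistic assumptions for illustration; the bound says nothing definitive about the actual G495 or G737 matrices, whose coefficient sums are smaller.

```python
def worst_case_accumulator_bits(length, coeff_bits, input_bits):
    """Signed bit width needed for sum(coeff * input) over `length` terms,
    assuming every coefficient and every input sits at its largest magnitude."""
    max_coeff = (1 << (coeff_bits - 1)) - 1    # e.g. 127 for 8-bit signed
    max_input = 1 << (input_bits - 1)          # e.g. 32768 for 16-bit signed
    worst_sum = length * max_coeff * max_input
    return worst_sum.bit_length() + 1          # +1 for the sign bit

if __name__ == "__main__":
    for coeff_bits in (8, 14):
        bits = worst_case_accumulator_bits(32, coeff_bits, 16)
        print(coeff_bits, "bit coefficients:", bits, "bit accumulator bound")
```

Under these assumptions a 32-point stage with 8-bit coefficients stays within 32 bits, while 14-bit coefficients can exceed it for inputs that did not originate from a forward transform.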

4.10.2 Contributions

JCTVC-G266 CE10: Lossless Core Transforms for HEVC [W. Dai, M. Krishnan, J.
Topiwala, P. Topiwala (FastVDO), E. Alshina (Samsung)]

JCTVC-G694 CE10: Cross-check report for FastVDO/Samsung's core Transform by
Motorola Mobility (JCTVC-G266) [J. Lou, L. Wang (Motorola Mobility)]

JCTVC-G737 CE10: Full Factorization Core Transforms for HEVC [E. Alshina, A. Alshin,
W. Lee, J. Park, K. Pachauri (Samsung), P. Topiwala (FastVDO)]

JCTVC-G863 CE10: Crosscheck of FastVideo/Samsung core transforms for high and low
QP range [Rajan Joshi] [late]

JCTVC-G495 CE10: Core transform design for HEVC [A. Fuldseth, G. Bjøntegaard
(Cisco), M. Budagavi (TI)]

JCTVC-G579 CE10: Scaled integer transforms supporting recursive factorization
structure [R. Joshi, J. Sole, M. Karczewicz (Qualcomm)]

JCTVC-G953 CE10: Cross-check of JCTVC-G579 core transform - low and high QP
range [M. Budagavi (TI)] [late]

JCTVC-G819 CE10: Cross check for core transform proposed by Qualcomm by Samsung
[E. Alshina, J.H. Park] [late]

JCTVC-G887 CE 10: hardware test of inverse transform proposals [Sumit Johar, Daniele
Alfonso (STM)] [late]

4.10.3 Discussion and Conclusions

4.11 CE11: Coefficient scanning and coding

4.11.1 Summary

JCTVC-G041 CE11: Summary report of Core Experiment on coefficient scanning and
coding [V. Sze, J. Chen, T. Nguyen, K. Panusopone, J. Sole (CE
coordinators)]
This contribution is a summary of core experiment 11, Coefficient scanning and coding. Fourteen
companies registered for CE11, and eight tools from seven proposals were evaluated under
the conditions defined in the document by the Software Ad Hoc Group. An eighth proposal was withdrawn; a
ninth proposal (G301) released software for cross-check more than 6 weeks after the deadline and it was
agreed that it would be tested outside of the CE.
Various methods have been proposed for coding the transform coefficients of the residual signal to reduce
complexity and/or improve the coding efficiency. This core experiment evaluates the coding efficiency
and complexity impact of:
• Context modeling/selection for syntax elements related to transform coefficients in HE
configuration
• Transform coefficient scanning order methods for CAVLC and/or CABAC.

G121 Change c1 and c2 thresholds in context selection of coeff_abs_level_greater1_flag &
coeff_abs_level_greater2_flag
Reduces the number of contexts by 36. No change of logic.
Deferred previously from concern regarding low QP and RDOQ off. That aspect was checked and no
problem was found. 0.07% degradation in AI, 0.0 in others.
Decision: Aside from a desire to consider competing proposals, this seemed generally supported.
Revisit. This was later discussed again in relation to other non-CE inputs and was adopted, as noted
elsewhere.
G321 Change context selection of significant_coeff_flag to depend on scan position rather than block
position
Change last_significant_coeff_x and last_significant_coeff_y to always assume diagonal scan
Removes a dependency on determining the intra prediction mode. A loss of 0.2% for AI, 0.1% for
RA, and 0.1%. It was asked whether the complexity savings is substantial, and it did not seem very
substantial.
G320 Change ordering/grouping of significant_coeff_flag and coefficient level information.
Alternatives: “Group by 16” or “group by TU”. No quality difference was measured.
There does not seem to be a strong argument here for changing the current design
to operate this way.
G703 Additional scans for intra coded 16x16/32x32 TU
Adds more scans – bit rate savings 0.1% for AI HE, 0.0% for other cases. Gain doesn’t seem sufficient to
justify the change.
G284 Additional scans for intra coded 16x16/32x32 TU
Adds more scans – bit rate savings 0.2% for AI HE, other cases not tested. Gain doesn’t seem sufficient
to justify the change.
G679 Use horizontal and vertical scans on intra coded 16x16 and 32x32 TU
CAVLC specific – not reviewed
G269 Add horizontal and vertical scans to inter coded TU for NSQT
CAVLC specific – not reviewed
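For readers less familiar with the scan orders discussed in the entries above (G321, G703, G284, G679, G269), the sketch below generates horizontal, vertical and diagonal coefficient scans for a square block as lists of (x, y) positions. The traversal direction chosen within each diagonal is an assumption made for illustration and may not match the HM; the function names are hypothetical.

```python
def horizontal_scan(size):
    """Row by row, left to right."""
    return [(x, y) for y in range(size) for x in range(size)]

def vertical_scan(size):
    """Column by column, top to bottom."""
    return [(x, y) for x in range(size) for y in range(size)]

def diagonal_scan(size):
    """Anti-diagonals starting at the top-left corner; within each diagonal
    the traversal runs from bottom-left to top-right (assumed direction)."""
    order = []
    for d in range(2 * size - 1):
        y_start = min(d, size - 1)
        y_end = max(0, d - size + 1)
        for y in range(y_start, y_end - 1, -1):
            order.append((d - y, y))
    return order

if __name__ == "__main__":
    for name, scan in (("horizontal", horizontal_scan),
                       ("vertical", vertical_scan),
                       ("diagonal", diagonal_scan)):
        print(name, scan(4))
```

A scheme that always assumes the diagonal scan, as in G321 for the last significant coefficient position, needs only one such table regardless of the intra prediction mode.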

4.11.2 Contributions

JCTVC-G121 CE11: Reduction in contexts used for coefficient level [V. Sze (TI)]

JCTVC-G327 CE11: Cross-check of TI’s reduction in contexts used for coefficient level
(JCTVC-G121) [J. Sole (Qualcomm)]

JCTVC-G269 CE11 Report on Prediction Unit Dependent Coefficient Scanning For Inter
Frame [J. Song, X. Zheng, H. Yang, H. Yu (Huawei)]

JCTVC-G708 CE11: Cross-check report for Huawei's JCTVC-F501 by Motorola Mobility
[Y. Yu, K. Panusopone, L. Wang (Motorola Mobility)]

JCTVC-G792 Cross-check for Huawei’s proposal on PU dependent coefficient scan for
inter block (JCTVC-G269) [Y. Piao, J. Chen, J. Min, J.H. Park (Samsung)]

JCTVC-G284 CE11: Extended Mode Dependent Coefficient Scanning [X. Zhao, X. Guo, S.
Lei (MediaTek), S. Ma, W. Gao (PKU)]

JCTVC-G364 CE11: Crosscheck of JCTVC-F124 on Extended Mode-Dependent
Coefficient Scanning from MediaTek [C. Auyeung (Sony)]

JCTVC-G320 CE11: Scanning of Residual Data in HE [J. Sole, R. Joshi, M. Karczewicz
(Qualcomm)]

JCTVC-G412 CE11: Cross-check result of Significance Map interleaving part of
Qualcomm proposal on Parallel Processing of Residual data in HE (G320)
[C. Rosewarne, M. Maeda (Canon)]

JCTVC-G321 CE11: Removal of the parsing dependency of residual coding on intra mode
[J. Sole, Y. Zheng, W.-J. Chien, R. Joshi, X. Wang, M. Karczewicz
(Qualcomm)]

JCTVC-G124 CE11: Cross-check of proposal on Removal of the parsing dependency of
residual coding on intra mode G321 (F550) [V. Sze (TI)]

JCTVC-G302 CE11: Cross-check report for CE11 B2 Qualcomm's proposal on Removal of
the parsing dependency of residual coding on intra mode [H. Sasai, T. Nishi
(Panasonic)]

JCTVC-G679 CE11: Extending horizontal and vertical scan to big block for CAVLC [M.
Karczewicz, Y. Zheng, L. Guo, X. Wang (Qualcomm)]

JCTVC-G703 CE11: Adaptive Scan for Large Blocks for HEVC [Y. Yu, K. Panusopone, J.
Lou, L. Wang (Motorola Mobility)]

JCTVC-G077 CE11: Cross-check of CE.B1 - Scans for large blocks in CAVLC [C. Yeo
(I2R)]

JCTVC-G975 CE11: Crosscheck - Adaptive Scan for Large Blocks for HEVC (G703) [T.
Nguyen (Fraunhofer HHI)] [late]

4.11.3 Discussion and Conclusions

4.12 CE12: Deblocking filter

4.12.1 Summary

JCTVC-G042 CE12: Summary report of Core Experiment on deblocking filtering [A.
Norkin, X. Guo, B. Jeon, M. Narroschke (CE coordinators)]
This contribution is a summary report of Core Experiment 12 on deblocking filtering. The goal of the
proposals is to enhance deblocking filtering techniques in the HEVC Test Model. This is not limited to
improving coding efficiency and subjective quality but also includes reducing complexity and harmonizing
the various schemes that are technically feasible. The CE comprises five subtests. All the proposals are
based on HM4.0 and experiments are performed according to the common test conditions provided in
JCTVC-F900. Additionally, proposals from subtest 5 provide results for higher QPs. All results are
verified by cross-checkers.

Subtest 1: Subjective and objective quality improvements


G286/287: Adding filters for chroma, the first using separate decisions, the other re-using luma decision
G409: Using stronger (longer) filters
G590: Taking separate decisions for each half (4 lines/columns) of a length-8 block boundary; the decision
is not taken per line/column, but rather based on two lines/columns (first/fourth) for the entire half (in
total, a complexity reduction is claimed).
G383: specific filtering (weaker) for AMP blocks 4x16/16x4. Basically, same filter as HM.
G087: Uses delta calculation of HM3, done same way for luma and chroma.
Visual tests to be performed on G287, G409, G590 and G383. As G287, G409 and G383 increase
complexity (G383 is at least less regular), it would be expected that visual improvement is shown. G590
reduces complexity, which should not come at the cost of additional artifacts. This was discussed again after
the tests had been performed, which revealed no subjective quality difference for any proposal.
Note: Proposal G089 is similar to G590 but uses parallel decision.
Decision: Adopt G590
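The idea of taking one on/off decision per 4-line half of an 8-line edge, based only on the first and fourth lines, can be sketched as follows. The activity measure (a second difference on each side of the edge) and the comparison against a threshold beta are assumptions modeled on an HM-style decision; they are not quoted from G590, and all names are hypothetical.

```python
def second_difference(s0, s1, s2):
    """Local activity on one side of the edge: |s2 - 2*s1 + s0|."""
    return abs(s2 - 2 * s1 + s0)

def line_activity(p, q):
    """p and q hold the three samples nearest the edge on each side,
    with p[0] and q[0] adjacent to the edge."""
    return (second_difference(p[0], p[1], p[2])
            + second_difference(q[0], q[1], q[2]))

def segment_decision(segment, beta):
    """Decide filtering on/off for a 4-line segment from lines 0 and 3 only."""
    first_p, first_q = segment[0]
    fourth_p, fourth_q = segment[3]
    activity = line_activity(first_p, first_q) + line_activity(fourth_p, fourth_q)
    return activity < beta

def edge_decisions(lines, beta):
    """Split an 8-line edge into two halves and decide each independently."""
    if len(lines) != 8:
        raise ValueError("expected 8 lines per edge")
    return [segment_decision(lines[0:4], beta),
            segment_decision(lines[4:8], beta)]

if __name__ == "__main__":
    smooth = ([10, 10, 10], [20, 20, 20])      # flat on both sides of the edge
    textured = ([10, 40, 10], [20, 20, 20])    # strong local variation on the p side
    print(edge_decisions([smooth] * 4 + [textured] * 4, beta=30))
```

With this structure the two halves of an edge can receive different decisions, and only two lines per half need to be inspected.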

Subtest 2: Parallel deblocking decision solutions


G255: Horizontal and vertical filtering can be done in parallel, additional buffers required; may have
disadvantage in lowering throughput for sequential operation.
G256: No additional buffers needed. On/off decision made on 2 center lines/columns out of 8 (instead of
the HM, which uses the outer lines/columns). G256 includes additional modifications that were not in the
original plan: modified decision for vertical edge re-modified as in JM.
G088: Changes back to HM3 (or AVC style) where horizontal decision would not depend on vertical
filtering, as it is argued that this reduces throughput.
Would G256 solve this problem?
Presentation of G256:
Has two aspects: change on/off decision (samples 3+4 instead of 2+5 at the vertical edge) and
strong/weak filter decision
Both could be tested differently
Doing things different horizontally/vertically is inconsistent
Decision: Adopt G088 [Check: It is presumed that this was intended to refer to G088 rather than G308.]
(i.e. remove parallel decision part, i.e. in that aspect go back to HM3)
No visual tests need to be performed on subtest 2

Subtest 3: Line memory reduction


G257 and G228 would be in conflict with CE8 subtest c (reduced line buffer in ALF)

G229 was tested visually – did not have worse quality. Decision: Adopt G229.
Note: G230 is to some extent a combination of G257 and G228, with new elements

Subtest 4: Reduction of computational complexity of deblocking filter


G086: Seems to be a minor decrease in complexity as this case may not occur too frequently
No support of other companies – no need for visual testing

Subtest 5: Controlling strength of deblocking filter at slice level


G466: BS intra offset for luma and chroma
G174: tc_offset and beta offset
G574: Parameter on/off flag (for presence of parameters) and 5 weights (quantized to 6 bits) for delta
values of luma filters (3 for strong, 2 for weak)
G574 also includes a new method of adapting (more subjectively targeted) which was not in the original
proposal but is non-normative
G466 and G174 keep the filter unchanged. G574 requires one more multiplication by the weight factor
per decision (after tc clipping)
Test all 3 (both old and new method for G574) in subjective tests to see whether it gives benefit
Note: G291 extends G174 in defining a reasonable syntax – BoG on this later in the week.
Establish BoG (A. Norkin, G. v.d. Auwera, M. Narroschke, T. Yamakage etc.) to discuss cleaning
up the de-blocking description and propose a set of adaptation parameters as from CE12 subtest 5, G291,
G619, …, as well as an appropriate place for such signalling (e.g. slice, APS …), and prepare for a CE.

4.12.2 Contributions

Subtest 1

JCTVC-G286 CE12 Subtest 1: Chroma Deblocking Filter [Q. Huang, J. An, X. Guo, S. Lei
(MediaTek), A. Norkin, K. Andersson, R. Sjöberg (Ericsson)]
This contribution presents experimental results for the chroma deblocking filter in CE12 Subtest 1.
Specifically, a chroma deblocking filter with independent filtering decision and 8x8 filtering unit is tested and
proposed. It is reported that an average BD-Rate reduction of 0.6% can be achieved for chroma. The run
time is reported to be similar to HM4.0. It is also reported that the subjective quality is almost the same as
that of HM4.0.

JCTVC-G410 CE12, Subset 1: Cross-verification report for MediaTek and Ericsson's
proposal (JCTVC-G286) [Z. Shi (USTC), J. Xu (Microsoft)]

JCTVC-G287 CE12 Subtest 1: Simplified Chroma Deblocking Filter in JCTVC-G286 [Q.
Huang, J. An, X. Guo, S. Lei (MediaTek), A. Norkin, K. Andersson, R.
Sjöberg (Ericsson)]
This contribution presents the results of the chroma deblocking filter in CE12 Subtest 1. Specifically, a chroma
deblocking filter with a filtering decision dependent on luma and 8x8 filtering unit is tested and proposed. It
is reported that an average BD-Rate reduction of 1.0% can be achieved for chroma. For low delay cases,
an average BD-Rate reduction of 0.1% can also be achieved for luma. The run time is reported to be
similar to HM4.0. It is also reported that the subjective quality is almost the same as that of HM4.0.

JCTVC-G411 CE12, Subset 1: Cross-verification report for MediaTek and Ericsson's
proposal (JCTVC-G287) [Z. Shi (USTC), J. Xu (Microsoft)]

JCTVC-G383 CE12 Subtest1: Deblocking of New Non-Square Blocks: Edge Shift for AMP
[G. Van der Auwera, X. Wang, M. Karczewicz (Qualcomm)]
This proposal addresses the adaptation of the deblocking filter in case of AMP partitions of size 16x4 or
4x16. Instead of deblocking the central edge on the 8x8 deblocking grid inside the 16x16 CU of the AMP
type, the relevant internal AMP partition edge is deblocked, which keeps the number of filtering
operations unchanged. The deblocking filter width is adapted to avoid filtering dependencies between
nearby edges. The BD-rates and execution times are very similar to the HM4 anchor.

JCTVC-G485 CE12.1.5: Crosscheck for Qualcomm's AMP deblocking with edge shift in
JCTVC-G383 [T.-D. Chuang, Y.-W. Huang (MediaTek)]

JCTVC-G409 CE12, Subset 1: Report of Deblocking for Large Size Blocks [Z. Shi (USTC),
X. Sun, J. Xu (Microsoft)]
This document presents a deblocking scheme for large size blocks to improve the visual quality of HEVC
decoded videos. For large smooth regions with small variation, an extra smoothing deblocking mode is
introduced to suppress the visually severe blocking artifacts. It is observed that the proposed method can
reduce blocking artifacts in smooth regions, which are usually more visible to human eyes.

JCTVC-G673 CE12 Subtest 1: Crosscheck for Microsoft's Deblocking for Larger Blocks in
JCTVC-G409 [Q. Huang, X. Guo (MediaTek)]

JCTVC-G590 CE12 Subtest 1: Results for modified decisions for deblocking [M.
Narroschke, S. Esenlik, T. Wedi (Panasonic)]
This contribution is part of CE12. It presents the results for Modified decisions for deblocking which is
based on JCTVC-E251 and JCTVC-F191. In HM4.0, a first decision for enabling the deblocking is
performed for edge segments of eight lines. In the case of enabled deblocking, a subsequent second
decision is performed for each individual line by which either a strong or a weak filter is selected. In this
proposal, two modifications are introduced. The first decisions are performed for edge segments of four
lines instead of 8 lines. The second decisions are also performed for edge segments of four lines instead
of for each individual line. At the same quality, the following average bit rate reductions are achieved
relative to HM4.0: I-HE: 0.0%, I-LC: 0.0%, RA-HE: 0.0%, RA-LC: 0.0%, LD(B)-HE: 0.2%, LD(B)-LC:
0.1%, LD(P)-HE: 0.2%, LD(P)-LC: 0.0%. The modifications reduce the number of operations required
for these two decisions by around 20%. In addition, the size of line buffers is reduced. They allow parallel
deblocking of all 8x8 blocks.

JCTVC-G238 CE12 Subtest1: Cross-verification of Panasonic's proposal JCTVC-G590
[M. Ikeda, T. Suzuki (Sony)]

JCTVC-G621 CE12: Cross-verification of F191 from Panasonic [A. Norkin (Ericsson)]
[late]

Subtest 2

JCTVC-G255 CE12 Subtest2: Parallel deblocking improvement based on JCTVC-F214
tool1 [M. Ikeda, J. Tanaka, T. Suzuki (Sony)]
This contribution proposes one approach to improve and develop the parallel deblocking filter (JCTVC-
E181 and JCTVC-E224). In this approach, both the horizontal (H) and vertical (V) filtering processes are
applied to reconstructed samples. This reduces the dependency between the H and V filters, so that
both filtering processes can be performed in parallel. In addition, it is expected that the
memory needed to store H-filtered samples, as in AVC, becomes unnecessary. Sony has two proposals: a
basic proposal based on JCTVC-F214 tool1 and an improved one based on the basic
proposal. The experimental results show 0.0 to 0.2% in BD-rate for the basic proposal and -0.1 to 0.2% in
BD-rate for the improved one, and the subjective quality is similar to HM-4.0.

JCTVC-G585 CE12 Subtest 2: Cross-check results of the parallel deblocking tool 1 of Sony
(JCTVC-G255) [Matthias Narroschke, Semih Esenlik (Panasonic)]

JCTVC-G256 CE12 Subtest2: Parallel decision improvement and reduction based on
JCTVC-F214 tool2 [M. Ikeda, J. Tanaka, T. Suzuki (Sony)]
This contribution proposes to improve the parallel deblocking filter (JCTVC-E181 [1] and JCTVC-E224
[2]). In this approach, the decisions are made using the central non-deblocked samples at the block boundary.
Because both the “on/off decision” and the “strong/weak filter selection” use the central
non-deblocked samples, the decisions are made on reconstructed samples. Sony has two proposals: a
basic proposal based on JCTVC-F214 tool2 [3] and an improved one based on the basic proposal. The
experimental results show 0.0–0.3% increases in BD-rate and 0–2% decreases in decoding time for the
basic proposal, and 0.0–0.2% increases in BD-rate and 0–2% decreases in decoding time for the improved
one. It is therefore expected that the memory needed to store the results of the decision becomes
unnecessary while keeping performance similar to HM-4.0, and that computation can be saved in the decision
process. The subjective quality is similar to HM-4.0.

JCTVC-G587 CE12 Subtest 2: Cross-check results of the parallel deblocking tool 2 of Sony
(JCTVC-G256) [Matthias Narroschke, Semih Esenlik (Panasonic)]

JCTVC-G103 CE12 subset 2: cross-verification of SONY’s parallel de-blocking filtering
proposals JCTVC-G255 and JCTVC-G256 [M. Zhou, V. Sze (TI)]

JCTVC-G622 CE12: Cross-verification of parallel deblocking from Sony F14 tool 2
[Andrey Norkin (Ericsson)] [late]

Subtest 3

JCTVC-G228 CE12.3.2: Reducing pixel line buffers by modifying DF for horizontal LCU
edges [C.-W. Hsu, Y.-W. Huang, S. Lei (MediaTek)]
This contribution reports results of CE12.3.2, which is based on JCTVC-F053 method2 to modify the
deblocking filter (DF) only for horizontal LCU boundaries and to keep the DF unchanged for the remaining edges.
Pixels above the first row of the upper side of the horizontal LCU boundary are not used in filtering
decisions. Moreover, filtering operations are also modified without changing pixels above the first row of
the upper side of the horizontal LCU boundary. In comparison with HM-4.0, the proposed method can
remove all pixel line buffers dedicated for DF and reportedly causes 0.0-0.2% coding efficiency loss with
roughly unchanged run time and similar visual quality in most cases.

JCTVC-G253 CE12 Subtest3: Cross-verification of MediaTek's proposal JCTVC-G228
[M. Ikeda, T. Suzuki (Sony)]

JCTVC-G229 CE12.3.3: Reducing motion data line buffers [T.-D. Chuang, C.-Y. Chen, Y.-
W. Huang, S. Lei (MediaTek)]
This contribution reports results of CE12.3.3, which is based on the motion data compression method in
JCTVC-F060. Simulation results reportedly show that the proposed method can reduce motion data line
buffer size by 50% with the same coding efficiency, encoding time, and decoding time in comparison
with HM-4.0. No undesirable visual artifact is observed due to the modified motion data for calculating
boundary strengths in deblocking filter.

JCTVC-G292 CE12 Subtest3: Cross Check of Mediatek’s Motion Data Line Buffer
Reduction Proposal JCTVC-G229 [G. Van der Auwera (Qualcomm)]

JCTVC-G257 CE12 Subtest3: Deblocking vertical tap reduction for line buffer based on
JCTVC-F215 [M. Ikeda, T. Suzuki (Sony)]
This contribution proposes to reduce the line buffers required in the deblocking filter, based on JCTVC-F215.
Many line buffers are required for the deblocking filter, SAO (Sample Adaptive Offset) and ALF (Adaptive
Loop Filter) in HM-4.0. In particular, the deblocking filter is included in both the high efficiency and low
complexity configurations and is considered to be used in many cases, so it is significant to reduce the
line buffers of the deblocking filter on its own. Sony proposes to reduce one line buffer while keeping
BD-Rate and subjective quality, by reading one pixel fewer on the upper side in vertical filtering. The
experimental results show 0.0-0.1% increases for luma in BD-rate and similar run-time, and the subjective
quality is similar to HM-4.0.

JCTVC-G486 CE12.3.1: Crosscheck for Sony's deblocking vertical tap reduction for line
buffer in JCTVC-G257 [C.-W. Hsu, Y.-W. Huang (MediaTek)] [late]

JCTVC-G180 CE12: Cross-check report of subtest 3 on Deblock filter line memory
reduction [S. Park, B. Jeon (LGE)]

Subtest 4

JCTVC-G087 CE12 subset 4.10: Test results on unification of luma and chroma filtering
[M. Zhou, O. Sezer, V. Sze (TI)]
This contribution reports test results on CE12 subset 4.10 “unification of luma and chroma filtering”. In
the proposed algorithm the unification of luma and chroma filtering is achieved by increasing filter
coefficient precision for chroma filter by 2-bit and restoring HM3.0 delta calculation for luma weak
filter. Simulation results revealed that the proposed unification improved the coding efficiency in luma by
0.2% in AI-HE and AI-LC, and 0.1% in RA-HE, RA-LC, LB-HE and LB-LC, and up to 0.3% gain for
chroma components. However, unified luma weak and chroma filtering led to visual quality loss in
vidyo3 (LB-HE, QP=37) sequence.

JCTVC-G249 CE12 Subtest1: Cross-verification of TI's proposal JCTVC-G087 [M. Ikeda,
T. Suzuki (Sony)]

Subtest 5

JCTVC-G088 CE12 subset 5.6: Test results and architectural study on de-blocking filter
without parallel on/off filter decision [M. Zhou, O. Sezer, V. Sze (TI)]
This contribution reports test results on CE12 subset 5.6 “removal of parallel on/off filter decision”.
Architectural study shows that the parallel on/off decision actually restricts architecture choices, increases
implementation costs in terms of memory reads and buffer size without intended throughput benefits. It is
recommended to restore the AVC fashion of on/off filter decision, that is to use the un-filtered samples
for the on/off filter decision of vertical edges, and the inter-mediate filtered samples (i.e. filtered samples
after vertical edge filtering) for the on/off decision of horizontal edges. Test results reveal that this change
leads to 0.0% BD-rate difference, and subjective viewing verifies that there is no visual difference for all
the CE12 selected subjective testing sequences when compared to the HM4.0 anchor.

JCTVC-G588 CE12 Subtest 2: Cross-check results of the deblocking proposal of Texas
Instruments (JCTVC-G088) [Matthias Narroschke, Semih Esenlik
(Panasonic)]

JCTVC-G746 CE12 subset 5.6: Cross-check report on parallel deblocking decision
(JCTVC-G088) [I.-K. Kim, J. Chen (Samsung)]

JCTVC-G574 CE12 Subtest 5: Deblocking filter using adaptive weighting factors
[Matthias Narroschke, Ann-Kathrin Seifert (Panasonic)]
This contribution presents the results for a deblocking filter using adaptive weighting factors, which is
based on JCTVC-F405. Five adaptive weighting factors are introduced into the deblocking filter of the
luminance signal. Each weighting factor weights the value by which the HM4.0 deblocking filter would
modify a sample. The weighting can be interpreted as an adjustment of the frequency responses of the
deblocking filter. In this contribution, the weighting factors are estimated by two methods at the encoder
side. Both methods minimize the mean squared error between the deblocked signal and the original input
signal. The difference between the two methods is that the second one considers the subjective importance of
the samples during the estimation. However, any other estimation method is also possible since the
weighting factors are coded and transmitted to the receiver side. It is also possible to use default
weighting factors, which lead to deblocking filter operations equivalent to HM4.0. For the first estimation
method, average BD-bit rate reductions of I-HE: 0.4%, I-LC: 0.2%, RA-HE: 0.3%, RA-LC: 0.2%, LD-
HE(B): 0.5%, LD-LC(B): 0.2%, LD-HE(P): 0.3%, LD-LC(P): 0.1% are achieved compared to HM4.0
with approximately no encoder/decoder run time increases. For the second estimation method, slightly
lower average BD-bit rate reductions are achieved at no encoder/decoder run time increases. However,
blocking artifacts are visibly removed, especially for high QP coder configurations.

JCTVC-G237 CE12 Subtest5: Cross-verification of Panasonic’s proposal relating to
JCTVC-F405 [S. Lu, M. Ikeda, T. Suzuki (Sony)]

Subtest 6(?)

JCTVC-G174 CE12: Deblocking filter parameter adjustment in slice level [T. Yamakage,
S. Asaka, T. Chujoh (Toshiba), M. Karczewicz, I.S. Chong (Qualcomm)]
Appropriate parameters for deblocking filter to improve coding efficiency for CE12 are presented.
Offsets to Qp to derive beta and tc in slice level syntax are introduced in order to adjust subjective and/or
objective picture quality. The purpose of this contribution is to provide placeholder to adjust the picture
quality.
When Qp offsets to derive beta and tc offsets are -2 and -5, BD-rate reduction is 0.0% (HE) and 0.4% loss
(LC) on average under the common test conditions, with maximum BD-rate reduction of 0.7% (HE) and
0.3% (LC). When higher Qp (32, 37, 42 and 47) is used, BD-rate reduction is 0.4% loss (HE) and 0.6%
loss (LC) on average, with maximum BD-rate reduction of 0.6% (HE) and 0.3% (LC).
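The mechanism of slice-level offsets can be sketched as an offset added to the QP before a table lookup. The tables below are synthetic stand-ins and not the HM tables; the 0 to 51 index range, the clipping, and all names are assumptions for illustration only.

```python
def clip3(low, high, value):
    """Clamp value to the inclusive range [low, high]."""
    return max(low, min(high, value))

# Synthetic monotone placeholder tables, NOT the actual HM beta/tc tables.
PLACEHOLDER_BETA = [max(0, qp - 15) * 2 for qp in range(52)]
PLACEHOLDER_TC = [max(0, qp - 17) // 2 for qp in range(52)]

def derive_thresholds(qp, beta_offset, tc_offset):
    """Look up beta and tc with slice-level offsets applied to the QP index."""
    beta = PLACEHOLDER_BETA[clip3(0, 51, qp + beta_offset)]
    tc = PLACEHOLDER_TC[clip3(0, 51, qp + tc_offset)]
    return beta, tc

if __name__ == "__main__":
    print(derive_thresholds(32, 0, 0))     # no offsets
    print(derive_thresholds(32, -2, -5))   # negative offsets weaken the filter
```

Because the offsets only shift the table index, the filter operations themselves stay unchanged, which matches the remark in the CE12 summary that G174 keeps the filter unchanged.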

JCTVC-G465 CE12: crosscheck of deblocking filter parameter adjustment in slice level [K.
Sugimoto, A. Minezawa, S. Sekiguchi (Mitsubishi)]

JCTVC-G466 CE12: results on signaling of boundary filtering strength [K. Sugimoto, A.
Minezawa, S. Sekiguchi (Mitsubishi)]
This contribution reports verification results on CE12 Subtest 5 that evaluates signaling of boundary
filtering strength in the slice header. The verification work was performed using software implementing
the proposed technique on top of HM-4.0. Reported performance results were obtained by building and
running the software on a 64-bit Linux platform.

JCTVC-G388 CE12: Cross-check of Mitsubishi's deblocking filter JCTVC-F175/Gxxx [T.
Yamakage, S. Asaka (Toshiba)]

Subtest 7

JCTVC-G086 CE12 subset 7.4: Test results on decreasing worst case complexity of
deblocking filter [M. Zhou, O. Sezer, V. Sze (TI)]
This contribution reports test results on CE12 subset 7.4 “decreasing worst case complexity of deblocking
filter”. By removing the motion vectors from the boundary strength (BS) calculation of the deblocking
filter, the worst case number of operations, and memory access are reduced from (8, 10) to (1, 4), and
from (39, 20) to (7, 4) for the BS calculation of a de-blocking edge in P-frame and B-frame, respectively.
Subjective tests at TI observed a visual quality improvement in BQMall (random access, high efficiency,
QP 37), where the ringing artifact around a diagonal edge was reduced, and no subjective difference in the other
CE12 selected sequences when compared to the HM4.0 anchor. The proposed simplification leads to an
average BD-rate increase of 0.2% in the RA-HE, RA-LC, LB-HE and LB-LC configurations.

JCTVC-G251 CE12 Subtest4: Cross-verification of TI's proposal JCTVC-G086 [M. Ikeda,
T. Suzuki (Sony)]

JCTVC-G623 CE12: Cross-verification of deblocking simplification from TI F484
[Andrey Norkin (Ericsson)] [late]

JCTVC-G1041 reports on informal subjective testing. The results indicate that no subjective quality
difference can be claimed between any of the proposals.

4.12.3 Discussion and Conclusions

4.13 CE13: Motion data parsing robustness and throughput

4.13.1 Summary

JCTVC-G043 CE13: Summary report of Core Experiment 13 on motion data parsing
robustness and throughput [J. Jung, B. Bross, J. Chen, P. Onno (CE
coordinators)]
This Core Experiment relates to motion data parsing robustness and throughput. During the 6th JCT-VC
meeting, a simplified version of JCTVC-F470 has been adopted in HM4.0. The goal of this CE is to
evaluate various configurations or simplifications of different methods proposed for HM3.0, and evaluate
them in the context of HM4.0.
• Decision (SW): Agree on ENC_MRG_FIX (bug fix from experiment S0 from G776)
• Signaling number of candidates in slice header (T1…T4 vs. T5…T8): Gives a small benefit
(0.3% BR reduction for case of 3 candidates) compared to the case where the coder restricts the
number of candidates without changing syntax. Goal: Tradeoff encoding time vs. complexity, no
impact on decoder. It is also shown that another option is adaptation to sequence characteristics,
where it gives a small benefit in case of LD P (see
Support by several companies. Discussion: Better in APS? No, slice header is a better place.
Decision: Adopt T1…T4 from G091, S1 from G776
• Pruning number of MVP candidates T9/T10 and C1..C4:
We should target here for proposals that simplify (preferably without loss), and not those which remove
something and put in something else with little gain
Note: Initial decision to adopt T10 later became obsolete due to the adoption of G397+542simp3.
Process suggested by CE coordinators: Identify adoptions from CE9 & 13, integrate software, only after
that consider inclusion of non-see

4.13.2 Contributions

JCTVC-G091 CE13: Test results on maxNumMergeCand signaling and simplification of
merge MVP list pruning process [M. Zhou (TI)]
This document reports CE13 test results on the following two tools: signaling the maximum number of
merge/skip MVP candidates (maxNumMergeCand) in the slice header (signaling method), and simplification
of the second pruning process of the merge/skip MVP list derivation process (simplification method).
The signaling method enables a truncated unary table of accurate length to be used for merge_idx
coding, while the simplification method reduces the complexity of the second pruning process by 2x,
thanks to the reduction of the total number of combined, non-scaled and zero MVP candidates from 8 to
5. Experimental results reveal that the proposed signaling method provides, e.g., an average gain of
0.1% in RA-HE, 0.3% in RA-LC, 0.2% in LB-HE and 0.7% in LB-LC when maxNumMergeCand is equal
to 3, compared to the HM4.0 encoder-only method, which uses a fixed truncated unary table of
maximum length 5 for merge_idx coding regardless of the number of merge candidates supported by the
encoder. The gain is significantly larger in LC configurations. Results also showed that the proposed
simplification method, and adding it on top of the signaling method, does not cause coding loss.
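The effect of the signaled maximum on merge_idx coding can be illustrated with a minimal sketch of truncated unary binarization. The function names are hypothetical, and the sketch assumes that the largest codable index (cMax) equals the number of merge candidates minus one; it is not taken from the HM software or from G091.

```python
def truncated_unary(idx: int, c_max: int) -> str:
    """Return the truncated unary bin string for idx in the range [0, c_max]."""
    if not 0 <= idx <= c_max:
        raise ValueError("idx out of range")
    # idx ones followed by a terminating zero; the zero is dropped when idx == c_max.
    return "1" * idx + ("0" if idx < c_max else "")


def merge_idx_bins(merge_idx: int, max_num_merge_cand: int) -> str:
    """Bin string for merge_idx, assuming cMax = max_num_merge_cand - 1."""
    return truncated_unary(merge_idx, max_num_merge_cand - 1)


if __name__ == "__main__":
    # Fixed table for 5 candidates versus a table sized for a signaled maximum of 3.
    for idx in range(3):
        print(idx, merge_idx_bins(idx, 5), merge_idx_bins(idx, 3))
```

Under these assumptions, the last index of the shorter table needs no terminating bin (index 2 costs two bins instead of three), which is consistent with the small gains reported for maxNumMergeCand equal to 3.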

JCTVC-G231 CE13: Results of section 3.1 tests 1, 3d, and 3e on replacing redundant
MVPs and its combination with adaptive MVP list size [J.-L. Lin, Y.-W.
Chen, Y.-W. Huang, S. Lei (MediaTek)]
This contribution reports the results of CE13 section 3.1 tests 1, 3d, and 3e, which are based on JCTVC-
F052. In test 1, redundant or empty MVPs are replaced by truncating the first available MVP to integer
precision or by adding a constant value to the first available MVP. In test 3d, test 1 is combined with
adaptive MVP list size by neighboring Merge indices. In test 3e, test 1 is combined with adaptive MVP
list size by current CU size. In comparison with HM-4.0 under JCTVC-F900 common test conditions, it is
reported that test 1 together with a bug-fix achieves 0.2-0.5% coding efficiency gain with 100-103%
encoding time and 98-100% decoding time, test 3d together with a bug-fix achieves 0.0-0.4% coding
efficiency gain with 98-103% encoding time and 99-102% decoding time, and test 3e together with a bug-
fix achieves 0.1-0.5% coding efficiency gain with 100-102% encoding time and 100-101% decoding
time, where the bug-fix alone achieves 0.1% coding efficiency gain and no run time difference.
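The replacement idea of test 1 can be sketched as follows. This is only an illustration under stated assumptions: motion vectors are taken to be in quarter-sample units, truncation to integer precision is modeled by clearing the two fractional bits (which rounds toward minus infinity), the truncated vector is tried before the offset vectors, and the function name, the offset value and the ordering are hypothetical rather than taken from JCTVC-F052 or G231.

```python
def fill_mvp_list(cands, size, offset=4):
    """Build an MVP list of `size` entries without redundant or empty slots.

    cands  : list of (mvx, mvy) tuples in quarter-sample units, or None if unavailable
    offset : assumed constant (one integer sample) added to the first available MVP
    """
    out = []
    for mv in cands:
        # Keep only available, non-redundant candidates.
        if mv is not None and mv not in out and len(out) < size:
            out.append(mv)
    if not out:
        return out  # nothing available to derive substitutes from
    fx, fy = out[0]
    # First substitute: first available MVP truncated to integer-sample precision.
    substitutes = [((fx >> 2) << 2, (fy >> 2) << 2)]
    k = 1
    while len(out) < size:
        if substitutes:
            mv = substitutes.pop(0)
        else:
            # Further substitutes: first available MVP plus a growing constant.
            mv = (fx + k * offset, fy)
            k += 1
        if mv not in out:
            out.append(mv)
    return out


if __name__ == "__main__":
    print(fill_mvp_list([(5, -3), None, (5, -3)], 3))
```

Because the list always reaches its nominal size, the number of candidates no longer depends on the decoded motion data of neighbors, which is the parsing robustness aspect that CE13 is concerned with.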

JCTVC-G236 CE13: Cross-check of Mediatek results section 3.1 [G. Laroche, P. Onno
(Canon)] [late]

JCTVC-G686 CE13: Crosscheck report of JCTVC-G231 Test 1 [Y. Zheng, X. Wang
(Qualcomm)]

JCTVC-G845 CE13 Section 3.1: Cross-verification of MediaTek’s test 3d (JCTVC-G231)
[T. Lee, J. Chen, J. H. Park (Samsung)]

JCTVC-G232 CE13: Results of section 3.1 tests 2d and 2e on adaptive MVP list size [J.-L.
Lin, Y.-W. Chen, Y.-W. Huang, S. Lei (MediaTek)]
This contribution reports the results of CE13 section 3.1 tests 2d and 2e. Based on JCTVC-F052, two
proposed tools to determine the MVP list size adaptively are evaluated. In tool 1, the MVP list size of
each PU is adaptively determined by merge indices of neighboring PUs. In tool 2, MVP list size is
determined by the current CU size. In comparison with HM-4.0 under JCTVC-F900 common test
conditions, it is reported that tool 1 incurs 0.0-0.1% coding efficiency loss with 98-100% encoding
time, and tool 2 incurs 0.0-0.1% coding efficiency loss with 99-100% encoding time.

JCTVC-G240 CE13: Experiment regarding section 3.5 [G. Laroche, T. Poirier, P. Onno
(Canon)]
This contribution reports the results of experiments for section 3.5 of CE13 as described in JCTVC-F913.
Six experiments have been proposed in the field of parsing robustness for both AMVP and Merge modes.
The first four proposed experiments replace some of the additional candidates of the current HM4.0
motion vector derivation by non-redundant candidates as proposed in JCTVC-F474. These four
experiments correspond to different compromises between coding efficiency and complexity
reduction, in terms of the number of operations for the Merge mode MVP derivation process compared to
HM4.0. The other two experiments deal with the modification of the motion vector predictor index parsing
for AMVP. The best experiment in terms of coding efficiency shows an average coding gain of 0.3% over
four Inter coding configurations compared to the HM4.0 anchors, without any increase in the number of
operations. Moreover, some configurations reduce the worst case complexity, in terms of the number of
predictors and the number of comparisons, by a factor of 3, with a BDR gain of 0.1%.

JCTVC-G424 CE13: Cross-check report of JCTVC-G240 (section 3.5 test 1 and test 4) [S.-
C. Lim, J. Lee, H. Y. Kim (ETRI)]

JCTVC-G489 CE13: Crosscheck for Canon's section 3.5 tests 5 and 6 on additional MVP
candidates in JCTVC-G240 [J.-L. Lin, Y.-W. Huang (MediaTek)]

JCTVC-G842 CE13 Section 3.5: Cross-verification of Canon’s test 3 (JCTVC-G240) [T.
Lee, J. Chen, J. H. Park (Samsung)]

JCTVC-G776 CE13: Merge candidates list construction [T. Lee, J. Chen, J. H. Park
(Samsung)]
This document reports the results of the “merge candidates list construction” method proposed in document
JCTVC-F402 within the context of CE13. Three tests were done in this contribution: a) encoder
modification of HM4.0; b) deriving the optimal merge list size at the encoder side and signaling it in the
slice header; c) inserting additional merge candidates. Experiments show that the encoder side fix achieves
an average -0.11% BD rate saving for six inter configurations without enc/dec time change. Merge list size
signaling in the slice header shows an average 0.08% BD rate loss with 96.9% encoding time. Inserting
additional merge candidates shows an average -0.17% BD rate saving for six inter configurations, based on
the encoder fix version, with 102.8% encoding time.

JCTVC-G487 CE13: Crosscheck for Samsung's section 3.4 test 5' on adaptive list size and
additional MVP candidates in JCTVC-G776 [J.-L. Lin, Y.-W. Huang
(MediaTek)]

JCTVC-G191 CE13: Cross-verification result of JCTVC-F402 Test3 [K. Kazui (Fujitsu)]

JCTVC-G275 CE13: cross-check of experiments T1 (TI - JCTVC-G091), S1 (Samsung –
JCTVC-G776) and C2 (Canon - JCTVC-G240) [J. Jung (Orange Labs)] [late]

JCTVC-G514 CE13: Cross-Check Result of Test3b&4b [K. Sato (Sony)]

JCTVC-G539 CE13: Cross-check report of subset 3.3 (test2a, 3a, 4a) by Panasonic [T.
Sugio, T. Nishi (Panasonic)]

JCTVC-G540 CE13: Cross-check report of subset 3.2 series by Panasonic [T. Sugio, T.
Nishi (Panasonic)]
Revisit: Side activity regarding CE9/CE13 decisions still going on about testing the combination of
adoptions. Will be reported on Monday morning.

4.13.3 Discussion and Conclusions

5 Non-CE Technical Contributions


5.1 Clarification and Bug Fix Issues

JCTVC-G116 On PCM memory usage reduction in HM software [K. Chono, H. Aoki
(NEC)]
Just an implementation change – delegated to software coordinator.

5.2 HM settings and common test conditions

JCTVC-G111 Common test conditions to specify 8-bit internal bit depth for all 8-bit source
material [T. Hellman, Y. Yu, W. Wan (Broadcom)]
This proposal recommends changing the common test conditions to specify an internal bit depth (IBD) of
8 bits for all 8-bit source material. It presents results that show a coding loss from this change relative to
the current 10-bit IBD configuration, but claims that the cost of 10-bit IBD cannot be justified. A cost
increase of 20-23% is reported for a sample hardware codec implementation as well as additional memory
and bandwidth costs. It also recommends keeping 10-bit IBD for 10-bit source material, in preparation for
a future 10-bit encoding profile.

In some cases, the benefit of 10 bit vs. 8 bit is high for chroma (in the extreme case, more than 100%
chroma bit rate increase for Steam Locomotive). It is said that this is particularly observed for sequences
where chroma PSNR is high. Typical loss in chroma is around 6%.

Loss of 8 bit relative to 10 bit for HE is bigger in chroma (10% for RA & LB, including Class F) than
luma (2.2% for RA & LB), and is focused in particular sequences (removing one sequence drops the
chroma average gain to about 6%).
Class F is included in the results
Excluding class F, there is more gain, as there is essentially no observed gain on class F.

It is mentioned in the discussion that 10 vs. 8 bit most likely would be a profiling issue. Several experts
anticipate that an 8-bit 4:2:0 profile with high coding efficiency will be defined
(confirmed by Broadcom, Docomo, Cisco and TI).
Would a 10-bit (or higher) profile be in the first version of the standard? (Several experts say that this
could be deferred to a later version, or at least is not a high priority.)
For common test conditions (that will need to be re-defined as there is only one entropy coder now):
• Only 8-bit test settings in common conditions
• The capability of higher bit depth should be retained in the software and the spec
• Keep a mode to test with higher bit depth to verify that any tools are in principle extensible (10?
12?)
• Also include parameter settings for tool combinations that are not currently checked in common
conditions
• The latter two points could be done for a largely reduced test set (as it is not about compression
performance but rather sanity check)
Suggestion to define one test point with ALF on and one off. Some experts argue that SAO should also be
on/off at these points; this was agreed. It would also be beneficial to test interpolation filters with SAO off.
Decision (SW): It was agreed that we need a high coding efficiency profile that has only 8 bit decoding
capability. And we should have a set of common test conditions that corresponds to that.
It was commented that we may not define a profile with greater than 8 bit capability in the
first version of the standard (e.g. so that we could later define a single profile that covers both greater than
8 bit capability and higher-resolution chroma formats).
It was commented that the current 10 bit sequences are rather noisy, so if we don't have 10 bit encoding in
the common conditions, we may no longer need those sequences.

We definitely want to retain higher bit depth capability in the design and software. So we should still
include higher bit depth capability in the common conditions.
This was agreed.
It was suggested to develop a supplemental set of tests for other aspects as well that differ from the main
common conditions (e.g. in QP and CU size as well as bit depth). This idea was supported.
It was suggested that perhaps our LC common conditions should no longer include SAO.

It was suggested that the MC interpolation filter should perhaps be different if SAO is not going to be
used.
BoG (Frank Bossen) to discuss and propose future test conditions. Also discuss whether Steam
Locomotive and Nebuta (10 bits) will still be used, or stripped down to 8 bit.
Discussion on Nov. 29:
Desire to create a set of alternate configurations for the purpose of testing robustness and to make sure that
new integrations do not break other tools typically not used in common test conditions. It was agreed to add
a mandate to AHG3 to do this.
Suggestions for differences between HE and LC configs:
• ALF
• SAO
• NSQT
• AMP
• 32x32 LCU vs 64x64 LCU
• 32x32 T vs 16x16 T
There was no consensus on whether to use 1 or 2 configurations.

JCTVC-G136 Suggestion on picture quality hierarchy for Low Delay configurations [S.
Liu, X. Zhang, S. Lei (MediaTek)]
This contribution proposes a modification to the picture quality hierarchy for Low Delay settings in the
current HM. It is proposed to replace the current multi-level hierarchical picture quality structure by a
two-level scheme. Average 0.2-0.3% BD-rate reduction is reported for Luma and average 1.8% BD-rate
reduction is reported for Chroma. No impact on encoding or decoding time is reported.
Would this affect the visual quality (as quality fluctuations are at lower frame rate)?
Gain is relatively small. Subjective characteristics were discussed.
Changes may tend to make historical comparisons more tricky – gain seems not so large as to justify a
change at this time.

JCTVC-G634 Cross check of MediaTek's G136: Suggestion on picture quality hierarchy
for Low Delay configurations [E. S. Ryu, Y. Ye (InterDigital)] [late]

JCTVC-G150 Proposed Error Pattern Files for JCT-VC [S. Wenger (Vidyo)]
Adopted as the current preferred method for testing robustness characteristics and proposals.

JCTVC-G399 Comparison of Compression Performance of HEVC Working Draft 4 with
AVC High Profile [B. Li (USTC), G. J. Sullivan, J. Xu (Microsoft)]
TBA.
Uses “combined PSNR” (weighted superposition of luma and chroma) – see also G401
Better AVC anchor was used than in CfP (8% better for RA-HE)
Average gains are 39% for RA, 44% for LD, 25% for AI (all for HE)

It was noted that subjective quality is our primary goal, and this contribution does not test subjective
quality. The combined PSNR metric was noted and is further discussed in G401.

JCTVC-G678 A tool for rate-constrained performance test [S. Campbell, J. Wang, X. Yu
(RIM)]
This document describes a set of tools that can be used to run the encoder multiple times to obtain a given
set of bit rates. This is achieved by adding a new set of parameters, the Lambda-factors, to the encoder.
These Lambda-factors provide control over the encoded bit rates. The shell script, targetBitrates.sh, will
run the encoder many times while adjusting the Lambda-factors until the desired bitrates are obtained.
The tool controls the bit rate allocation for each temporal layer.
The tool was used in CE4 subtest 2.b.
The actual files that make up this tool were not included in the contribution, but will be added in a new version.
Some change to the reference software was needed to work with this.
It was asked whether this should become part of our reference software. This was supported and agreed.
(The proponents offered to provide it with the regular licensing header.)
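The iterative adjustment described for targetBitrates.sh can be pictured with the following minimal sketch. It is not the contributed script: the encoder run is replaced by a made-up monotonic model (`toy_encode`), a single Lambda-factor is searched instead of one per temporal layer, and the search strategy (bisection in the log domain, assuming that a larger factor lowers the bit rate) is an assumption.

```python
import math


def toy_encode(lambda_factor: float) -> float:
    """Stand-in for one encoder run; returns a bit rate in kbit/s (made-up model)."""
    return 4000.0 / (1.0 + 3.0 * lambda_factor)


def find_lambda_factor(encode, target_kbps, lo=0.01, hi=100.0, tol=0.005, max_runs=40):
    """Search for a Lambda-factor whose bit rate is within `tol` (relative) of the target.

    Assumes the bit rate decreases monotonically as the factor grows.
    Returns (factor, rate) of the last run.
    """
    mid, rate = lo, encode(lo)
    for _ in range(max_runs):
        mid = math.sqrt(lo * hi)          # geometric midpoint of the search interval
        rate = encode(mid)                # one (expensive) encoder run
        if abs(rate - target_kbps) <= tol * target_kbps:
            break
        if rate > target_kbps:
            lo = mid                      # too many bits: weight rate more heavily
        else:
            hi = mid                      # too few bits: weight rate less heavily
    return mid, rate


if __name__ == "__main__":
    factor, rate = find_lambda_factor(toy_encode, 1500.0)
    print(f"factor={factor:.4f} rate={rate:.1f} kbit/s")
```

Each loop iteration corresponds to one full encoder run, which is why a tool of this kind runs the encoder many times per target rate point.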

JCTVC-G855 Performance evaluation of full search mode decision for Intra of HM4.0 [C.
Lai, Y. Lin, L. Liu, J. Zheng (HiSilicon)] [late]
Late information document (not presented in detail, no action expected) – available for study.

JCTVC-G910 Non-CE6: Cross-check of HiSilicon’s performance evaluation of full search
mode decision for intra of HM4.0 (JCTVC-G855) by Panasonic [S. M. T. Naing, V.
Wahadaniah, C. S. Lim (Panasonic)] [late]

5.3 Source video test material

JCTVC-G732 Study on test materials in common test condition [T. Suzuki (Sony)] [late]
TBA
In JCTVC-E011, a problem with the class E test materials in the common test conditions was reported. The
contribution investigates why this happens and proposes to reconsider the class E test materials and
replace them.
It discusses problems detected in the class E sequences. It was reported that the material seems likely to
have been produced by an interlaced camera, with compression and de-interlacing.

It was reported that the pixel values change from frame to frame, even in still areas. This phenomenon
could affect the evaluation of video coding tools. For example, tools that exploit this line-by-line
change could improve coding efficiency in the current design.
In discussion, it was indicated by the sequence contributor that it was captured by a Sony camera that is
switchable between 1080i and 720p.
The company (Vidyo) that contributed the Class E sequences indicated that they should be able to provide
new material with similar scene content. The group indicated that such a contribution is requested and
would be appreciated. This can be collected and made available in an AHG activity in advance of the next
meeting (chair T. Suzuki).

5.4 Functionalities

5.4.1 Scalable coding

JCTVC-G078 Information for HEVC scalability extension [J. Boyce, D. Hong, W. Jang, A.
Abbas (Vidyo)]
For information to JCT-VC. Out of scope of current phase of work.

JCTVC-G248 Low Complexity scalable extension of HEVC intra pictures [S. Lasserre, F.
Le Léannec, E. Nassor (Canon)]
This contribution presents a new approach for scalable extension of HEVC INTRA pictures. This scalable
INTRA codec design targets coding efficiency together with very low complexity. Spatial random access
and a high degree of parallelism are two additional targeted features. The proposed scalable INTRA
codec employs only one coding mode, which is inter-layer intra prediction, which provides low
complexity. Coding efficiency is obtained through statistical modeling of DCT channels to encode, rate
distortion optimal quantifiers that are pre-computed off-line, coupled with a distortion allocation process
between DCT channels. Overall, bit rate increase of 12.8% is obtained relative to HEVC single layer
coding on tested sequences in dyadic spatial scalability mode. Finally, non-contextual, non-adaptive
entropy coding provides the spatial random access feature.
Was presented Tue. 29th afternoon in track A.
For information to JCT-VC. Out of scope of current phase of work.

JCTVC-G948 Draft requirements and discussion on the scalable enhancement of HEVC
[A. Luthra] [late]
For information to JCT-VC. Out of scope of current phase of work.

JCTVC-G949 Draft requirements for the scalable enhancement of HEVC [A. Luthra]
[late]
For information to JCT-VC. Out of scope of current phase of work.

JCTVC-G950 Draft use cases for the scalable enhancement of HEVC [A. Luthra] [late]
For information to JCT-VC. Out of scope of current phase of work.

JCTVC-G951 Draft Call for Proposals on the Scalable Video Coding Extensions of HEVC
[A. Luthra (Motorola)] [late]
For information to JCT-VC. Out of scope of current phase of work.

5.4.2 Stereo/Multi-view

JCTVC-G582 Multiview HEVC – experimental results [M. Domanski, T. Grajek, D.
Karwowski, K. Klimaszewski, J. Konieczny, M. Kurz, A. Luczak, R.
Ratajczak, J. Siast, O. Stankiewicz, J. Stankowski, K. Wegner]
This document presents an approach to providing multiview compression capability in HEVC in a similar
way to AVC Annex H. Results for an experimental implementation of an HEVC-based multiview codec and
the prospective performance of multiview prediction in HEVC-based multiview codecs are described. The codec
has been implemented using the HEVC reference software (HM 3.0), by applying a compression scheme
similar to Multiview Video Coding technology (MVC). The coding efficiency of the HEVC-based multiview
coder was evaluated and compared to the efficiency of simulcast HEVC. Performance of the proposed
encoder was tested with the sequences provided in the Call for Proposals (CfP) on 3D Video Coding (3DVC).
The average compression gain of using multiview prediction in the video encoder is 22.7% in the 2-view case
and 30.5% in the 3-view case, relative to the simulcast scenario.
For information – contribution noted.
For information to JCT-VC. Out of scope of current phase of work.

5.4.3 Interlace

JCTVC-G170 On issues for interlaced format support in HEVC standard [K. Chono, H.
Aoki (NEC)]

JCTVC-G196 Modification of derivation process of motion vector information for interlace
format [J. Koyama, A. Yamori, K. Kazui, S. Shimada, A. Nakagawa
(Fujitsu)]

JCTVC-G296 Picture-adaptive Field/Frame Coding: support for legacy video [O. Bar-Nir
(Harmonic)]

JCTVC-G450 High level syntax to support interlace format [K. Sugimoto, A. Minezawa, S.
Sekiguchi (Mitsubishi)]

JCTVC-G667 HEVC field coded sequences vs. deinterlaced progressive coding [C. Fogg]
[late] [miss]

JCTVC-G877 Interlaced and 4:2:2 color format support in HEVC standard [J. Vieron]
[late]

JCTVC-G912 On interlaced format [K. Chono, H. Aoki (NEC)] [late]


This contribution was an eleven-company submission. It suggested specifying some form of interlaced
video support for HEVC, and suggested creating a BoG to create "starting point text" toward that end.

JCTVC-G962 Interlace profiling in HEVC [D. Singer] [late]


This contribution emphasized the desire to avoid potential needs to include interlace-handling display
adaptation in decoding systems, and requested to retain HEVC as progressive only with no mention of
interlace or fields, or, as a less preferred alternative, to put any such aspects into a separate profile for
that purpose.

JCTVC-G967 4:2:2 support in HEVC [K. Sugimoto, A. Minezawa, S. Sekiguchi
(Mitsubishi)] [late]

General

Discussion:
• Will interlaced displays still exist in the future?
• Interlaced content still exists and continues to be produced (interlaced cameras)
• Could this be de-interlaced?
• If done, it should be simple
• De-interlacing at the encoder end would double the necessary throughput
• Can this be solved with an SEI message like approach? SEI would not allow frame/field
adaptivity, as it is not possible to have two different picture sizes in one sequence.
• Anything that involves a mode decision is undesirable. Picture adaptive frame/field is
undesirable.
• Would it be useful to invoke an interlace SEI message per profile/level? Some experts think this
is desirable. (H.263 has such an option for another SEI message)
• Work on specifying candidate text for an SEI message on field coding in an AHG.

Noting that there are basically no interlaced displays anymore (and the availability of source deinterlacing
technology), it was asked why we would bother with this.
Legacy content and legacy camera equipment were cited as a rationale, and it was noted that new cameras
are still being manufactured that generate new such content. It was suggested that the fact that prior
standards include interlace-oriented features might create a need for persistence of support of prior
technologies in addition to support of HEVC.
Regarding deinterlacing at the source, a doubling of codec throughput requirements was suggested to be
unacceptable.
As a reference example, H.263 Annex W (subclause W.6.3.11) was mentioned.
It seems more than likely that if we do not do something, at least as an SEI message, others will do so in a
less interoperable fashion. The associated indicator (as shown by H.263 Annex W) seems simple to
specify.
Suggestion: "Have an enable flag at the VUI level with a top/bottom flag SEI on each picture".
Is it possible to profile an SEI message? It was noted that H.263 Annex X has such a thing.
Plan: Establish AHG with mandate to develop candidate text for an SEI message signaling an interlaced
format indicator. Suggestion quoted above as a starting point for that work.

5.5 Loop Filtering

5.5.1 Deblocking filter


Pre-Review took place in a BoG (JCTVC-G1005). Decisions were made in the track A sessions. The
proposals are sorted into categories (although some proposals can belong to more than 1 category).
1. Parallel deblocking
2. Modifications to deblocking filter description
3. Modifications to Bs calculation process
4. Line buffer reduction
5. Deblocking filter simplifications
6. Signaling deblocking filter parameters in slice header
7. Varying QP deblocking and deblocking for IPCM blocks

1. Parallel deblocking

JCTVC-G089 Non-CE12: Enable low-complexity and fully parallelized de-blocking filter
with minor changes [M. Zhou, O. Sezer, V. Sze (TI)]
This contribution advocates the following two changes to the current HM4.0 de-blocking filter design: 1)
compute the on/off decision on 4 lines/columns segment instead of 8 lines/columns segment, and 2)
remove the parallel on/off filter decision to speed up vertical edge filtering. The proposed changes lead to
a low-complexity, fully parallelized de-blocking filter design with enhanced architecture flexibility.
Compared to the current HM4.0 design, the buffer size is reduced from 12x12 to 8x8, the number of
memory accesses is reduced from 8x12 to 8x8, and the de-blocking filter is fully independent and
parallelized from segment to segment. Experimental results showed that the proposed algorithm led to a
0.1% BD-rate reduction in LB-HE and LB-LC, and 0.0% BD-rate difference in other configurations when
compared to HM4.0. Subjective tests showed that the proposed algorithm led to a visible quality gain in the
BQMall (RA-HE, QP=37) sequence, and no visual difference in the other CE12 selected subjective testing
sequences.
Results: 0.0/0.0, 0.0/0.0, -0.1/-0.1 for HE-AI/LC-AI, HE-RA / LC-RA, HE-LD / LC-LD.
Similar to G590 in spirit (however, does not change the strong filtering).
Recommendation: Depends on adoption of G590. If G590 is adopted, this would not be further
investigated; otherwise possibly CE.

JCTVC-G624 Non-CE12: Cross-verification of parallel deblocking proposal JCTVC-G089
from TI [Andrey Norkin (Ericsson)] [late]

JCTVC-G171 Parallel deblocking filter [J. Yu, S. Yang, J. Byun, Y. Kim, J. Kim (Yonsei
Univ.)]
A parallel deblocking filter for HEVC is proposed. The proposed technique includes the parallelization of
the filtering process and decisions. Our proposed algorithm is based on Panasonic’s parallel deblocking
filter decisions, which were first presented in JCTVC-D214. For the deblocking of a current coding unit,
all required decisions and filtering are performed on the basis of the unfiltered pixels of the current coding
unit. This method can eliminate the dependencies present in the current coding unit and between
neighboring coding units. Therefore, the proposed deblocking filter can be implemented in parallel
processing structures, making it applicable to variable structures. Experiments following the common
conditions show that the BD-bit rate reduction is -0.1% in the case of All Intra high efficiency and 0.1% in
the case of Random Access HE and Low delay HE. The BD-bit rate stays approximately unchanged for
the luminance signal and chrominance signal. This technique can be adopted in the next version of the
HM.
Similar to G255 which was not considered
Slight increase in dec runtime.
No action.

2. Modifications to deblocking filter description

The proposals presented in this section do not modify HM4.0 behavior but modify the working draft text.

JCTVC-G175 BS decision tree simplification [S. Park, N. Park, B. Jeon (LGE), X. Guo, J.
An, C. Hsu, Y. Huang, S. Lei (MediaTek)]
This contribution reports the simplification of BS (boundary strength) decision tree in the deblocking
filter. The proposed simplification removes redundant BS values with related conditions. The number of
BS values is changed from 5 to 3 and this modification provides the same BD rate and visual quality as
HM4.0.
The check for the CU boundary (Bs = 4) is removed. The proposed BS values are 0, 1, 2.
Recommendation: Makes sense, is related to contributions G620 part 1 and G638, parts 1 and 2.
Subjective test: not needed, identical to HM4.0

JCTVC-G959 Cross check of Non-CE12: BS Decision Tree Simplification - G175 [A.
Kotra, M. Narroschke, T. Wedi (Panasonic)] [late]

JCTVC-G620 Clean-up of deblocking filter description [A. Norkin (Ericsson)] [late]


The document proposes certain simplifications to the deblocking filter description and HM 4.0 reference
software. In particular, it is proposed to remove clipping operations from strong filtering and to reduce the
number of Bs values.
Part 1: Removing clipping operations from the deblocking strong filter. Clipping is unnecessary since the
output values cannot get outside the input range. (6 different places)
e.g. replace p1' = Clip1Y( ( p2 + p1 + p0 + q0 + 2 ) >> 2 ) with p1' = ( p2 + p1 + p0 + q0 + 2 ) >> 2.
Editorial change & cleanup of software without changing operation. Decision: Adopt; Check with
software coordinators and WD editors.
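The claim in Part 1 that the clipping is redundant can be checked with a small sketch for the p1' example quoted above. The helper names are hypothetical; `clip1` models Clip1Y as clipping to [0, 2^bitDepth - 1], and the exhaustive check uses a reduced bit depth of 4 to keep the run short (an assumption made only for the demonstration).

```python
from itertools import product


def clip1(x: int, bit_depth: int) -> int:
    """Clip a sample value to the range [0, 2**bit_depth - 1]."""
    return max(0, min((1 << bit_depth) - 1, x))


def p1_filtered(p2: int, p1: int, p0: int, q0: int) -> int:
    """Strong-filter output for p1 without clipping, as quoted in the notes."""
    return (p2 + p1 + p0 + q0 + 2) >> 2


def clip_is_redundant(bit_depth: int) -> bool:
    """Exhaustively verify that clipping never changes the filter output."""
    values = range(1 << bit_depth)
    return all(
        clip1(p1_filtered(*s), bit_depth) == p1_filtered(*s)
        for s in product(values, repeat=4)
    )


if __name__ == "__main__":
    print(clip_is_redundant(4))               # exhaustive check over 16**4 inputs
    print(p1_filtered(255, 255, 255, 255))    # largest 8-bit input stays at 255
```

The same reasoning holds for any bit depth: the taps are non-negative and sum to 4, so the rounded result of the 2-bit right shift cannot fall below the smallest input or exceed the largest one.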

Part 2: Check for the CU boundary (Bs = 4) is removed. Decreasing the number of Bs values to 0, 1, 2.
This part of the proposal is identical to G175.

Subjective test: Not needed, proposal is identical to HM4.0

JCTVC-G638 Deblocking boundary strength and filtering process simplifications [A.
Kotra, M. Narroschke, T. Wedi (Panasonic)]
This contribution presents three simplifications in boundary strength (bS) derivation and filtering process
for deblocking in HM-4.0. The first simplification removes the “coding unit edge” check for Intra blocks
in bS derivation. In the second simplification, the output boundary strength values of the boundary
strength derivation are adjusted such that the bS value can be directly used in tc (threshold) value
derivation. Therefore, the extra tcoffset (derived based on bS) used in HM-4.0 for tc derivation is not
necessary anymore. The third simplification performs the calculation of thresholds tc and ß only when
bS>0. These simplifications reduce the number of instructions in the deblocking operations while
complying with current HM-4.0.
Part 1. Remove Bs = 4 and the check for the CU boundary. Same as in G620 and G175.
Part 2. Set Bs values to 0, 1 and 3. Remove tc_offset from the WD text and replace
tc_table[Clip3(0, 55, qPL + tc_offset)] with tc_table[Clip3(0, 55, qPL + bS-1)]. One participant supports this
proposal, one is against.
Part 3. If Bs = 0, do not calculate Tc and Beta. This change is suggested for the working draft
text. Two participants support the proposal. Recommend adopting Part 3 to WD.
Subjective test: not needed, results are identical to HM4.0
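The Part 2 index substitution can be illustrated as follows. This is a sketch under stated assumptions, not the HM code: it assumes that tc_offset is 2 for the intra-derived Bs values (above 2) and 0 otherwise, and that the HM-4.0 values 1 and 2 collapse to 1 while 3 and 4 collapse to 3. The function names are hypothetical, and only the table index is modelled, since the tc_table contents are not given here.

```python
def clip3(lo, hi, x):
    """Clamp x to the inclusive range [lo, hi]."""
    return max(lo, min(hi, x))


def tc_index_hm4(qp_l, bs):
    """Table index with a separate offset (assumed: 2 when Bs > 2, else 0)."""
    tc_offset = 2 if bs > 2 else 0
    return clip3(0, 55, qp_l + tc_offset)


def tc_index_g638(qp_l, bs):
    """Table index with the offset folded into Bs, for Bs restricted to 1 or 3."""
    return clip3(0, 55, qp_l + bs - 1)


def remap_bs(bs_hm4):
    """Hypothetical mapping of HM-4.0 Bs values 0..4 onto the set {0, 1, 3}."""
    if bs_hm4 == 0:
        return 0
    return 3 if bs_hm4 > 2 else 1
```

Under these assumptions both index functions agree for every filtered edge, which is consistent with the reported "identical to HM4.0" result. For Bs = 0 no index is needed at all, which is the point of Part 3.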

Recommendations on Bs calculation (G175, G620, G638):

Suggestion: Remove the check for the CU boundary (Bs = 4) and have 3 values for Bs.
Possibilities:
1. Bs = 0, 1, 2 (G175, G620) or
2. Bs = 0, 1, 3 and removing tc_offset (G638).
(Note: tc_offset can have values of 0 and 2 and is redundant with Bs as it is derived from that)
Revisit: Proponents of G175, G620, G638 come with a unified solution which is understandable.
Crosscheck G385 (See under G639)

JCTVC-G1035 BoG report on resolving deblocking filter description [A. Norkin et al.]

[include header 1035]

Suggested changes on cleanup OK

Removing 4x4 block boundary OK as it is not likely that any proposal would be adopted that uses them
Signalling of de-block parameters (beta, tc_offset, on/off) both in APS and slice header – slice can inherit
from APS or use own params
Check with APS experts whether the design of the APS params is appropriate
Cleanup of source code: Plan exists, but can only be done after meeting
Decision: Adopt (subject to checking of WD text and software by editor/coordinator)
3. Modifications to Bs calculation process

JCTVC-G176 Simplified BS calculation process [S. Park, B. Jeon (LGE)]
This contribution reports that there is an inconsistency in the minimum unit size between BS (boundary
strength) calculation and boundary filtering in the current HEVC, which makes a redundant BS
calculation and additional BS unification process. The proposed method changes the minimum unit size
of BS calculation unit from 4x4 to 8x8 and removes the additional BS unification process. This
contribution claims that there is no subjective and objective performance degradation (avg. 0% loss) with
a reduced complexity compared with HM4.0.

Currently in HM4.0, the Bs decision process for an 8-pel edge consists of deriving a Bs for each 4-pel
edge and then taking the maximum of the two 4-pel Bs values, because the minimum units of Bs calculation
and filtering differ.
The proposal suggests using the Bs of the first 4-pixel part as the Bs for the whole 8-pel edge, to reduce
Bs decision operations and remove the Bs merging process. The
average BD-rate change is 0.0% for all configurations.
Results: 0.0/0.0, 0.0/0.0, 0.0/0.0 for HE-AI/LC-AI, HE-RA / LC-RA, HE-LD / LC-LD.
One participant supports the proposal.
Recommendation: CE
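The difference between the HM4.0 behaviour described above and the G176 proposal can be sketched as follows. The function names are hypothetical and the Bs range 0..4 is taken from the HM-4.0 values mentioned in this section.

```python
def bs_8pel_hm4(bs_first, bs_second):
    """HM4.0 as described above: compute both 4-pel Bs values and keep the maximum."""
    return max(bs_first, bs_second)


def bs_8pel_g176(bs_first):
    """G176 as described above: the first 4-pel part decides for the whole 8-pel edge."""
    return bs_first


def differing_pairs(bs_values=range(5)):
    """Count (first, second) Bs pairs for which the two rules give different results."""
    return sum(
        1
        for a in bs_values
        for b in bs_values
        if bs_8pel_hm4(a, b) != bs_8pel_g176(a)
    )
```

The two rules differ only when the second 4-pel part has the larger Bs, so the count says nothing about how often that occurs in real content; the reported 0.0% BD-rate change suggests it has little effect on the common test set.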

JCTVC-G675 Non-CE12: Crosscheck for LGE's Simplified Bs Calculation in JCTVC-G176
[J. An, X. Guo (MediaTek)]

JCTVC-G616 On parallel deblocking [A. Norkin (Ericsson)]


The proposal discusses parallel deblocking. The proposed solution is based on a parallel deblocking
proposal G089 from Texas Instruments and includes some modifications to it.
The proposal suggests using separate Bs for two 4-pixel parts of the 8-pixel block boundary in order to
align Bs calculation with transform size. The other modification is to use lines 0 and 4 for deblocking
filtering decision.
The proposal also addresses implementation of parallel deblocking in case of deblocking in CU order.
The average BD-rate gains are between 0.0% (AI) and 0.2% (LDB HE).
Results: 0.0/0.0, 0.0/0.0, -0.2/-0.1 for HE-AI / LC-AI, HE-RA / LC-RA, HE-LD / LC-LD.
Recommendation: CE depending on what happens to G590/G089

JCTVC-G973 Non-CE12: Crosscheck for Ericsson's Parallel Deblocking in JCTVC-G616
[J. An, X. Guo (MediaTek)] [late]

4. Line buffer reduction

JCTVC-G230 Non-CE12.3: Reducing pixel line buffers by modifying DF to R3W2 for
horizontal LCU edges [C.-W. Hsu, Y.-W. Huang, S. Lei (MediaTek), M. Ikeda, T. Suzuki
(Sony), S. Park, B. Jeon (LG)]
In this contribution, modified vertical filtering that reads 3 pixels and writes 2 pixels (R3W2) for the
deblocking filter (DF) is proposed for horizontal LCU boundaries to reduce pixel line buffers required by
DF, while the DF in HM-4.0 is still used for the remaining edges. The proposed method can save one pixel line
buffer for DF compared with the current HM-4.0. It uses fewer pre-DF pixels above the horizontal LCU
boundary for the filtering decisions, and filtering operations are also modified. Simulation results
reportedly show 0% bit rate increase with unchanged run time and similar subjective quality in
comparison with HM-4.0.
Modification points: in strong filtering decision use p2 instead of p3. Strong filter is modified. Only the
upper side of deblocking filter is modified, the bottom side is the same as in HM4.0.
One proponent suggested discussing trade-off between having one more loop filter and removing one line
from line buffer after the subjective test.
Results: 0.0/0.0, 0.0/0.0, 0.0/0.0 for HE-AI / LC-AI, HE-RA / LC-RA, HE-LD / LC-LD.
Recommendations: Some interest, but proponents expressed they may not be willing to join CE

JCTVC-G844 Cross-verification of MediaTek and Sony's reducing pixel line buffers by
modifying DF to R3W2 for horizontal LCU edges (JCTVC-G230) [T. Lee, J.
Chen, J. H. Park (Samsung)]

5. Deblocking filter simplifications

JCTVC-G290 Deblocking Filter Simplifications [G. Van der Auwera, M. Karczewicz
(Qualcomm)]
This contribution addresses the simplification of the deblocking filter. It is proposed to modify the weak
filter delta and the strong/weak filter decision. Modifying the weak filter delta results in BD-rate savings
of -0.2% for all-intra, -0.1% for random access, and -0.2% for low delay B. The number of strong/weak
filter decisions for an edge segment is reduced from eight to two.
Part 1. It is proposed to modify the weak filter delta. Modifying the weak filter delta results in BD-rate
savings of -0.2% for AI, -0.1% for RA, and -0.2% for LD B. The delta calculation for the first pixel from
the block boundary is modified from (9*(q0-p0)-3*(q1-p1))/16 to (3*(q0-p0)-(q1-p1))/8, thereby
reducing the number of operations and resulting in weaker deblocking. E.g., in the absence of clipping, the
blocking artifact 0 0 0 | 16 16 16 will be smoothed by the proposed deblocking to 0 2 4 | 12 14 16,
whereas current HM4.0 deblocking results in 0 3 6 | 10 13 16.
Results: -0.2/-0.2, -0.1/-0.1, -0.2/-0.1 for HE-AI / LC-AI, HE-RA / LC-RA, HE-LD / LC-LD.
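The worked example in Part 1 can be reproduced with a few lines. This is a sketch, not the HM implementation: it assumes that the second sample from the edge is moved by half the delta (which matches the figures quoted above), it omits the tc clipping and any rounding offsets, and the function names are hypothetical.

```python
def delta_hm4(p1, p0, q0, q1):
    """Weak filter delta as quoted for HM4.0: (9*(q0-p0) - 3*(q1-p1)) / 16."""
    return (9 * (q0 - p0) - 3 * (q1 - p1)) // 16


def delta_g290(p1, p0, q0, q1):
    """Weak filter delta as proposed in G290: (3*(q0-p0) - (q1-p1)) / 8."""
    return (3 * (q0 - p0) - (q1 - p1)) // 8


def weak_filter(samples, delta_fn):
    """Apply a weak filter to six samples p2 p1 p0 | q0 q1 q2 without clipping."""
    p2, p1, p0, q0, q1, q2 = samples
    delta = delta_fn(p1, p0, q0, q1)
    half = delta // 2  # assumption: second sample from the edge moves by delta/2
    return [p2, p1 + half, p0 + delta, q0 - delta, q1 - half, q2]


edge = [0, 0, 0, 16, 16, 16]
hm4_result = weak_filter(edge, delta_hm4)
g290_result = weak_filter(edge, delta_g290)
```

For this step edge the HM4.0 delta is 6 and the proposed delta is 4, which is why the proposed filter leaves a larger residual step across the boundary.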

Part 2. It is proposed to modify strong/weak filter decision by performing the decision for the 4-pel edge
based on one line. The number of strong/weak filter decisions for an edge segment is reduced from eight
to two.
It was claimed by one participant that complexity in hardware does not decrease compared with testing two
lines as in G590. The complexity in software probably decreases.
Results: 0.0/0.0, 0.1/0.0, 0.1/0.1 for HE-AI / LC-AI, HE-RA / LC-RA, HE-LD / LC-LD.

Recommendations: CE

JCTVC-G929 Cross check of Qualcomm’s deblocking filter simplifications (JCTVC-G290)
[A. Kotra, M. Narroschke, T. Wedi (Panasonic)] [late]

JCTVC-G090 Non-CE12: Testing results on using HM3.0 delta calculation for luma weak
filter [M. Zhou, O.Sezer, V. Sze (TI)]
This contribution proposes to modify the delta calculation for the luma weak filter to one that has higher
precision than that of HM4.0. Simulation results revealed that the proposed modification improved
coding efficiency by 0.2% in AI-HE and AI-LC, and 0.1% in RA-HE, RA-LC, LB-HE and LB-LC. The
proposed change reportedly led to visible visual gain in BQMall (RA-HE, QP=37) sequence and no visual
difference in the other CE12 selected subjective testing sequences.
Results: -0.2/-0.2, -0.1/-0.1, -0.1/-0.1 for HE-AI / LC-AI, HE-RA / LC-RA, HE-LD / LC-LD.
The proponent does not think it is necessary to participate in subjective viewing (since the proposal is
similar to G290). No action? Only one of G290 or G090 in CE.

JCTVC-G162 Non-CE12: cross-verification of JCTVC-G090 on testing results on using
HM3.0 delta calculation for luma weak filter by TI [Y. Jeon, S. Park, B. Jeon
(LG)]

JCTVC-G639 Deblocking simplification and rounding optimization [A. Kotra, M.
Narroschke, T. Wedi (Panasonic)]
Part 1: Replace deblocking for the second pixel from the block boundary by smoothing operation that
does not take into account pixels on the other side of the block boundary. The motivation is simplification
of the delta calculation. The BD-rate results are: AI HE -0.3%, AI LC -0.1%, a small BD-rate increase (0.1%)
on LDP configurations, and 0.0% for the other configurations.
Results: -0.3/-0.1, 0.0/0.0, 0.0/0.0 for HE-AI / LC-AI, HE-RA / LC-RA, HE-LD / LC-LD.
One participant liked the idea, one participant was concerned.
One participant stated that it is not a real simplification in terms of number of operations; however, the
contributors stated that it removes various dependencies.
Part 2: Modification to deblocking rounding control. It is claimed that the current rounding in deblocking
is not completely symmetric (according to collected statistics) and it is proposed to introduce another
clipping function Clip4() with two adjustment parameters. These adjustment parameters were determined
based on statistics of the current test set. The objective gain is -0.1 on LC configurations and 0.0 on HE
configurations.
Results: 0.0/-0.1, 0.0/-0.1, 0.0/-0.1 for HE-AI / LC-AI, HE-RA / LC-RA, HE-LD / LC-LD.
Recommendation: CE (parts 1 and 2 separate)

JCTVC-G385 Cross check of Panasonic's JCTVC-G638 and JCTVC-G639 on Deblocking
[G. Van der Auwera (Qualcomm)] [late]

6. Signaling deblocking filter parameters in slice header

JCTVC-G291 Transform Dependent Deblocking Filter Parameter Adjustment in Slice
Level [G. Van der Auwera, M. Karczewicz (Qualcomm)]
This contribution proposes to adjust the Tc and Beta parameters of the deblocking filter by enabling
signaling of control data in the slice header. It is proposed to adjust Tc depending on the transform size
and on intra or inter type of the blocks. The deblocking_filter_control_present_flag is proposed in the
SPS to control the presence of the deblocking filter adjustment parameters in the slice header. There are
BD-rate gains of -0.3% for all-intra HE and -0.1% for random access HE test conditions, other
configurations similar to the anchor. The proponent advocates importance of these parameters for
subjective visual quality.
Different tc offsets are sent for blocks of different size. The control parameters are signaled in the slice
header. Similar to proposal G174 but with sending additional parameters.
Results: -0.3/-0.1, -0.1/0.0, 0.1/0.1 for HE-AI / LC-AI, HE-RA / LC-RA, HE-LD / LC-LD.
Recommendation: participate in subjective viewing, revisit.

JCTVC-G982 Non-CE12 Subtest 5: Cross-check of Qualcomm's transform dependent
deblocking filter parameter adjustment in slice level JCTVC-G291 [T.
Yamakage, S. Asaka (Toshiba)] [late]

JCTVC-G619 Non-CE12: deblocking parameters signalling in slice header [A. Norkin, R.
Sjöberg, K. Andersson, J. Enhorn (Ericsson)]
The document is a proposal for signaling deblocking filter parameters in the slice header. It is proposed to
signal three deblocking parameters that control which pixels are being filtered and one clipping parameter
that determines the largest possible modification of the pixels in the weak filtering mode.
The proposal sends a flag and four parameters in the slice header:
slice_tc_offset_div2
slice_beta_offset
slice_side_beta_offset_div4
slice_nat_edge_offset_div2
An encoder that makes a rough frame-wise estimation of the parameters based on SSD has been implemented.
BD-rate results are: -0.1% on AI, -0.4% on LDB HE and -0.3% on the rest of standard configurations.
The proposal can possibly participate in subjective viewing.
Results: -0.1/-0.1, -0.3/-0.3, -0.4/-0.3 for HE-AI / LC-AI, HE-RA / LC-RA, HE-LD / LC-LD.
Recommendation: CE

JCTVC-G1010 Cross Check of Ericsson's Proposal JCTVC-G619 [G. Van der Auwera
(Qualcomm)] [late]

Recommendation for proposals on deblocking filtering parameters in CE12


Discuss what tests to perform and how to evaluate proposals. BoG (see under CE12.5)

7. Varying QP deblocking and deblocking for IPCM blocks

JCTVC-G384 Varying QP Deblocking [G. Van der Auwera, M. Karczewicz (Qualcomm)]


This contribution proposes changes to the deblocking filter to support the varying QP capability enabled
by delta-QP signaling. It is proposed to use the maximum of the QP values on both sides of the edge to
look up the Beta and tC threshold parameters. In case of weak filter decision for modifying one or two
samples, it is proposed to use the QP value of the respective edge side to look up a Beta parameter.
Related to proposal G640 part 1.
Recommendation: see below.

JCTVC-G987 Cross-check of Varying QP Deblocking (JCTVC-G384) Kenneth Andersson
[late]

JCTVC-G640 Deblocking bug fix for CU-Varying QP’s and IPCM blocks [A. Kotra, M.
Narroschke, T. Wedi (Panasonic)]
The contribution presents modifications to deblocking operations when CU-based multi-QP optimization
is enabled (Part 1) and IPCM blocks (Part 2).
Part 1:
Five modifications are proposed related to varying QP deblocking.
- Individual tc and β thresholds are derived for the blocks corresponding to an edge on which weak
filtering is performed.
- Separate tc values are used in deriving the delta offsets for innermost samples in weak filtering.
- Separate β values are used in the decision for filtering of outermost samples in weak filtering.
- It is proposed to use an average of the QPs for deriving tc and β values in the filter decision process
and strong filtering.
Experimental results are provided for a QP adaptation control that randomly assigns QP to coding units and
for CE4 testing conditions.
Part 2:
It is suggested to use a separate QP (QP_PCM) associated with IPCM blocks whenever deblocking
filtering is desired to be performed over IPCM blocks. The QP_PCM parameter is transmitted in the slice
header.
Recommendation: see below.

JCTVC-G1013 Cross Check of Panasonic's Proposal JCTVC-G640 On Varying QP
Deblocking [G. Van der Auwera (Qualcomm)] [late]

JCTVC-G138 Deblocking of IPCM Blocks Containing Reconstructed Samples [G. Van der
Auwera, X. Wang, M. Karczewicz (Qualcomm)]
This contribution proposes a modification to the HM4.0 deblocking loop filter in case of IPCM blocks
containing reconstructed samples. HM4.0 deblocking filter always assigns quantization parameter value
zero to the IPCM blocks, therefore deblocking filtering is disabled for the left and top edges of the IPCM
blocks.
The proposal is to assign a quantization parameter value to the IPCM block, which is predicted from the
neighboring quantization group.
Recommendation: see below.

JCTVC-G883 Cross-check report of Qualcomm’s Deblocking of IPCM Blocks Containing
Reconstructed Samples (JCTVC-G138) [K. Chono, H. Aoki (NEC)] [late]

JCTVC-G793 Non-CE12: Cross check of deblocking of IPCM blocks containing
reconstructed samples (JCTVC-G138) [S. M. T. Naing, V. Wahadaniah, C. S.
Lim (Panasonic)] [late]

Summary on JCTVC-G640, JCTVC-G384 and JCTVC-G138.


G384 uses the maximum of the neighbouring blocks' QPs, G640 part 1 uses the average of the QPs.
It is not possible currently to compare G384 and G640 subjectively or objectively since they are using
different anchors (different varying QP settings).
There are certain similarities between the two proposals (e.g. using separate QP values on the side of block
boundaries).
Conclusion on varying QP:
- Suggestion to use average of QPs on both sides of the edge (as in AVC) and align the weak filter
decision (modifying one or two samples) at each side of the edge.
- Develop proposed WD text on this and software in side activity (see JCTVC-G1031)

JCTVC-G1031 Support of varying QP in deblocking [G. Van der Auwera, X. Wang, M.
Karczewicz (Qualcomm), M. Narroschke, A. Kotra, T. Wedi (Panasonic)]
[late]
This contribution provides working draft changes and the corresponding software to support varying QP
capabilities in the deblocking filter, which were recommended during the discussions in Track A.
All changes are based on JCTVC-G640 and JCTVC-G384.
It was tested with the CE4 software (+/-12 QP variations).
Two changes are suggested:

1. Average QP from both sides of the block boundary (as in AVC)
2. In case of the luma weak filter, decisions on each side of the edge for modifying one or two
samples, the QP that is associated to the CU of the respective edge side is used to derive the value
β.
Decision: Adopt 1. (averaging). Average value would be used also for deriving same beta value at both
sides of the edge.
Note: Further study may be necessary on cases where one side of the edge is an IPCM block.
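The two QP derivations that were compared (maximum, as in G384, and average, as adopted) can be sketched as follows. The rounded average below follows the form used in AVC, (QPp + QPq + 1) >> 1; the function names are hypothetical.

```python
def edge_qp_average(qp_p, qp_q):
    """Rounded average of the QPs on both sides of the edge, as in AVC."""
    return (qp_p + qp_q + 1) >> 1


def edge_qp_maximum(qp_p, qp_q):
    """Maximum of the QPs on both sides of the edge, as proposed in G384."""
    return max(qp_p, qp_q)
```

With a large QP difference across an edge, for example 20 and 44, the average gives 32 while the maximum gives 44, so the maximum rule would select larger thresholds and hence stronger filtering on such edges.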

JCTVC-G1033 Crosscheck Report for JCTVC-G1031 by Qualcomm and Panasonic [S.
Park, B. Jeon (LG)] [late] [miss]
qq

G138 proposes deriving the QP for IPCM from neighbouring blocks, G640 proposes sending a QP in the slice
header for all IPCM blocks in the slice.
Conclusion on IPCM deblocking:
- Further study (AHG on loop filter or new one?)

5.5.2 Adaptive loop filter


Pre-Review of contributions on ALF, SAO and combined approaches took place in a BoG (JCTVC-
G1020). Decisions were made in the track A sessions.

JCTVC-G146 Non-CE8: Information on ALF coefficient re-designing process [J. Lim, S.
Park, B. Jeon (LG)]
This document presents an investigation report on complexity reduction for ALF. The purpose of the
investigation is to reduce the ALF complexity by reducing the number of loops in the CU-based ALF control
redesign process. The simulation result reportedly shows that the proposed method achieves 26%/7%/5%
encoding time decrease with 0.0%/0.2%/0.4% bit rate increase for all intra HE, random access HE and
low delay B HE configurations respectively.
(non-normative encoder optimization)
Investigation of number of re-design (3,2,1 and 0) vs. coding performance. If set to 1, 1 re-design will be
required for each depth which results in 6 times scan of frame memory at encoder. Comparison with
Qualcomm's 1/2-pass code that requires less frame memory scan. 2-pass requires 1 additional scan.
No action.

JCTVC-G339 Crosscheck of LG's ALF (JCTVC-G146) [T. Ikai (Sharp)] [late]

JCTVC-G919 Non-CE8: Cross-check of JCTVC-G146 [P. Lai, F. C. A. Fernandes
(Samsung)] [late]

JCTVC-G214 Non-CE8: Constrained ALF coefficients [C.-Y. Chen, C.-Y. Tsai, C.-M. Fu,
Y.-W. Huang, S. Lei (MediaTek)]
In HM-4.0, ALF coefficients are unconstrained, which makes it difficult to decide the bit width of multipliers in
hardware implementations. In this contribution, center coefficients are clipped within [0, 2), non-center
coefficients are constrained within [−1, 1), and offsets are also clipped within [−2D, 2D), where D is the
pixel bit depth. It is reported that the constrained coefficient ranges do not cause any coding efficiency
loss or run time change.
Limits the range of the integer part of ALF coefficients. Center: [0,2), other: [−1,1), DC: [−2D, 2D). To
specify multiplier's bit range. No need for syntax change. Add restriction on encoder/bitstream. No
detailed statistical analysis is available, but more than 99% of coefficients would not be hit by this
restriction. Reduces complexity/runtime.
Recommendation from BoG: adoption.
Decision (SW): Adopt
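The coefficient restriction can be illustrated with a small fixed-point sketch. It assumes, purely for illustration, that coefficients are stored as integers with 8 fractional bits; that precision, the function name and the parameter names are hypothetical and not taken from G214 or HM. The DC offset range is left out because it depends on the bit depth.

```python
FRAC_BITS = 8  # assumption: fixed-point coefficients with 8 fractional bits


def clamp_alf_coeffs(coeffs, center_index, frac_bits=FRAC_BITS):
    """Clamp fixed-point filter taps to [0, 2) for the center and [-1, 1) elsewhere."""
    one = 1 << frac_bits
    clamped = []
    for i, c in enumerate(coeffs):
        if i == center_index:
            lo, hi = 0, 2 * one - 1  # half-open range [0, 2)
        else:
            lo, hi = -one, one - 1   # half-open range [-1, 1)
        clamped.append(max(lo, min(hi, c)))
    return clamped
```

With these ranges, every non-center tap fits in a signed word of frac_bits + 1 bits and the center tap in an unsigned word of the same width, which is the kind of bound a hardware multiplier design needs.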

JCTVC-G853 Crosscheck of JCTVC-G214 on constrained ALF coefficients from
MediaTek [S. Park, B. Jeon (LG)] [late]

JCTVC-G215 Non-CE8: Limited number of filters per picture for ALF [C.-Y. Chen, C.-Y.
Tsai, C.-M. Fu, Y.-W. Huang, S. Lei (MediaTek)]
In ALF of HM-4.0, up to 16 filters per picture are used for luma, and 16 filters may be too many for low
resolution applications. In this contribution, a maximum number of filters per picture can be given on the
encoder side, and the ALF encoder will merge regions or classes accordingly. Simulation results
reportedly show that when the maximum number of filters per picture is 10, no coding efficiency loss is
observed. When the maximum number of filters per picture is six, 0.1% bit rate increase is observed in
LDP. Proper values for the maximum number of filters can be discussed when common test conditions
are defined. Allowed values can be decided when profiles and levels are defined.
Two options: Make it switchable by syntax element, or just define the number by profile/level. (It was
discussed elsewhere that there may be dependency of the required number of filters on picture size, but
nobody is sure about that.)
Recommendation of track A: Have the number of filters as a parameter in the encoder conf. file, with a
default of 16.

JCTVC-G147 Non-CE8: Cross-check of MediaTek’s limited number of ALF filter by LG
(JCTVC-G215) [J. Lim, S. Park, B. Jeon (LG)]

JCTVC-G216 Non-CE8: Removing the 15th merge flag for BA mode in ALF [C.-Y. Chen,
C.-Y. Tsai, C.-M. Fu, Y.-W. Huang, S. Lei (MediaTek)]
The block-based adaptation (BA) mode of ALF in HM-4.0 classifies blocks into 15 classes. Classes can
be merged, and one filter is used for each class after merging. For class merging, only 14 merge flags are
needed. Therefore, it is proposed to remove the 15th merge flag for BA mode, as a bug-fix.

In BA mode, 15 merge flags exist in HM4/WD4, but only 14 merge flags are necessary to merge 15 filters.
The same is reported in G610.
Decision: Recommend adoption (as a syntax bug fix).
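The counting argument behind the bug fix can be sketched as follows: with N classes in a fixed order, each class after the first needs one flag saying whether it shares the filter of the preceding class, so N - 1 flags suffice. The flag polarity (1 meaning "merge with the previous class") and the function name are assumptions made for this illustration, not the WD syntax semantics.

```python
def classes_to_filters(merge_flags, num_classes=15):
    """Map each class to a filter index from num_classes - 1 merge flags."""
    if len(merge_flags) != num_classes - 1:
        raise ValueError("expected exactly num_classes - 1 merge flags")
    filter_idx = [0]  # the first class always starts filter 0, so it needs no flag
    for flag in merge_flags:
        # assumption: flag == 1 merges the class with the previous one
        filter_idx.append(filter_idx[-1] if flag else filter_idx[-1] + 1)
    return filter_idx
```

All-ones flags yield a single filter for all 15 classes and all-zeros flags yield 15 separate filters, so every merge pattern of consecutive classes is reachable with 14 flags.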

JCTVC-G218 Non-CE8: One-stage non-deblocking loop filtering [C.-Y. Chen, C.-Y. Tsai,
C.-M. Fu, Y.-W. Huang, S. Lei (MediaTek), I. S. Chong, M. Karczewicz
(Qualcomm), T. Yamakage, T. Itoh, T. Watanabe, T. Chujoh (Toshiba)]
In this contribution, SAO and ALF are combined into one stage by dividing one picture into filter units
(FUs) and switching FUs among {ALF, SAO, OFF}, where FUs are LCU-aligned blocks coded in raster
scan in APS and the FU-based syntax is also friendly for low latency. It is reported that the proposed
method achieves 0.1%, 0.1%, 0.0%, and 0.0% bit rate reductions for HE-AI, HE-RA, HE-LDB, and HE-
LDP, respectively, with a picture-based 2-pass encoding algorithm. The proponents request to adopt this
proposal in the next CE8 for studying LCU-based syntax and corresponding low latency encoding
algorithms.

Integrates ALF and SAO in a single processing step (also keeps the options SAO only or ALF only).
Number of filters: up to 16 for the experiment. Gain 0.1/0.1/0.0/0.0, chroma loss (2%). Runtime Enc:
108/102/102/103, Dec: 104/105/105/107. Software is not mature, therefore runtime increase is observed.
Recommend further study in CE.

JCTVC-G846 Cross-Check for JCTVC-G218 [C. Kim, Y. Park (Samsung)] [late]

JCTVC-G351 Modification of ALF classification [E. Maani, M. Ikeda (Sony)]


The classification of ALF is modified to make the process more hardware friendly. More specifically,
with the proposed modification, the classification of the 4x4 block and filtering operations of the first
pixel can be done at the same time. The classification only uses the top left 3x3 subblock which is also
used for filtering of the first pixel.
Single read for classification and filtering for hardware friendly design.
Bitrate reductions are: Common: 0.1/0.1/0.1/0.1 BA reduced: 0.1/0.1/0.1/0.1 (BA classification and
calculation with inner two pixels).
Runtimes Enc Common: 100/100/100/100 BA reduced: 100/100/100/100 Dec 98/98/100/99 96/99/99/99.
Relationship with G609 which only does BA classification with inner two pixels (partially adopted from
CE8). Recommend further study in CE

JCTVC-G803 Non-CE8: Cross-check for Simplification of ALF Classification [T.
Yamamoto, T. Ikai (Sharp)] [late]

JCTVC-G1045 Cross-check of G382 (Quantization with Adaptive Reconstruction Levels)
set-2 modifications David Flynn (BBC) [late]

JCTVC-G380 Non-CE8: Coding tree level signaling of alf_cu_flag [T. Nishi, K.
Uchibayashi (Panasonic)]
In this contribution, a coding tree level signaling of alf_cu_flag was proposed instead of the signaling
using the adaptive loop filter coding unit control parameter syntax in the slice header. In the current
HEVC WD, ALF process requires a buffer memory to store the flags in a decoder implementation. In the
proposed method, in order not to require the buffer memory, alf_cu_flag is coded in the coding tree
syntax, and thus the flag can be discarded right after the relevant block level processing. The proposed
method was evaluated on the top of HMv4.0, and the difference in the objective performance was
reportedly negligible.

Signal alf_cu_flag at each CU. This eliminates a buffer to store alf_cu_flag (16kBytes for 4K2K). Coding
results were obtained by an encoder with the current frame-based design of ALF coefficients. This
approach is claimed to be friendly with low latency coding.
Coding gain and runtime unchanged.
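One way to arrive at a buffer of this order of magnitude is sketched below. The picture size (4096x2048), the 8x8 minimum CU size and the one-bit-per-flag storage are assumptions chosen for illustration; the contribution may have counted differently.

```python
def alf_cu_flag_buffer_bytes(width, height, min_cu=8, bits_per_flag=1):
    """Bytes needed to hold one alf_cu_flag per minimum-size CU of a picture."""
    num_cus = (width // min_cu) * (height // min_cu)
    return num_cus * bits_per_flag // 8


buffer_4k2k = alf_cu_flag_buffer_bytes(4096, 2048)
```

Under these assumptions the buffer comes to 16384 bytes, in line with the 16 kBytes quoted above; a 4096x2160 picture would need slightly more.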
Comments by experts: The flags can be stored in external memory, where memory bandwidth increase is
negligible. If this proposal claims friendliness with low latency encoding, encoder should use sub-
optimal filter coefficients that were designed at the previous frames. A question was raised whether the
transmission of bitstream can be started right after an LCU is coded, or needs to wait for a slice to be fully
coded. The approach may not be friendly for parallel processing (wavefront, slice boundary filtering).
No consensus, one non-proponent company supports it, but there are several statements against it. Leave
as is, no action.

JCTVC-G938 Cross-check of JCTVC-G380, "Non-CE8: Coding tree level signaling of
alf_cu_flag" [A. Fuldseth (Cisco)] [late]

JCTVC-G499 Improved ALF with low latency and reduced complexity [A. Fuldseth, G.
Bjøntegaard (Cisco)]
The contribution proposes a low complexity ALF technique with support for sub-frame encoder delay.
One single set of ALF filter coefficients is transmitted for each LCU using single-pass estimation. This
supports low encoder-side delay by allowing for estimation and signaling of the ALF coefficients on the
LCU level without aggregating pixel data of the whole frame. The proposed ALF uses either 5x5
diamond shape or 9x3 cross shape and does not require decoder-side variance calculations. The absence
of decoder-side variance calculations represents a significant reduction in decoder complexity. When
applied to low complexity configurations, BD-rate gains between 1.6 % and 4.0% are reported. When
applied to high efficiency configurations, BD-rate results between -0.2% and 0.4% are reported. For high
efficiency configurations, encoding times vary between 67% and 94%, while decoding times vary
between 91% and 94%.

Improvement of CE8 proposal, G498. At each LCU, a filter is designed (or previously designed filter is
used). A flag to indicate control of filtering is signalled. Suitable for single-pass encoders. Proposed to
adopt this in both HE and LC. Two filter shapes (Snowflake as is, cross-shape with 3 pel vertical height).
Up to 16 coefs. stored for a slice (as a kind of dynamic codebook), one of which is selected for a LCU, or
a new one is designed.
The software has been uploaded to the JCT-VC site on Nov. 22.
Luma Bitrate: HE: 0.2/0.4/0.0/-0.2 LC: 1.6/2.5/2.3/4.0 (AI/RA/LB/LP)
Runtime: Enc HE: 67/91/94/90 LC: 106/102/101/101 Dec HE: 91/92/94/94 LC: 114/113/111/113

A non-proponent preferred this. Comment: There seems to be higher loss in higher resolution. Runtime
reduction may come from not applying ALF to chroma. Same discussion as G380 regarding transmission
timing.
Syntax and estimation process LCU based, no pixel based classification, no chroma filter.
Largest loss RA in class A (1.1%, little more on steam locomotive)
Combination with wavefront processing needs to be investigated.
Very interesting proposal. Investigation in CE. The CE should also investigate more drastic simplification
of current ALF, e.g. removal of pixel-based classification.
There was some discussion whether low latency criteria should be established.
(Group decided that no software branch should be created)

JCTVC-G651 Crosscheck of JCTVC-G499 - Improved ALF with low latency and reduced
complexity [M. Budagavi (TI)] [late]

JCTVC-G445 Removing DC component of ALF filter coefficients [A. Minezawa, K.
Sugimoto, K. Miyazawa, S. Sekiguchi (Mitsubishi)]
In this contribution, a modification to adaptive loop filtering is proposed. In the proposed scheme, DC
offset coefficient for adaptive loop filtering is removed. This modification is implemented onto HM-4.0
software, and simulations are conducted using common test configurations to evaluate the performance of
the scheme. It is reported that the proposed scheme shows 0.1% BD-rate achievement on average with
reduction of encoding run-time.

Remove DC offset of ALF. SAO and ALF DC offset may have overlapping effect. This comes with a
loss of class E (0.4% luma, more for V channel).

JCTVC-G918 Non-CE8: Cross-check of JCTVC-G445 on removing DC component of ALF
filter coefficients [P. Lai, F. C. A. Fernandes (Samsung)] [late]

JCTVC-G615 Non-CE8: Cross check of Mitsubishi's removing DC component of ALF
filter coefficients [I. S. Chong, M. Karczewicz (Qualcomm)]

JCTVC-G774 Non-CE8: Modified ALF DC coefficients [T. Yamakage, T. Watanabe, T.
Chujoh (Toshiba), C.-Y. Chen, C.-M. Fu, C.-Y. Tsai, Y.-W. Huang, S. Lei
(MediaTek), M. Karczewicz, I. S. Chong (Qualcomm)]
This contribution evaluates modified ALF DC coefficients. This technique uses less bit-depth for ALF DC
coefficients. Coding efficiency gain for luma is 0.0%, 0.1%, 0.1% and 0.1% in HE-AI, RA, LB, and LP
without encoding/decoding time increase on average.

Use lower precision to signal DC coefficient (2 bits instead of the current 9 bits).

JCTVC-G892 Non-CE8: Cross-check of JCTVC-G774 on modified ALF DC coefficients [P.
Lai, F. C. A. Fernandes (Samsung)] [late]

JCTVC-G610 Improvements to ALF [J. Zhao, A. Segall (Sharp)]


This contribution proposes three simplifications and improvements for signaling Adaptive Loop Filter
coefficients.
First, a switchable option of not sending DC coefficient is introduced. It is asserted that this reduces the
complexity of the ALF process and improves coding efficiency. Using common conditions, the average
bit rate reduction is -0.1%, -0.4%, -0.4% for Y,U,V components respectively for AI, -0.1%, -0.4%, -0.5%
for RA configuration, and 0.0%, 0.1%, 0.0% for LD.
Second, it is proposed to predict the luma center coefficient from the other coefficients when inter filter
prediction is not used. This harmonizes the coding of the center coefficient between the luma and chroma
filters. It is reported that the method provides an average -0.1% rate reduction for luma for RA and LD
configurations.
Third, it is proposed to use fixed k tables for sending luma filter coefficients. In HM4.0, the k value of the
Golomb code to decode luma filter coefficients is adapted on a slice-by-slice level. Using a fixed set of k
values removes the need for the adaptation and transmitting/extracting the k values from the bitstream.
Results show that there is no coding efficiency loss by using fixed k tables.
Was item 3 discussed in BoG? No, but there is some understanding that this is preliminary work
Use lower precision to signal DC coefficient (2 bits instead of the current 9 bits).
This proposal also includes reports of two bug fixes:
1. Suggests to remove the k values from entries 0 to 4 in alf_golomb_index_bit, which are not used
any more in the case of cross-shaped filter - agreed
2. The second fix is identical to the one suggested by G216, and was agreed there
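
As an illustration of the third item (coding filter coefficients with a fixed table of k values), the following is a minimal sketch of a Golomb-Rice style code. It is not the HM4.0 binarization: the signed-to-unsigned mapping, the function names and the example k table are assumptions made only for this sketch.

```python
def golomb_rice_encode(value, k):
    """Encode a non-negative integer: unary quotient, then a k-bit remainder."""
    quotient = value >> k
    bits = "1" * quotient + "0"          # unary part, terminated by a zero
    if k > 0:
        bits += format(value & ((1 << k) - 1), "0{}b".format(k))
    return bits


def golomb_rice_decode(bits, k):
    """Decode one value; returns (value, number of bits consumed)."""
    quotient = 0
    pos = 0
    while bits[pos] == "1":
        quotient += 1
        pos += 1
    pos += 1                             # skip the terminating zero
    remainder = int(bits[pos:pos + k], 2) if k > 0 else 0
    return (quotient << k) | remainder, pos + k


def signed_to_unsigned(c):
    """Assumed mapping for signed coefficients: 0, -1, 1, -2, 2 -> 0, 1, 2, 3, 4."""
    return 2 * c if c >= 0 else -2 * c - 1


def encode_coeffs(coeffs, k_table):
    """Encode each coefficient with the k fixed for its position (no k sent)."""
    return "".join(
        golomb_rice_encode(signed_to_unsigned(c), k)
        for c, k in zip(coeffs, k_table)
    )


# Hypothetical fixed table: one k per coefficient position.
FIXED_K_TABLE = [1, 2, 3]
```

With a fixed table, encoder and decoder share the k values in advance, so only the codewords themselves appear in the bitstream.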

JCTVC-G952 Non-CE8: Cross-check of JCTVC-G610 on improvements to ALF [P. Lai, F.
C. A. Fernandes (Samsung)] [late]

JCTVC-G998 Combined Result of JCTVC-G610 and JCTVC-G774 [J. Zhao, A. Segall
(Sharp), M. Karczewicz, I. S. Chong (Qualcomm)] [late]
JCTVC-G610 proposes modifications to the ALF process consisting of (i) selectively sending DC
coefficients, (ii) harmonization of center coefficient signaling between luma and chroma, and (iii) reduced
signaling of the k values used to transmit ALF coefficients. It is asserted that these modifications
simplify ALF and also provide coding efficiency improvement. JCTVC-G774 also proposes modifications
to the ALF process consisting of reducing the bit-depth of DC coefficient values. It is also asserted that
this modification simplifies ALF and provides coding efficiency improvement.
In this proposal, the combination of JCTVC-G610 and JCTVC-G774 is reported. The proposal reports
that the combined solution provides additional coding efficiency gains. Moreover, it is asserted that the
simplifications of each proposal are maintained and additive.

Conclusion of BoG from G445, G774, G610, G998:

Necessity of DC offset. In some cases (in particular, high quality coding cases), DC offset will work (at
least 0.4% gain) while it does not always work. Therefore, it is suggested to create a switch to
enable/disable DC offset.
Suggestions: a) removal (G445) b) Make DC offset switchable and predictive coding of center luma
coefficient (G610) c) Reduce precision of offset value (from spatial neighbors, G774)
Decision: Adopt G445 (remove DC offset). No CE on the other.
The prediction of the center coefficients from G610 is a variant of a subset of G665 which was adopted.

JCTVC-G615 Non-CE8: Cross check of Mitsubishi's removing DC component of ALF
filter coefficients [I. S. Chong, M. Karczewicz (Qualcomm)]

JCTVC-G446 Reduction of the number of pixels used in Adaptive Loop Filter [K.
Miyazawa, K. Sugimoto, A. Minezawa, S. Sekiguchi, T. Murakami
(Mitsubishi)]
This contribution proposes a method for reducing the number of pixels used in Adaptive Loop Filter
(ALF). Prior to applying ALF, the proposed technique calculates a score predicting the effectiveness of
ALF for each pixel, then skips all ALF processes (e.g. pixel classification, filter design, filter application) for
the pixels whose scores are less than a threshold. This threshold is adaptively determined for each frame
in the encoding process, and is transmitted to a decoder. The simulation results report that the proposed
method achieves 2% ~ 9% encoding time reductions and 1% ~ 2% decoding time reductions, with 0.1% ~
0.2% BD-rate loss for AI-HE, RA-HE, and LD-HE structures.
Complexity reduction of ALF by skipping ALF for some pixels, based on pre-analyzing a cost for each 4x4 block.
ALF-skipped pixels are 40%, loss 0.1/0.2/0.2/xx
Comment: alf_cu_flag may do similar thing with more burden.
Helps on average complexity, but hurts worst case (unless we would limit percentage of ALF use which
might incur additional problems). No action.

JCTVC-G920 Cross-check of Mitsubishi's reduction of the number of pixels used in
Adaptive Loop Filter JCTVC-G446 [T. Yamakage, T. Itoh (Toshiba)] [late]

JCTVC-G463 Block-based filter adaptation with intra prediction mode and CU depth
information [S. Wang, S. Ma, J. Jia (LG)]
This contribution presents a simplification of the block-based filter adaptation (BA) scheme for intra slices.
For each 4x4 block, features are obtained from the intra picture prediction mode and CU depth, so that filter
adaptation is free from calculation of direction and Laplacian features. The proposed method reports an
average 2% decoding time reduction, with 0.1% BD-rate loss for AI structures in comparison with
HM 4.0.

Simplification of filter adaptation in BA-ALF for intra slices. Intra prediction modes, CU depth
information and PU sizes are used instead of calculating directional and Laplacian features. Only for intra
slices? Yes. 0.1% loss is reported for intra.
May become complicated to implement a different adaptation process just for intra (additional functions
or circuitry are necessary), whereas the worst case run time must be supported anyway, so saving
computation for intra only may not be helpful. No action.

JCTVC-G835 Cross verification of LG’s Block-based filter adaptation with intra
prediction mode and CU depth information (JCTVC-G463) [Y. Chiu, L. Xu
(Intel)] [late]

JCTVC-G656 Non-CE8 Subtest a: Harmonization of CE8a Tool 2 Shape-dependent BA
(SDBA) and Tool 3 Block-based filter adaptation with up to 8 filters (HV8)
[P. Lai, F. C. A. Fernandes, I.-K. Kim (Samsung)]
This contribution presents a filter adaptation method which harmonizes CE8a Tool 2 “Shape-dependent
Block-based Filter Adaptation (SDBA)” and CE8a Tool 3 “Block-based filter adaptation with 8 initial
filter classes (HV8)”. For BA, computation of features along diagonal directions or horizontal/vertical
directions is coupled with the filter shape (snowflake or cross), such that there are the same numbers of
“classification + shape combinations” to be evaluated at encoder, as compared to HM4.0. For
classification, unified and reduced number of initial filter classes to 8 for both BA and RA is achieved by:
For BA, directly comparing BA directional features to produce 2 directional classes, each with 4
magnitude levels. For RA, a frame is partitioned into 8 regions corresponding to 8 initial filter classes.
The harmonized method reports 0.1% BD-rate gain for AI, RA, LDB; and 0.2% BD-rate gain for LDP.
Reviewed in context of CE8a

JCTVC-G338 Non-CE8.a: Crosscheck of Non-CE8 Subtest a: Harmonization of CE8a Tool
2 (SDBA) and Tool 3 (HV8) (JCTVC-G656) [T. Ikai (Sharp)]

JCTVC-G666 1D-DCT based frequency domain adaptive loop filter (FD-ALF) [Jeongyoen
Lim, Ju Ock Lee, Hae-Kwang Kim, Joo-Hee Moon]
A frequency domain adaptive loop filtering (FD-ALF) method on the basis of the 1D DCT is proposed in
this document. The purpose of this contribution is to reduce the computational complexity of ALF while
minimizing coding efficiency loss. The basic scheme of FD-ALF follows the current ALF in HM4.0
except that the filtering is applied in the 1D DCT frequency domain. FD-ALF is adaptively applied in RA or
BA classification mode and can be controlled on a CU basis in the same way as the existing ALF.
DCT domain filtering is performed by multiplying the DCT-transformed reconstructed picture (after SAO)
by a DCT domain filter, rather than by the convolution operation of the current pixel-based
ALF filter. The different 1D DCT filters are characterized by their tap size (8 tap, 16 tap), their direction
(horizontal or vertical) and their coefficients. The coefficients of the 1D DCT filters
are obtained by MSE (Mean Square Error) optimization between the original picture and the
reconstructed picture.
Using common conditions, the average bit rate change for the Y component is +1.2% for high efficiency
AI, +2.2% for high efficiency RA, +1.8% for high efficiency LDB, and +2.8% for high efficiency LDP
(positive values indicating a bit rate increase). Encoder time is 83% for high efficiency AI, 97%
for high efficiency RA, 98% for high efficiency LDB, and 96% for high efficiency LDP. Decoder time is
94% for high efficiency AI, 91% for high efficiency RA, 94% for high efficiency LDB, and 92% for high
efficiency LDP.
Presented Friday afternoon. No bit rate reduction, loss.
No action.
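
The idea of filtering by multiplication in the 1D DCT domain, instead of by convolution in the pixel domain, can be sketched as follows. This is only an illustration under stated assumptions: an orthonormal DCT-II is used, the per-frequency gains are made-up values rather than MSE-optimized coefficients, and the function names are hypothetical.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n matrix (row k = frequency k)."""
    matrix = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        matrix.append(
            [scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n)]
        )
    return matrix


def fd_filter(samples, gains):
    """Filter one line of samples by scaling its DCT coefficients.

    samples: reconstructed samples along one direction (e.g. 8 or 16 of them)
    gains:   one multiplier per DCT frequency (the frequency domain filter)
    """
    n = len(samples)
    c = dct_matrix(n)
    # Forward transform.
    coeffs = [sum(c[k][i] * samples[i] for i in range(n)) for k in range(n)]
    # Frequency domain filtering: one multiplication per coefficient.
    coeffs = [g * x for g, x in zip(gains, coeffs)]
    # Inverse transform (transpose of the orthonormal basis).
    return [sum(c[k][i] * coeffs[k] for k in range(n)) for i in range(n)]


line = [10, 12, 11, 13, 40, 42, 41, 43]
smoothed = fd_filter(line, [1.0, 1.0, 0.9, 0.8, 0.6, 0.4, 0.2, 0.0])
```

With all gains equal to 1 the line is returned unchanged; keeping only the first gain reduces every sample to the mean of the line.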

JCTVC-G691 Non-CE8: Combination of CE8.a.1 and CE8.a.2 [T. Yamakage, T.
Watanabe, T. Chujoh (Toshiba), C. Y. Chen, C. M. Fu, C. Y. Tsai, Y. W.
Huang, S. Lei (MediaTek), M. Karczewicz, I. S. Chong (Qualcomm), T. Ikai,
A. Segall, T. Yamamoto (Sharp)]
This contribution evaluates a combination of CE8.a.1 and CE8.a.2. The coding efficiency is improved by 0.2%, 0.3%,
0.3% and 0.4% in HE-AI, RA, LB, and LP.
Reviewed in context of CE8a

JCTVC-G653 Non-CE8: Cross-check of JCTVC-G691 Combination of CE8.a Tool 1 and
Tool 2 [P. Lai, F. C. A. Fernandes (Samsung)]

JCTVC-G813 ALF with single filter type [M. Budagavi (TI)]


In HM-4.0, two types of ALF filters are used – cross and star. This contribution presents BD-Rate results
when only one of the two filters is used. It is asserted that using only cross results in a BD-Rate loss of:
AI-HE: 0.08%, RA-HE: 0.32%, LB-HE: 0.40%; using only star results in a BD-Rate loss of: AI-HE:
0.16%, RA-HE: 0.41%, LB-HE: 0.58%. It is asserted that from an encoder complexity reduction
perspective a single filter type that can provide most of the gains of two filter types is desirable since it
reduces the number of encoder passes. A single filter type of cross filter with rectangular center was
reported in JCTVC-G130 to capture most of the gains of chroma ALF with two filter types. This
contribution proposes that a similar filter be evaluated for luma.
Test to use only CROSS or STAR shape.
Results (negative means losses): No ALF: -1.9/-3.7/-3.1/xx; Only Star: -0.2/-0.4/-0.6/-0.3; Only Cross:
-0.1/-0.3/-0.4/-0.7.
Comments:
If single filter is adopted, is it possible to use more coefficients?
Slice boundary processing becomes simpler. Gain/loss highly depends on the sequence.
Study in CE part on simplification of ALF design.

JCTVC-G1012 Non-CE8: Supplementary results to JCTVC-G813 regarding ALF using
one filter type (Luma only) [P. Lai (Samsung), H. Guermazi (eBrisk Video)]
[late]
Not considered (late/missing when the topic was discussed)

JCTVC-G923 ALF decoding time reduction by adopting a simple SIMD code (Informative)
[T. Yamakage, T. Itoh (Toshiba)] [late]
This contribution informs about decoding time reduction of ALF by adopting a simple SIMD code.
By implementing ALF (BADIR classification and ALF filtering) with a simple SIMD code, the decoding
time is reduced by 9% on average. This SIMD code can be compiled with Microsoft (R) Visual Studio
(R) 2010, and uses Intel (R) SSE4.1. Results were cross-checked by JCTVC-G954.
Information by implementing a simple SIMD code
Process 8 pixels in parallel.
10% less decoding time by SIMD code (i.e., ALF time became 1/3)
Just for information – no action

JCTVC-G954 Non-CE8: Cross check of ALF decoding time reduction by adopting a simple
SIMD code (JCTVC-G923) [I. S. Chong, M. Karczewicz] [late]

5.5.3 Sample adaptive offset

JCTVC-G222 Non-CE8: Offset coding in SAO [C.-M. Fu, Y.-W. Huang, S. Lei
(MediaTek), I. S. Chong, M. Karczewicz (Qualcomm)]
The offset coding in SAO of HM-4.0 does not fit the offset distribution. In this contribution, a new
codeword design can be used to better fit the offset distribution, or an offset prediction technique can be
used to reduce offset information. Simulation results reportedly show that the luma bit rate and the
chroma bit rate are reduced by 0-0.1%, and 0.1-0.5%, respectively, with unchanged run time.

A prediction technique to reduce offset information.


Bitrate reduction HE: 0.0/0.0/0./0.0 LC: 0.0/0.1/0.1/0.1 (little more for chroma)
Comment: Predictor seems to depend on Qp.

JCTVC-G818 Cross-check for Offset coding in SAO from MediaTek and Qualcomm by
Samsung [E. Alshina, J.H. Park] [late]

JCTVC-G490 Modified SAO edge offsets [K. Andersson, P. Wennersten, R. Sjöberg
(Ericsson)]
In this contribution SAO edge offsets in HM4.0 are modified to be more specific to edge characteristics.
The result indicates that edges are enhanced both subjectively and objectively. The average BDR gain
versus HM4.0 is reported as 0.2% for luma and 1% for chroma for the common conditions. With YUV
BDR (6Y+U+V)/8, the average YUV BDR gain is 0.4%. Best results are obtained for the optional
condition low delay P at low complexity where average luma BDR is improved by 1.1%. The encoding
time is similar to HM4.0. The average decoding time on a single machine is increased by 1.8% compared
to HM4.0.
By adding the possibility to use more specific edge offsets when needed both the subjective and the
objective quality can be improved. Current 4 constrained edge offset or new 10 unconstrained edge
offset.
Bitrate reduction HE: 0.0/0.0/0.1/0.1 LC: 0.0/0.2/0.5/1.1 (more for chroma)
Comments:
• Small complexity increase.
• Potential increase of offset values. Encoder needs to be a little bit more complicated.
• Most of the gain comes from adding 10 offsets.
• A concern about extra cost / switching.
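
The combined YUV figure quoted in the abstract follows directly from the stated weighting (6Y+U+V)/8. A minimal check, assuming that the reported 1% chroma gain applies to both U and V (the abstract gives a single chroma number):

```python
def yuv_bdr(y, u, v):
    """Weighted BD-rate as defined in the abstract: (6Y + U + V) / 8."""
    return (6 * y + u + v) / 8


# 0.2% luma gain, 1% gain assumed for each chroma component.
combined = yuv_bdr(0.2, 1.0, 1.0)   # approximately 0.4
```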

JCTVC-G827 Non-CE8: Crosscheck for Ericsson's modified SAO edge offsets in JCTVC-
G490 [C.-M. Fu, Y.-W. Huang (MediaTek)] [late]

JCTVC-G680 Non-CE8: Method of visual coding artifact removal for SAO [W.-S. Kim, D.-
K. Kwon (TI)]
Sample adaptive offset (SAO) usually improves overall subjective quality. However, it can occasionally cause visual
artifacts by adding an offset value determined globally for the given region to pixels whose
statistics are very different from the global ones. For example, in the class E sequence Vidyo 1, a “pepper
and salt” type of coding artifact has been observed. In this contribution the cause of such a coding artifact
is analyzed, and a method to remove this artifact is proposed. In the proposed method the sign value of
edge offset is restricted according to edge category, so that contrast between the current and neighboring
pixels is not increased after adding the offset. Subjective tests reveal that the SAO visual artifact is
removed by employing the proposed method. Experimental results also show that the proposed change
does not compromise overall subjective quality and coding efficiency (Average BD-rate increase of 0.0%
for luma and 0.2% for chroma).

Subjective picture quality improvement on SAO.


The sign value of edge offset is restricted according to edge category, so that contrast between the current
and neighboring pixels is not increased after adding the offset. (Basically, only low-pass non-linear
filtering is applied.)
Still images are shown during presentation, and the improvement is recognized by participants.
BR reduction HE: 0.0/0.0/0.0/0.0; LC: 0.0/0.0/-0.1/0.0
Comments from BoG:
Rationale of having sign for offset was debated. There should be a non-normative way to do the same
thing. The proposal is supported by several participants, while the original SAO proponent expressed
some concern about the immediate change of the specification. It is suggested to conduct subjective
viewing of moving pictures.
Can be done encoder-only, but then sign would be useless.
Did the BoG recommend adopting the encoder-only action? In principle yes, but nobody has tested or
implemented it. Further investigate in context of CE (both normative and non-normative solution).
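
The sign restriction discussed for G680 can be sketched as below. The category numbering (1 for a local minimum up to 4 for a local maximum along the chosen 1D direction) and the clamping of a disallowed sign to zero are assumptions of this sketch, chosen to be consistent with the stated goal of not increasing contrast; they are not taken from the proposal text. Clamping corresponds to the encoder-only variant; in a normative variant the sign would simply not be transmitted.

```python
def edge_category(a, c, b):
    """Classify sample c against its two neighbours a and b.

    Assumed numbering: 1 = local minimum, 2 = concave corner,
    3 = convex corner, 4 = local maximum, 0 = none of these.
    """
    def sign(x):
        return (x > 0) - (x < 0)

    s = sign(c - a) + sign(c - b)
    return {-2: 1, -1: 2, 1: 3, 2: 4}.get(s, 0)


def restrict_offset(category, offset):
    """Allow only offsets that move the sample towards its neighbours."""
    if category in (1, 2):      # sample is below its neighbours: raise only
        return max(offset, 0)
    if category in (3, 4):      # sample is above its neighbours: lower only
        return min(offset, 0)
    return 0                    # category 0 receives no edge offset
```

For example, a local minimum (category 1) with a candidate offset of -3 would have that offset suppressed, since applying it would deepen the dip.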

JCTVC-G808 Cross-check for TI's proposal (JCTVC-G680) [T. Matsunobu, T. Sugio,
T. Nishi (Panasonic)] [late]

JCTVC-G915 Coding and selection of SAO parameters [D. Baylon, K. Minoo (Motorola
Mobility)] [late]
In HM4.0, the determination of the SAO offset value is based upon only distortion considerations. This
contribution proposes to determine the offset based on RD considerations. In addition, this contribution
proposes to increase the number of EO classes to eight. Simulations over 30 frames reportedly show no
significant loss in luma coding for HE and LC, while a bit rate savings of up to 1.2% for chroma HE, and
1.1% for chroma LC.

Contribution proposes to determine the offset based on RD considerations. In addition, this contribution
proposes to increase the number of EO classes from four to eight. Change in coding of BO type. (HE
case)
Short-length results: HE: 0.0/0.0/0.0/0.0 luma LC: 0.0/0.0/0.0/-0.1 luma Chroma: 0.5 to 1.1% gain
Encoder optimization: supported by a participant, but wait for cross-check results.

Recommendation from breakout to adopt the enc opt part, but conditional on cross-check
A CE is suggested on various proposals of SAO offset adaptation which slightly reduce the rate but also
appear to slightly increase complexity – is that worthwhile? It is emphasized by the chair of track A that,
if the results indicate only minor benefit, these parts of the CE may not be considered/reviewed in next
meeting (note: this could apply to some other CEs as well and would just mean to execute rules that were
already set up previously).
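
The difference between a distortion-only and an RD-based choice of the SAO offset, as proposed in G915, can be sketched as follows. For a class with n samples and summed error E = sum(original - reconstructed), adding an offset h changes the squared error by n*h*h - 2*h*E. The rate model, the offset range and the function name are hypothetical and serve only to make the sketch runnable.

```python
def choose_offset(n, err_sum, lam, max_abs=7):
    """Pick the offset minimizing delta-distortion + lambda * rate.

    n:       number of samples in the class
    err_sum: sum of (original - reconstructed) over those samples
    lam:     Lagrange multiplier (lam = 0 gives the distortion-only choice)
    """
    def rate_bits(h):
        # Hypothetical rate model: magnitude in unary plus a sign bit.
        return abs(h) + 1 + (1 if h != 0 else 0)

    best_h = 0
    best_cost = lam * rate_bits(0)
    for h in range(-max_abs, max_abs + 1):
        delta_distortion = n * h * h - 2 * h * err_sum
        cost = delta_distortion + lam * rate_bits(h)
        if cost < best_cost:
            best_h, best_cost = h, cost
    return best_h
```

With n = 100 and E = 260, the distortion-only choice (lam = 0) is 3, while lam = 30 under this rate model selects the cheaper offset 2.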

JCTVC-G1014 Non-CE8: Crosscheck for Motorola's coding and selection of SAO
parameters in JCTVC-G915 [C.-M. Fu, Y.-W. Huang (MediaTek)] [late]
It is confirmed that the non-normative part provides the intended fix in the RD optimization and achieves
the expected BD rate gain (0.3-0.4% for chroma). Decision (SW): Adopt the non-normative element of
G915.

JCTVC-G246 On additional SAO Band Offset classifications [G. Laroche (Canon), T.
Poirier, P. Onno (Canon)]
This contribution presents a modification of sample adaptive band offsets. The proposed modification
consists in adding 8 classifications of band offsets. The additional classifications are based on the first
group of sample adaptive band offsets. These additional classifications are the combination of 3 interval
sizes and 3 sizes of class. Moreover, the center of the proposed intervals in the intensity range is coded. In
the experimental results, two configurations are presented with respectively 6 and 8 additional classes.
The proposed modification gives an average BDR of -0.4% for PSNR AV, -0.1% for Luma component,
-1.7% for Chroma U and -2.0% for Chroma V with a small impact on the encoding runtime and with
complexity reduction of the decoding runtime.

Adds 8 Band Offset classifications to the 2 current HM4.0 BO groups. In particular, 3 additional
intensity subdivision ranges are added. Worst case complexity does not increase since it comes from EO.
Decrease of decoding time by selecting BO more frequently. New RDO selection for all SAO
classification.
Another test adds 6 Band Offset classifications.
Main proposal: HE: 0.0/0.0/0.1/0.1 luma LC: 0.0/0.1/0.2/0.1 luma Chroma: 0.7 to 2.7% gain (average
1.7U, 2.0V)
Additional test: HE: 0.0/0.0/0.1/0.1 luma LC: 0.0/0.1/0.2/0.1 luma Chroma: 0.5 to 2.2% gain (aver.??)
Comments: Chroma only change? No, both luma and chroma for consistency. Signal the center band
for every region.

JCTVC-G828 Non-CE8: Crosscheck for Canon's additional SAO band offset classifications
in JCTVC-G246 [C.-M. Fu, Y.-W. Huang (MediaTek)] [late]

JCTVC-G682 Non-CE8: Reduced number of band offsets in SAO [W.-S. Kim, D.-K. Kwon
(TI)]
In the current HM-4.0 design SAO parameters are encoded into adaptation parameter set (APS), and need
to be stored in a buffer until SAO process is completed for each picture. The buffer size is proportional to
the number of SAO partitions and size of SAO parameters for each partition. In this contribution the
number of band offsets is reduced from 16 to 8, and number of bands is increased from 2 to 4. With this
scheme, the number of band offsets that a decoder needs to buffer can be effectively reduced from 16 to
8. The experimental results and subjective viewing show that the proposed scheme reduces the SAO
parameter buffer size on the decoder side by half without degrading subjective and objective quality. The
measured BD-rate reduction is 0.1% in LB-LC, and 0.0% in all the other configurations.

The number of band offsets is reduced from 16 to 8, and number of bands is increased from 2 to 4. The
purpose is to reduce the memory to store offsets.
Proposal 1: 4 band, 8 offsets
Proposal 2: The coverage of offsets of the first and the last sub-band are extended to the pixels outside the
band.
Main proposal: HE: 0.0/0.0/0.0/0.0 luma LC: 0.0/0.0/0.1/0.0 luma Chroma: 0.1 to 0.5% gain
Extended proposal: HE: 0.0/0.0/0.0/0.0 LC: 0.0/0.0/0.1/0.0
Comments:
It is suggested that number of offsets could also be imposed by level constraints.
Encoder must perform more depth.
Note: G218 (unified ALF/SAO) might also solve this problem
Conclusion: Conduct CE on G246, G682 (SAO simplifications)
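
The memory argument of G682 can be made concrete with a small sketch. It assumes 32 equal-width intensity bands (consistent with 2 groups of 16 offsets today and 4 groups of 8 in the proposal) and, purely for illustration, assumes that each group covers 8 consecutive bands; the actual grouping in the proposal may differ, and all names are hypothetical.

```python
def band_index(pixel, bit_depth=8):
    """Index of the intensity band (0..31) that a sample falls into."""
    return pixel >> (bit_depth - 5)


def band_group(pixel, bands_per_group=8, bit_depth=8):
    """Assumed grouping: consecutive runs of bands form one group."""
    return band_index(pixel, bit_depth) // bands_per_group


def apply_band_offset(pixel, group, offsets, bands_per_group=8, bit_depth=8):
    """Add the stored offset if the sample belongs to the signalled group.

    Only len(offsets) == bands_per_group values have to be buffered
    per region (8 here, instead of 16).
    """
    band = band_index(pixel, bit_depth)
    if band // bands_per_group != group:
        return pixel
    out = pixel + offsets[band % bands_per_group]
    return min(max(out, 0), (1 << bit_depth) - 1)   # clip to the sample range
```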

JCTVC-G748 Non-CE8: Crosscheck of TI's Reduced number of band offsets in SAO [T.
Ikai (Sharp)] [late]

JCTVC-G309 Modifications of offset classifications in SAO [Y. Yasugi, T. Ikai (Sharp)]


This contribution describes modifications of offset classification in SAO (Sample Adaptive Offset). In
this proposal, offset classification tables are modified for both offset types: Edge Offset (EO) and Band
Offset (BO). For EO types, the fifth offset is employed on pixels whose edgeIdx is 2. For BO types, a
new band-classification table is introduced for chroma values. It is reported that the proposed method
improves coding efficiency of chroma channels by averages of 0.3%, 0.4% and 0.5% for IO_HE, RA_HE
and LB_HE condition respectively.

For EO types, the fifth offset is employed on pixels whose edgeIdx is 2. For BO types, a new band-
classification table is introduced for chroma values.
HE: 0.0/0.0/0.0/xx luma LC: 0.0/0.0/0.0/xx luma Chroma: 0.3 to 0.5% gain in chroma (HE cases).
No action.

JCTVC-G483 Non-CE8: Crosscheck for SHARP's modifications of offset classifications in
SAO in JCTVC-G309 [C.-M. Fu, Y.-W. Huang (MediaTek)] [late]

JCTVC-G831 Non-CE8: Sample Adaptive Offset with LCU-based Syntax [C.-Y. Chen, C.-
M. Fu, C.-Y. Tsai, Y.-W. Huang, S. Lei (MediaTek), M. Karczewicz, I. S.
Chong (Qualcomm)]

In HM-4.0, sample adaptive offset (SAO) parameters are coded for each region in a picture. In order to
support localization of SAO parameters with higher flexibility, this contribution proposed a new syntax
that allowed SAO parameters to be adaptively changed at any largest coding unit (LCU). Simulation
results reportedly showed that the proposed syntax caused 0.0%, 0.1%, 0.1%, 0.1%, 0.0%, 0.1%, 0.2%
and 0.2% bit rate increases for HE-AI, HE-RA, HE-LB, HE-LP, LC-AI, LC-RA, LC-LB, and LC-LP,
respectively, with almost the same encoding and decoding times when the algorithm of deriving localized
SAO parameters was unchanged.

Motivation: It is desirable to develop a simple syntax that can support many different picture partitioning
algorithms for SAO optimization.
Allow SAO parameters to be adaptively changed at any largest coding unit (LCU). SAOP can be
signalled LCU by LCU or can be copied from left or above LCU or above LCU line. Prediction of offset
is also applied.
Loss: HE: xx/-0.1/-0.1/-0.1 LC: xx/-0.1/-0.2/-0.2
Encoder implementation can be (1) same design as HM4 with the LCU-based syntax, or (2) LCU by LCU
level encoder that may lose some coding gain.
Note: G218 also would enable LCU-based adaptation. However, it may be implementation specific
whether combined SAO and ALF is desirable, as some implementations may rather combine de-blocking
and SAO. Therefore, include in same CE part as G218.
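
The LCU-based signalling described for G831 (new parameters per LCU, or a copy from the left or above LCU) can be sketched as follows. The tuple-based syntax representation and the mode names are hypothetical; copying from the above LCU line and offset prediction, which the notes also mention, are left out of this sketch.

```python
def resolve_sao_params(syntax, width_in_lcus):
    """Expand per-LCU syntax into the SAO parameters used by each LCU.

    syntax: list in raster-scan order of (mode, payload) tuples, where mode
            is "new" (payload carries the parameters), "left" or "up".
    """
    params = []
    for idx, (mode, payload) in enumerate(syntax):
        if mode == "left":
            if idx % width_in_lcus == 0:
                raise ValueError("no left neighbour in the first LCU column")
            params.append(params[idx - 1])
        elif mode == "up":
            if idx < width_in_lcus:
                raise ValueError("no above neighbour in the first LCU row")
            params.append(params[idx - width_in_lcus])
        else:
            params.append(payload)
    return params


# A picture that is 3 LCUs wide and 2 LCUs high.
example = [("new", "A"), ("left", None), ("new", "B"),
           ("up", None), ("up", None), ("left", None)]
resolved = resolve_sao_params(example, 3)
```

An encoder that keeps the HM4 region-based decision can express each region with one "new" entry followed by copies, which is one way to read implementation option (1) in the notes.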

JCTVC-G935 Cross-check for LCU based SAO from MediaTek and Qualcomm (JCTVC-
G831) by Samsung [E.Alshina, J.H.Park (Samsung)] [late]

5.5.4 Combined approaches

JCTVC-G211 Non-CE8.c.6: Multi-source SAO and ALF virtual boundary processing with
cross9x9 [C.-Y. Chen, C.-M. Fu, C.-Y. Tsai, Y.-W. Huang, S. Lei
(MediaTek), S. Esenlik, M. Narroschke, T. Wedi (Panasonic), I. S. Chong, M.
Karczewicz (Qualcomm)]
In HM-4.0, SAO requires 0.2 luma pixel line buffer (PLB) and 0.2 chroma PLB, and ALF requires 4.1
luma PLBs and four chroma PLBs. In JCTVC-F054 and JCTVC-F055, virtual boundary (VB) processing
was proposed to remove all line buffers for ALF and SAO, respectively. In JCTVC-F272, multi-source
SAO and ALF were proposed to reduce line buffers. In JCTVC-G206, the three prior proposals and using
snowflake5x5 and cross9x9 ALF shapes are combined as CE8.c.4-2 to remove all SAO and ALF line
buffers, and this contribution further improves the visual quality of CE8.c.4-2 without increasing any line
buffer. Due to the DF in HM-4.0, the luma VB and the chroma VB are set as three pixels and one pixel
above the horizontal LCU boundary, respectively, and processing each pixel on one side of a VB avoids
any data access from the other side of the VB unless the data can become available in time without using
any additional line buffer. When compared with the JCTVC-F900 anchor, this proposal reportedly
improves coding efficiency and achieves 0.0%, -0.2%, -0.5%, and -0.2% BD-rates for HE-AI, HE-RA,
HE-LDB, and HE-LDP, respectively, and the same visual quality. No VB artifact is observed.
Already discussed in context of CE8.c

JCTVC-G477 Non-CE8.c.6: Cross-check of MediaTek/Panasonic/Qualcomm's multi-
source SAO and ALF virtual boundary processing with cross9x9 JCTVC-
G211 [T. Yamakage, T. Watanabe (Toshiba)]

JCTVC-G212 Non-CE8.c.7: Single-source SAO and ALF virtual boundary processing with
cross9x9 [C.-Y. Chen, C.-M. Fu, C.-Y. Tsai, Y.-W. Huang, S. Lei
(MediaTek), S. Esenlik, M. Narroschke, T. Wedi (Panasonic), I. S. Chong, M.
Karczewicz (Qualcomm)]
In HM-4.0, SAO requires 0.2 luma pixel line buffer (PLB) and 0.2 chroma PLB, and ALF requires 4.1
luma PLBs and four chroma PLBs. In JCTVC-F054 and JCTVC-F055, virtual boundary (VB) processing
was proposed to remove all line buffers for ALF and SAO, respectively. In JCTVC-G206, the two prior
proposals and using snowflake5x5 and cross9x9 ALF shapes are first combined as CE8.c.4-1 to remove
all SAO and ALF line buffers, and this contribution provides two solutions to improve the visual quality
of CE8.c.4-1. Due to the DF in HM-4.0, the luma VB and the chroma VB are set as four pixels and two
pixels above the horizontal LCU boundary, respectively, and processing each pixel on one side of a VB
avoids any data access from the other side of the VB. Non-CE8.c.7-1 gives up SAO VB processing and
has SAO line buffers, while non-CE8.c.7-2 applies SAO VB processing method that can reduce 50%
SAO line buffers. Both solutions apply ALF VB processing that can remove all ALF line buffers. When
compared with the JCTVC-F900 anchor, both solutions reportedly improve coding efficiency and achieve
0.0%, -0.1%, -0.3%, and -0.0% BD-rates for HE-AI, HE-RA, HE-LDB, and HE-LDP, respectively, and
the same visual quality. No VB artifact is observed.
Already discussed in context of CE8.c

JCTVC-G337 Non-CE8.c.7: Crosscheck of Single-source SAO and ALF virtual boundary
processing (JCTVC-G212) [T. Ikai (Sharp)]

JCTVC-G654 Non-CE8: Cross-check of CE8.c Tool 7-1 (JCTVC-G212) on single-source
SAO and ALF virtual boundary processing with cross9x9 [P. Lai, F. C. A.
Fernandes (Samsung)]

JCTVC-G220 Non-CE8: Pure VLC for SAO and ALF [C.-Y. Tsai, C.-M. Fu, C.-Y. Chen,
C.-W. Hsu, Y.-W. Huang, S. Lei (MediaTek)]
In HM-4.0-dev-miscs, SAO and ALF parameters in APS and CU-level ALF-on/off flags in slice header
can be coded by CABAC. No other syntax elements in APS and slice header can be coded by CABAC. In
this contribution, it is proposed to use pure VLC for SAO and ALF and to remove byte alignment bits for
CABAC in APS. Simulation results reportedly show 0%, 0.1%, 0.2% and 0.2% coding efficiency gains
for HE-AI, HE-RA, HE-LDB, and HE-LDP, respectively, when the APS coding is changed from CABAC
to VLC. Simulation results also show no coding efficiency impact when the slice header coding is
changed from CABAC to VLC.

Use pure VLC for SAO and ALF and to remove byte alignment bits for CABAC in APS.
Use pure VLC for APS 0.0/0.1/0.2/0.2 Use pure VLC for slice header 0.0/0.0/0.0/0.0
From BoG: Many participants support this.

Decision: Adopt

JCTVC-G617 Non-CE8: Cross check of MediaTek's pure VLC for SAO and ALF [I. S.
Chong, M. Karczewicz (Qualcomm)]

JCTVC-G978 Cross check of syntax refinements for SAO and ALF in JCTVC-G566 [J.
Tanaka, T. Suzuki (Sony)] [late]

JCTVC-G608 Unified Deblocking and SAO [A. Segall, J. Zhao (Sharp)]


In current HEVC test model HM4.0, there are three in-loop filtering processes ― deblocking filter,
Sample Adaptive Offset (SAO) and Adaptive Loop filter (ALF). They are processed sequentially. SAO is
applied to the deblocked pictures. Due to dependency on future deblocked pixels, SAO cannot be done
immediately after a pixel is fully deblocked. This makes it hard to integrate SAO into a parallel deblocking
process.
This contribution proposes a simple modification to SAO classification process ― use horizontal
deblocked pixels instead of fully deblocked pixels for EO type SAO classification. This eliminates the
dependency of SAO of a pixel on future deblocked pixels, and makes it possible to integrate SAO into
parallel deblocking process. On average, there is only about 0.1% rate increase on coding performance.

If an SAO offset is known, the SAO process is simply an add operation, so it would be desirable to add the offset to a
pixel while the pixel is deblocked.
Use horizontal deblocked pixels instead of fully deblocked pixels for EO type SAO classification. For the
offset adding step, the offset is still added to the fully deblocked pixels. For BO type, no modification
needed.
Bitrates (losses) HE: -0.1/-0.1/0.0/xx LC: -0.1/-0.1/-0.1/xx
Runtime Enc HE: 99/99/100/xx LC: 100/100/99/xx Dec HE: 100/102/104/xx LC: 101/101/102/xx
Can this be implemented as one pass also at the encoder? Must be studied.
Some concern that this may complicate the encoder, whereas the benefit seems to be unclear. No action.

JCTVC-G1011 Cross-check results of the Unified Deblocking and SAO of Sharp (JCTVC-
G608) [M. Narroschke, S. Esenlik (Panasonic)] [late]

JCTVC-G684 Subjective Tests on ALF and SAO Using HM-4.0 [W.-S. Kim, O. G. Sezer,
M. Budagavi (TI)]
This contribution presents results of informal subjective tests conducted on two loop filtering operations
in HM-4.0: SAO and ALF. Test videos were presented at their actual frame rates to the viewers. Three
configurations were tested using low delay B high efficiency (LB-HE) condition: HM-4.0 Anchor vs.
SAO-off+ALF-off, HM-4.0 Anchor vs. SAO-off+ALF-on, and HM-4.0 Anchor vs. SAO-on+ALF-off. In
HM-4.0 Anchor, both SAO and ALF are activated (SAO-on+ALF-on). Among these three configurations,
SAO-on+ALF-off gives subjective quality results better than or comparable to the HM-4.0 Anchor. This
contribution requests that JCT-VC conduct a subjective quality test of SAO and ALF in a core experiment
setting to evaluate the subjective quality gains provided by the aforementioned tools, and decide on the
minimal set of configurations that provide the best subjective quality improvements, so that the optimal
performance can be achieved in the final standard in terms of implementation cost and visual quality.
Request subjective viewing on SAO and/or ALF
Was reported also in previous meeting.
Consider the possibility to conduct some formal subjective viewing in CE8 prior to the next meeting
(Minhua takes lead). It is reported that at least 5 companies in the San Jose area would be able to offer
viewing facilities for that. Vittorio should get involved in the preparation of such a viewing. To be tested:
ALF on/off with SAO on, possibly also both SAO/ALF on. Should be done with reasonable number of
viewers and test cases (i.e. sequences, bit rates).

5.6 Block structures and partitioning

5.6.1 NSQT (ref CE2)

JCTVC-G519 Non-CE2: Harmonization of implicit TU, AMP and NSQT [X. Zheng
(HiSilicon), Y. Yuan, Y. He (Tsinghua)]
This contribution provides a harmonization solution of implicit TU, AMP and NSQT. Experimental
results show that the proposed solution contributes average coding gain of 0.1% for RA, 0.1% for
RA_LC, 0.2% for LD_B, 0.1% for LD_B_LC, 0.1% for LD_P and 0.1% for LD_P_LC. Both encoder and
decoder complexity are same as HM4.0.
Decision: It is agreed that this harmonization is desirable. WD and software should be checked by WD
editor and Motorola (K. Panusupone). (confirm)

JCTVC-G852 Cross-check report on Harmonization of implicit TU, AMP and NSQT
(JCTVC-G519) [I.-K. Kim (Samsung)] [late]
Did not thoroughly check whether software and text match, but the code change is confirmed to
be minimal.

JCTVC-G521 Non-CE2: Non-square Hadamard transform for motion estimation and
merge estimation [X. Zheng, L. Liu (HiSilicon), Y. Yuan, Y. He (Tsinghua)]
This contribution provides a non-square Hadamard transform solution which is used in the motion estimation
and merge estimation processes. Different configurations of the non-square Hadamard transform are also
discussed in the contribution. Experimental results show that the non-square Hadamard transform can achieve an
average gain of 0.2% for RA, 0.2% for RA_LC, 0.2% for LD_B, 0.3% for LD_B_LC, 0.2% for LD_P
and 0.3% for LD_P_LC when inter 2x8 and 8x2 transforms are used in residual coding.
Presentation not uploaded.
Decision (SW): Adopt non-normative tools (but not the inclusion of 2x8/8x2 as also suggested in the
document)

JCTVC-G599 Non-CE2: Cross-Check of Non-square Hadamard transform for motion
estimation and merge estimation [S. Oudin, B. Bross (Fraunhofer HHI)] [late]

5.6.2 Partition mode/size coding

JCTVC-G151 Prediction and partition mode binarization for Low Delay P [X. Zhang, S.
Liu, S. Lei (MediaTek)]
This contribution reports a bug fix and a method for binarizing prediction and partition modes for Low
Delay P configuration. Firstly, a mismatch was found between WD and HM with regard to the prediction
and partition mode binarization for Low Delay P. With the bug fix in HM software, negligible (0.0%)
impact is reported on both BD-rate and encoding and decoding runtime. Secondly, it is proposed to unify
the prediction and partition mode binarization for Low Delay P and B, which simplifies both WD and
HM. Again, experimental results report negligible (0.0%) impact on both BD-rate and encoding and
decoding runtime.
Partially already discussed - see under G785.
Deviation between WD and software in binarization for P

JCTVC-G655 Cross-check of JCTVC-G151 on prediction and partition mode binarization
for Low Delay P [P. Lai, F. C. A. Fernandes (Samsung)]

JCTVC-G283 Residue Quad Tree Depth for Chroma in Intra Coding [X. Zhao, X. Guo, X.
Li, S. Lei (MediaTek), S. Ma, W. Gao (PKU)]
In HM4.0, luma and chroma components share the same maximum Residue Quad Tree (RQT) depth.
This contribution proposes to use separate maximum RQT depth for luma and chroma components in
intra coding. Specifically, a user-defined parameter is added in the sequence parameter set (SPS) to allow
independent RQT depth setting for chroma. It is reported that with setting the parameter as 0, average
BD-Rate reductions of 0.04%, 1.67% and 1.70% are achieved for Y, U and V in AI-HE, respectively, and
0.20%, 2.43% and 2.53% are achieved for Y, U and V in AI-LC, respectively. It is also reported that the
decoding time is slightly decreased.
Question: Is there an implicit mode for chroma? Currently not.

JCTVC-G798 Cross verification of MediaTek’s proposed residue quad-tree depth for intra
chroma coding (JCTVC-G283) [Y. Chiu, L. Xu (Intel)] [late]

JCTVC-G442 Improvement to chroma TU specification [A. Minezawa, K. Sugimoto, S.
Sekiguchi (Mitsubishi), A. Ichigaya, S. Sakaida (NHK)]
This contribution proposes an improvement to chroma TU structure of the current HM. The proposed
method introduces up to 32x32 TU size for chroma when LCU size is 64x64. In addition, it is proposed to
apply fixed depth for chroma TU instead of using the same TU depth with luma component. The
proposed method applies TU depth=0 for CU with 2Nx2N PU, and applies TU depth=1 for CU with
rectangular PU and NxN PU while TU split for Luma is unchanged from HM-4. The proposed method
achieves 2.0%, 2.1% and 1.0% chroma BD-rate gain for AI-HE, RA-HE and LD-HE, and 2.9%, 2.0% and
0.9% chroma BD-rate gain for AI-LC, RA-LC and LD-LC settings with 2-3% encoding time reduction.
Luma BD-rate gain is on average 0.1% and 0.2% for HE and LC settings.
Difference compared to G283: Fixed RQT depth vs. variable RQT depth.
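
The fixed chroma depth rule described for G442 can be sketched as follows. This is only an illustration of the rule as summarized above, assuming 4:2:0 sampling and ignoring any minimum TU size; the function names and the string labels for partition modes are hypothetical and not taken from the contribution or HM.

```python
# Hypothetical sketch of the fixed chroma TU depth rule summarized for JCTVC-G442.
# Assumptions: 4:2:0 sampling (chroma block is half the luma size in each
# dimension) and no minimum TU size limit.

def chroma_tu_depth(part_mode: str) -> int:
    """Depth 0 for a CU with a 2Nx2N PU, depth 1 for rectangular and NxN PUs."""
    return 0 if part_mode == "2Nx2N" else 1


def chroma_tu_size(luma_cu_size: int, part_mode: str) -> int:
    """Chroma TU width/height implied by the fixed depth."""
    chroma_cu_size = luma_cu_size // 2  # 4:2:0 assumption
    return chroma_cu_size >> chroma_tu_depth(part_mode)


if __name__ == "__main__":
    for mode in ("2Nx2N", "2NxN", "NxN"):
        print(mode, chroma_tu_size(64, mode))
```

With a 64x64 CU and a 2Nx2N PU this gives a 32x32 chroma TU, which corresponds to the "up to 32x32" case mentioned in the summary.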

JCTVC-G378 Cross Check Report for Improvement to chroma TU specification (G442)
[Y. Shibahara, T. Nishi (Panasonic)]

JCTVC-G605 Cross-check of Improvement to chroma TU specification (JCTVC-G442) [P.
Helle, B. Bross (Fraunhofer HHI)] [late]

JCTVC-G980 Evaluation of Limiting Chroma Transform Depth in RQT on HM4.0 [L.
Guo, X. Wang, M. Karczewicz (Qualcomm)] [late]
At the previous Geneva meeting (March 2011), contribution JCTVC-E377 proposed that the maximum
Chroma transform depth be limited relative to Luma component, so that Chroma component does not
have to share the same transform depth as Luma. In this contribution, the method described in E377 has
been implemented on top of HM 4.0. Simulation results show that by doing so an average BD-rate saving
of roughly X% can be obtained for U and V respectively with no loss on luma, using all three HE test
configurations.
Conclusion: Establish CE from G283, G442 and G980

JCTVC-G417 Improved coding of CU-level information in CAVLC [S. Y. Yi (KAU), H.
Choi (Hanbat Univ.), S.-C. Lim (ETRI), J.-G. Kim (KAU), H. Y. Kim, J.
Lee, J. S. Choi (ETRI)]
Does not need to be presented

JCTVC-G881 Cross-check for Improved coding of CU-level information in CAVLC
(JCTVC-G417) [T. Yamamoto (Sharp)] [late]

JCTVC-G709 On partition size information coding using CABAC [T. Yamamoto, K.
Misra, A. Segall (Sharp)]
This document describes a technique for CABAC complexity reduction while coding partition size
information in B slices. The document proposes the use of bypass coding engine of CABAC to code bins
of syntax element pred_type used to distinguish between PartMode (2NxN, 2NxnU, 2NxnD) as well as
bins used to distinguish between PartMode (Nx2N, nLx2N, nRx2N). This reduces the number of CABAC
contexts in memory and eliminates the associated CABAC context update step. It is asserted that this has
negligible impact on compression efficiency. For HM-4.0, high efficiency common test conditions the
proposed change shows an average BD bitrate impact of –
(Without class F sequences)
RA_HE Y:0.00% U:0.02% V:0.02%; LB_HE Y:-0.02% U:0.07% V:0.00%.
(With class F sequences)
RA_HE Y:0.00% U:0.01% V:0.00%; LB_HE Y:-0.04% U:0.09% V:-0.04%.
Presentation not uploaded.
Similar to G718 -> CE?

5.7 Motion compensation operation and interpolation filters

5.7.1 Interpolation filters and MV precision (ref CE3)

JCTVC-G062 non-CE3: 7-tap quarter-pel luma interpolation filter with accurate phase
shift [Hongbo Zhu]
A 7-tap 1/4-pel luma interpolation filter is proposed in this document. The simulations were
conducted under the common test condition [1] using the HM4.0 r1354. When combined with the
6-tap DCT-IF 1/2-pel filter {2,-9,39,39,-9,2}, the performance of the proposed filter is 0.0% for
he_ra, 0.0% for lc_ra, 0.0% for lb_he, -0.4% for lb_lc, 0.2% for lp_he and -0.3% for lp_lc in
bdbitrate. Basically, the 6H7Q filter shows gain for the high resolution sequences and shows loss on
the low resolution sequences (WVGA and WQVGA). When the half-pel filter is changed to DCT-
IF 8-tap {-1,4,-10,39,39,-10,4,-1}, the performance is -0.1% for he_ra, -0.2% for lc_ra, -0.2% for
lb_he, -0.7% for lb_lc, -0.1% for lp_he and -0.7% for lp_lc.
Major deviation from current design.
No cross-check
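
For orientation, the two half-pel filters quoted above can be applied as in the following sketch. The coefficients are the ones listed in the summary; everything else (function name, edge padding by sample repetition, normalization by 64 with rounding, no clipping to the sample bit depth) is an assumption made for illustration and is not taken from G062 or HM.

```python
# Illustrative use of the half-pel filters quoted in the G062 summary.
# Assumptions: edge samples are repeated for padding, the result is
# normalized by 64 with rounding, and no bit-depth clipping is applied.

HALF_PEL_6TAP = (2, -9, 39, 39, -9, 2)
HALF_PEL_8TAP = (-1, 4, -10, 39, 39, -10, 4, -1)


def half_pel(samples, pos, taps):
    """Interpolate the half-sample position between samples[pos] and samples[pos + 1]."""
    start = pos - (len(taps) // 2 - 1)  # leftmost sample touched by the filter
    acc = 0
    for k, coeff in enumerate(taps):
        idx = min(max(start + k, 0), len(samples) - 1)  # repeat edge samples
        acc += coeff * samples[idx]
    return (acc + 32) >> 6  # both filters sum to 64


if __name__ == "__main__":
    ramp = [10 * i for i in range(10)]
    print(half_pel(ramp, 4, HALF_PEL_6TAP), half_pel(ramp, 4, HALF_PEL_8TAP))
```

Both coefficient sets sum to 64, so a flat signal is reproduced unchanged and a linear ramp yields the midpoint of the two neighbouring samples.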

JCTVC-G131 PU-size dependent motion compensation filtering order [M. Budagavi, R. R.
Srinivasan (TI)]
This contribution presents a technique for reducing motion compensation (MC) cycles of rectangular PUs
by modifying MC filtering order depending on PU size. For the case of PU width < PU height, vertical
filtering is carried out first and then horizontal filtering instead of horizontal filtering followed by vertical
filtering as in HM-4.0. The filtering order is not changed for square PUs and rectangular PUs with PU
width > PU height. With the modified filtering order, it is claimed that MC computation cycles reduction
is in the range from 5% for 64x32 block to 35% for 16x4 block for the case when motion vector is
fractional in both x- and y-directions. The computation cycles for square PUs and rectangle PUs with PU
width > PU height do not change. The proposed change is normative for bit depths > 8 and is asserted to
reduce average number of MC cycles for both HE and LC settings. The BD-Rate for RA-HE, LB-HE,
LP-HE are all 0%. It is further claimed that if HEVC does not support 4x4 PU, 8x4 PU becomes worst
case block size from MC cycles point of view, and proposed technique will then reduce worst case MC
cycles too.
Presented Mon 28th evening.
Less relevant as we are not targeting 10 bit video (where the sequence H/V matters as intermediate
rounding needs to be performed to fit 16 bits). For 8 bit it could likely be non-normative.
Question: Was the 4x4 PU question discussed? (no, is one of the topics in JCTVC-G096 that is due for
plenary discussion).
Is related to the memory bandwidth / PU restrictions issue that will be studied in AHG.

JCTVC-G884 Verification of JCTVC-G131 on interpolation filter order [F. Bossen] [late]

JCTVC-G259 Modifications of LCU-based adaptive interpolation filter [S. Matsuo, S.
Takamura, H. Jozawa (NTT)]
This document reports two modifications of an LCU-based adaptive interpolation filter described in
JCTVC-G258. First modification is that the proposed filter is applied to chrominance components (Cb
and Cr) as well as luminance component (Y). Second modification is a prediction method for luminance
filter coefficients. The proposal was implemented in HM4.0 software to evaluate its performance.
Compared to the HM4.0 anchor, the overall average coding gain for Y, Cb and Cr were about 0.49%,
0.38% and 0.37%, respectively. The maximum coding gain of Y was about 10.2% for the sequence
“Vidyo3” in LP-LC case. The maximum coding gains of Cb and Cr were about 7.1% and 5.7% for the
sequence “Nebuta” in RA-LC case, respectively. The computational complexity at the encoder and
decoder were 101.42% and 102.51% on average, respectively.
Presented Mon 28th evening.
Adaptation at LCU level
Benefit over G258 (from CE3) for chroma.
Further study

JCTVC-G347 Cross-check report of NTT's interpolation filter approach (JCTVC-G259)
[T. Yoshino, S. Naito (KDDI)]

JCTVC-G392 Non-CE3: Report on a restriction for small block [K. Kondo, T. Suzuki
(Sony)]
This contribution reports the results of restricting small block sizes. To cut worst-case complexity, the
contribution tested three restrictions on small PU sizes that require no decoder change. In case A,
PU sizes 8x4, 4x8 and 8x8 are restricted for bi-prediction. In case B, PU sizes 8x4 and 4x8
are restricted. In case C, 8x4 and 4x8 are restricted for bi-prediction. With these restrictions, the
impact on coding efficiency is shown to be 2.2%, 1.4% and 0.3% for cases A, B and C, respectively.
Analysis shows that for large picture resolutions the loss by restricting the PU size is much less even for
case A (e.g. class A only 1%). Case A allows bandwidth reduction of around 50%. This indicates it is
very likely that small PU size restrictions for high resolutions could be meaningful.
Further study in AHG (see under CE3)
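
The link between small bi-predicted PUs and worst-case memory bandwidth can be illustrated with a simple count of reference samples fetched per predicted sample. The sketch below is a luma-only model assuming an 8-tap separable filter and ignoring chroma, memory alignment and caching; it is not the bandwidth model used in G392, and the block-size lists and names are hypothetical.

```python
# Rough luma-only model of motion compensation reference fetches.
# Assumptions: 8-tap separable filter, fractional MV in both directions,
# no chroma, no alignment or cache effects.

TAPS = 8


def ref_samples_per_pixel(width, height, bi_pred):
    """Reference samples fetched per predicted sample for one PU."""
    fetched = (width + TAPS - 1) * (height + TAPS - 1)
    return fetched * (2 if bi_pred else 1) / (width * height)


def worst_case(allowed):
    """Largest fetch ratio over a list of (width, height, bi_pred) tuples."""
    return max(ref_samples_per_pixel(w, h, bi) for (w, h, bi) in allowed)


SMALL = [(8, 4), (4, 8), (8, 8)]
LARGER = [(16, 8), (8, 16)]

# Anchor: uni- and bi-prediction allowed for every size listed.
ANCHOR = [(w, h, bi) for (w, h) in SMALL + LARGER for bi in (False, True)]
# Case A: bi-prediction disallowed for 8x4, 4x8 and 8x8.
CASE_A = [(w, h, False) for (w, h) in SMALL] + [
    (w, h, bi) for (w, h) in LARGER for bi in (False, True)
]

if __name__ == "__main__":
    a, c = worst_case(ANCHOR), worst_case(CASE_A)
    print(f"anchor {a:.2f}, case A {c:.2f}, reduction {100 * (1 - c / a):.0f}%")
```

Under this simplified count, case A roughly halves the worst case, which is in line with the "around 50%" noted above, although the contribution's own figures are not reproduced here.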

JCTVC-G429 Non-CE3: Cross-check report of Sony proposal on restriction for small
block (JCTVC-G392) [T. Chujoh (Toshiba)]

JCTVC-G600 Non-CE3: Adaptive Motion Vector Resolution based on the PU Size [J.
Jung, J. Heo, S. Yea (LG)]
This contribution proposes an adaptive mechanism for threshold selection at a PU level in the Progressive
Motion Vector Resolution (PMVR) method proposed by MediaTek. The result shows it improves the
coding efficiency of the PMVR method thanks to its PU-level adaptation of threshold values. This
contribution presents the result of the proposed scheme implemented on the PMVR method without 1/8
MV resolution. The Y BD-rate gains with respect to HM4.0 were -0.2% for RA-HE, -0.2% for RA-LC,
-0.1% for LB-HE, -0.2% for LB-LC, 0.0% for LP-HE, and 0.0% for LP-LC. The Y BD-rate gains with
respect to the PMVR without 1/8-pel with Th=2 were 0.0% for RA-HE, 0.0% for RA-LC, 0.0% for
LB-HE, 0.0% for LB-LC, -0.1% for LP-HE, and -0.1% for LP-LC.
Expectation that finer resolution of MVs is rather for large PU sizes.
The statistics plot shown may not indicate that this assumption is justified
Currently very small gain only for LD P.
No action.

JCTVC-G698 On Chroma interpolation filter [J. Lou, K. Minoo, L. Wang, A. Luthra
(Motorola Mobility)]
This contribution document proposes a new 4-tap half-pel Chroma interpolation filter for HEVC.
Compared to the current half-pel Chroma interpolation filter, better Rate-Distortion performance could be
achieved. When combined with the default HM Luma filters, it could achieve -0.01%/-0.77%/-0.94%
bitrate savings on Y/U/V. When combined with the 8H7Q filters with 1/4 offset, it could achieve
-0.03%/-0.67%/-0.80% bitrate savings on Y/U/V. When combined with the 8H7Q filters with 3/16 offset,
it could achieve -0.03%/-0.62%/-0.77% bitrate savings on Y/U/V. The cross-check will be provided by
Samsung.
Chroma may break down
Filters are more complex (multiplication by 7 instead of 4, 39 instead of 36), but this does not seem to be
a big deal.
Preserves more high frequencies – is this useful for chroma?
Definitely, the filters could have significant impact on visual quality, but no subjective assessment was
performed by proponents, and cannot be done during the meeting
In SAO off case, there are some color artifacts with current filters
Further study: CE

JCTVC-G699 Motion vector scaling for non-uniform interpolation offset [J. Lou, K.
Minoo, L. Wang (Motorola Mobility)]
Non-uniform motion vector grid was proposed in the last Torino meeting. This contribution document
addresses the problem of motion vector scaling for non-uniform motion vector grid, since reusing the
motion vector scaling for uniform motion vector grid might give slightly different motion vector
predictors. Cross-check will be provided by Samsung. The attached spreadsheet contains detailed data of
the results.
Informative - no action required.

JCTVC-G825 Cross-check for Motorola Mobility proposals on MV scaling by Samsung [E.
Alshina, J.H. Park] [late]

JCTVC-G736 On luma/chroma interpolation precision [M. Coban, M. Karczewicz
(Qualcomm)]
This contribution presents a 16-bit biprediction luma/chroma interpolation process with negligible impact
on coding efficiency. In the current design, the intermediate values for calculating the biprediction sample
values have 17-bit values that are converted to 16-bit values with the use of an offset. When all the
intermediate values are reduced to 16-bit, there is no drop in performance. Simulation results show 0.0%,
0.0%, and 0.0% BD rate change for RA_HE, LB_HE and LP_HE cases, respectively.
No interest expressed by experts in the room – no action.
(also no cross-checker available to confirm or support this)

JCTVC-G693 Cross-check of Qualcomm's proposal on interpolation precision (G736) [Y.
H. Tan, C. Yeo (I2R)] [late]

JCTVC-G770 Non-CE3: A restriction of motion vector for small block size [T. Chujoh,
T. Yamakage (Toshiba)]
An experimental result of restricting motion vectors for small block sizes is reported. This is a
non-normative technology to reduce memory bandwidth for motion compensation. The worst cases of
memory bandwidth of the interpolation process are two-dimensional interpolation positions for both
luminance and chrominance of a bi-prediction block. Therefore, in order to reduce the worst-case
memory bandwidth, two encoding methods are introduced: in the first, at least one motion vector of L0 or L1 is
restricted to the integer position for both luma and chroma when the block size of bi-prediction is less
than 8x8; in the second, the 4x8 bi-prediction block is additionally prohibited on top of the first method.
In the experiments, coding efficiency is lost by an average of 0.33% and 0.38%, respectively.
The worst-case memory bandwidth is reduced by more than 25% and 31%, respectively, compared to
the anchor.
Similar to G392. In addition, restriction of fractional position for chroma is also introduced. Again, the
loss due to restrictions becomes lower for the larger picture sizes (class A)
Study in AHG

JCTVC-G898 Non-CE3: Cross check for Toshiba's proposal (JCTVC-G770) [K.
Kondo, T. Suzuki (Sony)] [late]

JCTVC-G931 Non CE3: Cross-check for memory band-width reduction from Toshiba
(G-770) by Samsung [E. Alshina, J.H. Park (Samsung)] [late]

JCTVC-G806 Non CE3: On the phase offset selection for motion compensation
interpolation filters [K. Minoo, D. Baylon, J. Lou]
In this contribution 4 sets of filters are introduced and used to conduct motion compensation with quarter-
pixel motion resolution (i.e. four level sub-pixel signaling and storing). The choice of filter set is decided
based on the sub-pixel information of the motion vector predictor (stored at quarter-pel resolution).
An overall gain of 0.82% was observed for Luma. (A 0.2% gain was also observed for each of the Chroma
components.)
Presentation not uploaded. Various graph plots are shown that are not included in the contribution to
motivate the idea, but are difficult to relate to the word file.
Selection of phase offset for each sub-pixel position is somehow based on the conditional distribution of
MV for a given MVP (including such positions as e.g. 3/16), but it is not explicitly said how.
Number of filters increased to 9. Main gain in LD P, in the other cases it is typically 0.3%.
Some interest expressed by cross-checker. No action.

JCTVC-G993 Non-CE3: Cross-check of Motorola Mobility's motion compensation
interpolation filter (JCTVC-G806) by Qualcomm [L. Guo (Qualcomm)]
[late]

JCTVC-G807 Non CE3: On design of interpolation filters for Uni and Bi Predictive Motion
Compensation [K. Minoo, D. Baylon, J. Lou] [late]
In this contribution, the authors propose to perform motion compensation with a mix of filters currently
proposed in JCTVC-G806 and JCTVC-G697. In essence, it is proposed to use the 6-tap filters proposed in
JCTVC-G697 for bi-prediction and the 7- and 8-tap filters proposed in JCTVC-G806 for uni-prediction
motion compensation. On average a gain of 1.2% was observed for Luma (and gains of 0.4% and 0.3% for the U and
V color components, respectively).
Some interest expressed by cross-checker. No action.

JCTVC-G994 Non-CE3: Cross-check of Motorola Mobility's Interpolation Filter for Uni
and Bi Predictive Motion Compensation (JCTVC-G807) by Qualcomm [L.
Guo (Qualcomm)] [late]

JCTVC-G824 Non CE3: Cross-check for Motorola Mobility proposals on Chroma
interpolation filter by Samsung [E. Alshina, J.H. Park] [late]

5.7.2 Prediction methods

JCTVC-G065 Improved Weighted Prediction [Y. Ye, E. S. Ryu (InterDigital)]
This contribution presents a modified weighted prediction process for bi-prediction. The modified WP
process improves accuracy by performing rounding operation only once for bi-prediction. Using the
fading sequences provided by the weighted prediction AhG at the July meeting, it is reported that, in the
RA_LC setting, on average 0.3 to 0.6% of BD rate reduction is achieved for luma, and 1.1% to 1.5% of
BD rate reduction for chroma; in the LD_LC setting, on average 0.6 to 0.8% of BD rate reduction is
achieved for luma, and 3.6 to 4.6% of BD rate reduction for chroma. The proposed modification does not
affect the weighted prediction process for input signal with bit depth larger than 8 bits.
Gets rid of one rounding, this is most likely where the gain comes from
Highest gain at higher rates
Current WP (as well as this one) has 32 bit precision – is this really necessary – would it be desirable to
reduce this?
Several experts express support for this
Decision: Adopt
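
The effect of rounding once instead of twice can be seen in a toy fixed-point example. The sketch below is not the G065 text or the HM weighted prediction code: the 6 extra fractional bits of the intermediate samples, the omission of offsets and clipping, and the function names are all assumptions chosen only to show where the second rounding loses accuracy.

```python
# Toy comparison of bi-directional weighted prediction with two roundings
# versus a single rounding. Assumptions: intermediate samples carry
# EXTRA_BITS fractional bits, weights have log_wd fractional bits, and
# offsets and clipping are omitted.

EXTRA_BITS = 6


def bi_wp_two_roundings(a, b, w0, w1, log_wd):
    """Round each intermediate sample to integer precision, then weight and round again."""
    half = 1 << (EXTRA_BITS - 1)
    p0 = (a + half) >> EXTRA_BITS
    p1 = (b + half) >> EXTRA_BITS
    return (w0 * p0 + w1 * p1 + (1 << log_wd)) >> (log_wd + 1)


def bi_wp_single_rounding(a, b, w0, w1, log_wd):
    """Weight the high-precision samples directly and round only once."""
    shift = log_wd + 1 + EXTRA_BITS
    return (w0 * a + w1 * b + (1 << (shift - 1))) >> shift


if __name__ == "__main__":
    a, b = 6432, 6400  # 100.5 and 100.0 with 6 fractional bits
    print(bi_wp_two_roundings(a, b, 64, 64, 6), bi_wp_single_rounding(a, b, 64, 64, 6))
```

For these inputs the exact average is 100.25; the two-rounding path returns 101 while the single-rounding path returns 100.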

JCTVC-G524 On weighted prediction [A. Tanizawa, T. Chujoh, T. Yamakage (Toshiba)]
[late]
Cross-check on JCTVC-G065 – confirms and supports

JCTVC-G307 Bi-prediction restriction in small PU [T. Ikai (Sharp)]
This contribution presents a bi-prediction restriction to reduce motion compensation complexity. In the
proposed method, bi-prediction of 4x8 and 8x4 PUs is prohibited, with changes to the inter_pred_flag syntax
and the merge motion parameter derivation process. It is reported that the BD-rate change is 0.0%, 0.1%, 0.0% and
0.0% for the RA-HE, LB-HE, RA-LC and LB-LC conditions, respectively. Encoding time is reduced by about 5% and 7%
compared to the HM4.0 anchor in the Hier-B case and B case, respectively. Decoding time shows no impact. The
proposal was cross-checked by TI (JCTVC-G289).
This would be a normative change by prohibiting syntax element and change of the derivation process.
Compared to Sony’s results, where a significant loss occurred by disabling the small PU sizes, it is
surprising that we have no loss here – should be further investigated.
Study in AHG (the scope of the AHG is not restricted to “non-normative level only” changes).

JCTVC-G289 Cross-verification of Sharp’s proposal JCTVC-G307 on Bi-prediction
restriction in small PUs [M. Zhou (TI)]

JCTVC-G415 MC Complexity Reduction for Bi-prediction [H. Y. Kim (ETRI), K. Y. Kim,
G. H. Park (KHU), S.-C. Lim, J. Lee, J. S. Choi (ETRI)]
This contribution reports that roughly 27% and 6% area of forward B-Slices are observed to have
identical motion information within each PU, under HM4 LD and RA configuration, respectively. It is
proposed that when the L0 and L1 motion information is the same, the L1 interpolation process and the
weighted averaging process in HM4 should be bypassed for complexity reduction. It is reported that
average decoding time reduction of 4% was achieved for LD configurations without any change in coding
results.

JCTVC-G874 Cross-check MC Complexity Reduction for Bi-prediction (JCTVC-G415)
from ETRI [Patrice Onno (Canon)] [late]

JCTVC-G438 On complexity reduction of bi-prediction for identical motion [A. Tanizawa,
T. Shiodera, T. Chujoh, T. Yamakage (Toshiba)]
This contribution reports results of complexity reduction of bi-prediction for identical motion described in
JCTVC-F325. At the last JCT-VC Torino meeting, the implementation technique of complexity reduction
by changing bi-prediction to uni-prediction in case of identical motion was proposed. This contribution
provides a suitable source code and experimental results. Experimental results show that average
decoding times are 99%, 99%, 95% and 95% for RA-HE, RA-LC, LB-HE and LB-LC, respectively. This
scheme does not affect bitstreams, so the BD-bitrate is identical to the HM anchor.
Identical to G415.
Decision (SW): Adopt G415 and G438 tentatively. Leave it to the discretion of the software coordinator
whether this non-normative change of decoder optimization is desirable in reference software (Note: F356
was the same and with same conclusion).
(After the Torino meeting, software was provided but no answer was received).
HM description text shall also be provided.
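
The decoder-side shortcut discussed for G415 and G438 can be sketched as follows. This is a simplified illustration rather than the HM implementation: the `interp` callback, the use of a picture order count to identify the reference picture, and the plain rounded average without weighted prediction or higher intermediate precision are assumptions.

```python
# Sketch of skipping the second interpolation when L0 and L1 motion is identical.
# Assumptions: `interp(ref_poc, mv)` is a hypothetical motion compensation
# callback returning a list of predicted samples; default (unweighted)
# averaging is used.

def bi_predict(interp, mv0, ref_poc0, mv1, ref_poc1):
    """Return the bi-predicted block, falling back to uni-prediction when motion is identical."""
    p0 = interp(ref_poc0, mv0)
    if mv0 == mv1 and ref_poc0 == ref_poc1:
        # Averaging a block with itself gives the same block, so the L1
        # interpolation and the averaging step can be bypassed.
        return p0
    p1 = interp(ref_poc1, mv1)
    return [(a + b + 1) >> 1 for a, b in zip(p0, p1)]
```

Because `(a + a + 1) >> 1` equals `a`, the shortcut changes only the amount of decoder work and not the reconstructed samples in this simplified model.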

JCTVC-G441 Redundancy removal of explicit weighted prediction syntax [A. Tanizawa, T.
Chujoh, T. Yamakage (Toshiba)]
This document presents a redundancy removal of explicit weighted prediction syntax. This scheme has
two proposals. First one (proposal 1) is to modify redundant pred_weight_table syntax for identical
reference frames. Second one (proposal 2) is to introduce the simple predictions to syntax elements in
pred_weight_table syntax in order to reduce the redundant representation. The experimental results in
HM software version 4.0 (HM-4.0-dev) with weighted prediction under common test conditions are
reported. The results show that the BD-rate of proposal 1 is 0.3% and 2.5% for RA-HE and LDB-HE on
average, respectively. The BD-rate of proposal 2 is 0.4%, 1.0% and 0.6% for RA-HE, LDB-HE and LDP-HE
on average, respectively. The BD-rate of the combination of proposals 1 and 2 is 0.7%, 3.0% and 0.6% for
RA-HE, LDB-HE and LDP-HE on average. It is reported that these schemes do not affect the encoding
and decoding time.
Several experts support this
Decision: Adopt to WD conditional on availability of a confirmed statement about the alignment between
software and WD text by the cross-checker (which is not in the current version of G525)

JCTVC-G525 Cross-check of JCTVC-G441: Redundancy removal of explicit weighted
prediction syntax [P. Bordes, P. Salmon (Technicolor)] [late]

JCTVC-G858 Crosscheck of JCTVC-G549 proposed by Samsung [X. Zheng] [late]

5.8 Motion and mode coding


General remark: In AMVP, WD text and software are already quite complex, such that there is a certain
danger that mismatches occur (e.g. in candidate pruning process). For stability, we should be very careful
in adding in more small changes unless they are well justified; simplifications are welcome (see also
discussion under CE9/13)
Prepared in BoG,

5.8.1 AMVP/skip/merge (related to CE9/13)


The proposals in this category were first reviewed in a BoG (JCTVC-G1006), and decisions were taken in
track A. The BoG recommended a classification as follows:

A) Simplifications

A.1 Merge spatial candidate positions


JCTVC-G416 CU-based Merge Candidate List Construction [H. Y. Kim (ETRI), K. Y. Kim, S. M. Kim,
G. H. Park (KHU), S.-C. Lim, J. Lee, J. S. Choi (ETRI)]
JCTVC-G516 On Spatial MV Prediction [K. Sato (Sony)] [late]
Conclusion: No action

A.2 Merge partition redundancy removal


JCTVC-G181 Non-CE9: Merging candidate reordering [H. Takehara, S. Fukushima (JVC Kenwood)]
JCTVC-G681 Non-CE9: Simplified Merge candidate derivation [Y. Zheng, X. Wang, W.-J. Chien, M.
Karczewicz (Qualcomm)]
JCTVC-G593 Non-CE13: Simplification of merge mode [O. Bici, J. Lainema, K. Ugur (Nokia)]
JCTVC-G542 Non-CE9/Non-CE13: Simplification on AMVP/Merge [T. Sugio, T. Nishi(Panasonic)]
JCTVC-G416 CU-based Merge Candidate List Construction [H. Y. Kim (ETRI), K. Y. Kim, S. M. Kim,
G. H. Park (KHU), S.-C. Lim, J. Lee, J. S. Choi (ETRI)]
Decision:
1. Adopt the common part of all proposals (best expressed by removing void_merge_candidate
function in HM software).
2. From G681, G593 and G542: For 2NxN and Nx2N (and likewise NxLN etc. in AMP) remove the
1st PU candidate for the 2nd PU. (e.g. refer to Fig. 1 in G681)

A.3 1st pruning
JCTVC-G593 Non-CE13: Simplification of merge mode [O. Bici, J. Lainema, K. Ugur (Nokia)]
JCTVC-G241 Non-CE9: On parallel derivation of the temporal predictor for Merge/Skip modes [G.
Laroche, T. Poirier, P. Onno (Canon)]
JCTVC-G221 Non-CE13: The maximum number of merging candidates in P-slice [H. Nakamura, S.
Fukushima, H. Takehara, M. Ueda (JVC Kenwood)]
Conclusion: See under individual docs

A.4 Merge additional candidates (combined, non-scaled, zero mv,...)


JCTVC-G288 Non-CE13: Complexity Reduction of MVP List Construction [Y. Yu, T. Hellman
(Broadcom)]
JCTVC-G593 Non-CE13: Simplification of merge mode [O. Bici, J. Lainema, K. Ugur (Nokia)]
JCTVC-G396 Non-CE9: swapping of merge candidate [C. Kim, Y. Jeon, B. Jeon (LG)]
JCTVC-G542 Non-CE9/Non-CE13: Simplification on AMVP/Merge [T. Sugio, T. Nishi(Panasonic)]
JCTVC-G397 Non-CE9/Non-CE13: Simplification of adding new merge candidates [B. Li (USTC), J. Xu
(Microsoft), H. Li (USTC)]
JCTVC-G683 Non-CE13: Simplification and improvement of additional merge candidate [Y. Zheng, X.
Wang, W.-J. Chien, M. Karczewicz (Qualcomm)]
JCTVC-G690 Non-CE9: Some possible motion vector coding related simplifications [Y. H. Tan, C. Yeo,
Z. Li (I2R)]
Conclusion:
G397 suggests to remove non-scaled candidates. Several experts express opinion that this is useful
Several proposals to avoid duplicate check
Side activity (Minhua), try to identify relation with T10, relation and commonalities of different methods,
come with text and software and results of possible unified solution.
(Result of this activity is captured in a revision of JCTVC-G1006):
Overall combination (G397) reports 0.1% loss in LD B case (mainly by class E)
If combined with simplification 3 of G542 (removing the second pruning step entirely) there is no loss
anymore
Two experts express that for LD P the encoding time may increase (which could be resolved by
restricting the number of candidates). One other expert reports that he has tested LD P, and found a 4%
increase in encoding time, whereas the BR reduces by 0.2% (unofficial result).
Decoder would be simplified by omitting the pruning step. Somehow yes, but can increase the number of
zero MV candidates, maximum (worst case) number of candidates not changed. WD text is simplified by
omitting limit and duplication check.
Decision: Adopt G397+G542 simplification 3.
(Note: G683 is similar as in removing 2nd pruning)

A.5 Merge temporal candidate refidx derivation

JCTVC-G163 Non-CE9: simplification of merge/skip TMVP ref_idx derivation [Y. Jeon, S. Park, J. Park,
B. Jeon (LG)]
JCTVC-G217 Non-CE9: Derivation process of reference indices for temporal merge candidates [H.
Nakamura, S. Fukushima (JVC Kenwood)]
JCTVC-G552 Simplification of temporal motion vector (TMVP) candidate derivation for Merge and
AMVP [I.-K. Kim, N. Shlyakhov, J. H. Park(Samsung)]
JCTVC-G592 Non-CE9: Removal of reference index derivation for TMVP in merge mode [O. Bici, J.
Lainema, K. Ugur (Nokia)]
JCTVC-G690 Non-CE9: Some possible motion vector coding related simplifications [Y. H. Tan, C. Yeo,
Z. Li (I2R)]
Conclusion: See under G163

A.6 Temporal candidate position


JCTVC-G082 non-CE9: Modified H positions for memory bandwidth reduction in TMVP derivation [M.
Zhou (TI)]
JCTVC-G552 Simplification of temporal motion vector (TMVP) candidate derivation for Merge and
AMVP [I.-K. Kim, N. Shlyakhov, J. H. Park(Samsung)]
Conclusion: See under G082

A.7 MV scaling
JCTVC-G223 Non-CE9: Division-free MV scaling [T.-D. Chuang, Y.-W. Chen, J.-L. Lin, C.-Y. Chen,
Y.-W. Huang, S. Lei (MediaTek)]
JCTVC-G541 Non-CE9: Simplified scaling calculation method for temporal/spatial MVP of
AMVP/Merge [T. Sugio, T. Nishi(Panasonic)]
JCTVC-G551 Restriction on motion vector scaling for Merge and AMVP [I.-K. Kim, Y. Park, N.
Shlyakhov, J. H. Park (Samsung)]
Decision: Adopt the change of scaling factor clipping range to a value of 16 (one of the options suggested
in G223)
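
As a rough illustration of what a clipping range of 16 means for the scaling factor, the following sketch uses a fixed-point form with 8 fractional bits, in which a factor of 256 represents 1.0 and the clip to [-4096, 4095] limits the factor to about 16 in magnitude. The constants follow the form of temporal MV scaling as it appears in the later HEVC design; they are assumptions for this sketch and are not taken from G223 or from this report, and the function names are hypothetical.

```python
# Fixed-point MV scaling sketch with the scale factor clipped to about +/-16.
# Assumptions: tb and td are picture order count distances (td != 0), the
# scale factor has 8 fractional bits, and all constants are illustrative.

def clip3(lo, hi, x):
    return max(lo, min(hi, x))


def div_trunc(n, d):
    """Integer division truncating toward zero, as in C."""
    q = abs(n) // abs(d)
    return q if (n < 0) == (d < 0) else -q


def scale_mv(mv, tb, td):
    """Scale one MV component by approximately tb / td."""
    tb = clip3(-128, 127, tb)
    td = clip3(-128, 127, td)
    tx = div_trunc(16384 + (abs(td) >> 1), td)
    dist_scale = clip3(-4096, 4095, (tb * tx + 32) >> 6)  # 4096 / 256 = 16
    prod = dist_scale * mv
    sign = -1 if prod < 0 else 1
    return clip3(-32768, 32767, sign * ((abs(prod) + 127) >> 8))


if __name__ == "__main__":
    print(scale_mv(8, 2, 1), scale_mv(4, 64, 1))
```

In the second call the distance ratio is 64, but the clipped factor limits the result to 64, i.e. just under 16 times the input component.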

A.8 AMVP
JCTVC-G182 Non-CE9: AMVP syntax for bi-prediction [H. Takehara, S. Fukushima (JVC Kenwood)]
JCTVC-G516 On Spatial MV Prediction [K. Sato (Sony)] [late]
JCTVC-G710 Non-CE9: The Parallel Friendly MVP Candidate Calculation for HEVC [Y. Yu, K.
Panusopone, L. Wang (Motorola Mobility)]
JCTVC-G712 Non-CE9: The Simplification of MVP for HEVC [Y. Yu, K. Panusopone, L. Wang
(Motorola Mobility)]
JCTVC-G219 Non-CE9: Construction of MVP list without using scaling [H. Nakamura, S. Fukushima
(JVC Kenwood)]
JCTVC-G542 Non-CE9/Non-CE13: Simplification on AMVP/Merge [T. Sugio, T. Nishi(Panasonic)]
Decision: Adopt simplification 2 from G542, no action on G219

A.9 Misc
JCTVC-G134 Motion Vector Predictor Candidate Clipping Removal [M. Coban, M. Karczewicz
(Qualcomm)]
Conclusion: See under G134

B) Parallel merge
JCTVC-G164 Non-CE9: improvement on parallelized merge/skip mode [Y. Jeon, S. Park, B. Jeon (LG)]
JCTVC-G387 Non-CE9 Parallel Merge/skip Mode for HEVC [X. Wen, O. Au, W. Dai, C. Pang, J. Dai,
F. Zou, X. Zhan (HKUST)]
JCTVC-G416 CU-based Merge Candidate List Construction [H. Y. Kim (ETRI), K. Y. Kim, S. M. Kim,
G. H. Park (KHU), S.-C. Lim, J. Lee, J. S. Choi (ETRI)]
Conclusion: See under G164
C) Coding efficiency improvements
JCTVC-G165 Non-CE9/Non-CE13: new MVP positions for merge/skip modes and its combination with
replacing redundant MVPs [Y. Jeon, S. Park, B. Jeon (LG), J.-L. Lin, Y.-W. Chen, Y.-W. Huang, S. Lei
(MediaTek)]
JCTVC-G195 Non-CE9/13: Averaged merge candidate [S. Shimada, K. Kazui, J. Koyama, A. Nakagawa
(Fujitsu)]
JCTVC-G305 Non-CE9: Bi-prediction for low delay coding [Y. Suzuki, A. Fujibayashi (NTT
DOCOMO)]
JCTVC-G343 Non-CE9: Improvement in temporal candidate of merge mode and AMVP [N. Zhang, X.
Fan, S. Ma, D. Zhao (Harbin Inst. Tech.)]
JCTVC-G224 Non-CE13: Multiple-scaled merging candidates [H. Nakamura, S. Fukushima (JVC
Kenwood)]
JCTVC-G787 Non-CE13: Additional merge candidates with MV dependent offsets [T. Lee, J. Chen, J. H.
Park (Samsung), G. Laroche, P. Onno (Canon), J.-L. Lin, Y.-W. Chen, Y.-W. Huang, S. Lei (MediaTek)]
Conclusion: No action

Further details and dispositions under the individual documents as follows (not per category)

JCTVC-G082 non-CE9: Modified H positions for memory bandwidth reduction in TMVP
derivation [M. Zhou (TI)]
This contribution advocates modification of the H TMVP positions to avoid co-located motion data
required for TMVP derivation of the current LCU going beyond the current LCU row. In the proposed
design, the H TMVP positions around the bottom boundary of the current LCU are replaced with the closest
ones inside the current LCU row (configuration 1), or with C3 TMVP positions (configuration 2). The
proposed algorithm enables LCU-aligned motion data compression, storage and fetch, and can reduce the
memory bandwidth for the TMVP derivation by half. The measured BD-rate penalty is 0.1% in RA-HE
and RA-LC, 0.2% in LB-HE and LB-LC for configuration 1, and 0.2% in RA-HE and RA-LC, 0.3% in
LB-HE and LB-LC for configuration 2.
Slides not uploaded

It was requested to discuss the need/trade-off of memory bandwidth reduction that may vary depending
on the implementation.
Simplification on the temporal candidate position
Several experts expressed support for this. Decision: Adopt configuration 2, which is simpler and has less BD
rate loss (only 0.05%) according to the cross-checker.
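
The position substitution of configuration 2 can be sketched as follows. This is a minimal illustration only: it assumes that H is the bottom-right neighbour of the PU at (x + w, y + h), that C3 is the PU centre, and that LCU rows start at multiples of the LCU size. The function name and the coordinate conventions are hypothetical and are not taken from G082 or the WD text.

```python
def tmvp_position(x, y, w, h, lcu_size):
    """Pick the co-located position to fetch for TMVP derivation.

    Assumptions (not from G082): H is the bottom-right neighbour
    (x + w, y + h) and C3 is the PU centre (x + w // 2, y + h // 2).
    """
    h_pos = (x + w, y + h)
    c3_pos = (x + w // 2, y + h // 2)
    # H leaves the current LCU row when its LCU-row index differs
    # from the LCU-row index of the PU itself.
    if h_pos[1] // lcu_size != y // lcu_size:
        return c3_pos  # configuration 2: fall back to the C3 position
    return h_pos
```

With a 64x64 LCU, a 16x16 PU at (0, 48) touches the bottom LCU boundary, so the sketch returns the centre position (8, 56) instead of (16, 64).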

JCTVC-G274 Non-CE9: cross-check of contribution JCTVC-G082 (TI) on modified H
TMVP position [J. Jung (Orange Labs)]
Confirmed.
This verification contribution confirms the results obtained by TI in JCTVC-G082. According to the
results obtained, it is recommended to adopt configuration 2, which provides results similar to
configuration 1 while allowing cleaner WD text and software.

JCTVC-G134 Motion Vector Predictor Candidate Clipping Removal [M. Coban, M.
Karczewicz (Qualcomm)]
This contribution presents a modification to the motion vector predictor (MVP) candidate derivation
process that removes the final normative motion vector clipping operation. In the current HM design,
motion vector predictor values are clipped using the non-normative motion vector clipping operation in
the motion compensation process. Removal of the motion vector clipping operation on the MVP
candidates reduces MVP derivation complexity with negligible impact on coding efficiency. Simulation
results show 0.0%, 0.0%, and 0.0% BD rate change for RA_HE, LB_HE and LP_HE cases, respectively.
One participant asked whether clipping in motion compensation can be removed as well.
Note: The clipping function is currently implemented incorrectly in the software.
Decision: adopt. Remove the current clip_mv function, implement proper (non-normative) clipping
function (to maximum size) in MC (Qualcomm will implement that).

JCTVC-G344 Cross-verification report for motion vector predictor candidate clipping
removal (JCTVC-G134) [H. Aoki, K. Chono (NEC)]
Nobody was there to report. Comments provided by email on the reflector:
Verification task was successfully completed; it was confirmed that the implementation conforms to the
description in JCTVC-G134 and results exactly match those presented in JCTVC-G134.
The necessity of the clipping is also raised as one of the open issues to be discussed in the WD4 document. It
is recommended that the clipping issue be discussed and fixed at the 7th JCT-VC meeting.

JCTVC-G163 Non-CE9: simplification of merge/skip TMVP ref_idx derivation [Y. Jeon, S.
Park, J. Park, B. Jeon (LG)]
This contribution reports the results when the derivation process of reference indices for temporal
merging candidate is simplified. Six simplifications SP01 to SP06 are tested. Simulation results revealed
that no coding loss is obtained in all cases except low delay configurations of SP01. Based on test results,
it is recommended to consider one of the five simplifications SP02, SP03, SP04, SP05 and SP06 because
the refIdx derivation process can be simplified without any harm on coding efficiency. Among the five
recommended simplifications, SP05 and SP06 are the most preferred since they are the simplest ones.
Slides not uploaded
Related to the low delay coding option. What are the losses without low delay coding option?
Due to the lack of explicit reference list construction in current HM, the low delay coding option changes
the decoder as well.

An outlier (BQSquare) with high loss was reported when fixing the refidx to 0.
Simplification of temporal refidx derivation for merge
In category A.5 (with 4 other contributions) this one has tested most variations.
Decision: Adopt SP05. Use left position ref_idx; if not available, use zero value. Further test to be
performed during meeting (Panasonic): Does this interfere with the ref_idx re-ordering of LD B and P
configurations? If not, adopt, otherwise potentially use another SP0X conf. It was verbally reported (Sat
morning) that there is no problem. This will be provided in a document. On Monday 28th the same was
confirmed in G1027.
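
The adopted SP05 rule, as worded in the decision above, can be sketched as follows. The function name and the use of None to represent an unavailable left position are illustrative assumptions, not HM conventions.

```python
def temporal_merge_ref_idx(left_ref_idx):
    """Derive the reference index of the temporal merging candidate.

    left_ref_idx: ref_idx of the left neighbouring position, or None
    when that position is unavailable (hypothetical representation).
    """
    # SP05 as recorded in the decision: take the left position's
    # ref_idx, otherwise fall back to zero.
    return left_ref_idx if left_ref_idx is not None else 0
```

In a real decoder this would presumably be evaluated once per reference picture list.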

JCTVC-G1027 Non-CE9: Additional test result on LG’s simplification of merge/skip
TMVP refidx derivation (JCTVC-G163) [T. Sugio (Panasonic), S.
Fukushima (JVC KENWOOD)] [late]
qq

JCTVC-G099 Non-CE9: cross-verification of LGE’s proposal JCTVC-G163 on
simplification of merge/skip ref_idx derivation [M. Zhou (TI)]
Confirmed.

JCTVC-G164 Non-CE9: improvement on parallelized merge/skip mode [Y. Jeon, S. Park,
B. Jeon (LG)]
Parallelized merge/skip was introduced at the Torino meeting in JCTVC-F069 to compensate for the significant
quality loss that can be caused when parallel motion estimation (PME) is performed with the current
HM design for throughput or implementation cost reasons. This contribution advocates the JCTVC-F069
method and additionally makes a change in the merge/skip list construction process of JCTVC-F069 to
further improve its coding efficiency. The simulation results reportedly show that this change
provides significant coding efficiency improvement relative to the JCTVC-F069 method, resulting in 1.0%,
0.7%, 0.3%, and 0.0% BD rate reductions on average of all configurations (RA-HE, RA-LC, LB-HE, and
LB-LC) for PME level 64x64, 32x32, 16x16, and 8x8 respectively. Compared to the HM4.0 PME in
which the merge and skip modes have to be disabled inside the motion estimation region called MER
except the first partition of MER, 5.6%, 4.8%, 2.5%, and 0.3% BD rate reductions are achieved on
average of all configurations for PME level 64x64, 32x32, 16x16, and 8x8 respectively.
Slides not uploaded
General concerns on this parallel merge technique are to be discussed when presenting the CE9 proposals.
To be considered dependent on CE9 MRG_PAR (JCTVC-F069) discussion.
Improvement of CE9 test MRG_PAR JCTVC-F069.
“Scheme 5” of G387 is said to be exactly identical. It is said that the gap for 32x32 compared to HM
anchor would be 0.9% bit rate increase. G416 has worse results.
Set up an AHG that tries to reduce the performance gap between parallel and non-parallel merge, and works
on better understanding of the problem and alternative solutions that would not need syntax modification.

JCTVC-G100 Non-CE9: cross-verification of LGE’s proposal JCTVC-G164 on
improvement on parallel merge/skip mode [M. Zhou (TI)]
The cross-checker, who is the JCTVC-F069 proponent, confirms that the code is clean and does what is proposed.
Cross-checker thinks that it is a good improvement but should use top candidate instead of top left or
horizontal and vertical candidates rather than diagonal ones.

JCTVC-G922 Non-CE9: cross-check of LGE’s proposal JCTVC-G164 on improvement on
parallel merge/skip mode [Y. Park, H. Yang (Samsung)] [late]
It is also recommended to consider this proposal dependent on JCTVC-F069.

JCTVC-G387 Non-CE9 Parallel Merge/skip Mode for HEVC [X. Wen, O. Au, W. Dai, C.
Pang, J. Dai, F. Zou, X. Zhan (HKUST)]
The current HEVC merge/skip mode copies motion parameters to the current PU from a candidate list
that consists of spatial and temporal neighbouring PUs. However, this is hard to parallelize in encoding and
decoding due to the data dependency. Furthermore, different PU shapes and positions result in
different definitions of the candidate list, which potentially adds hardware cost and is not easy to
implement efficiently in hardware. The proposal introduces a high-level syntax to signal the
parallel depth of merge/skip mode and divides an LCU into non-overlapping square merge regions (MRs).
All PUs located inside the same MR use the same candidate list at both the encoder and the decoder side.
By doing this, all PUs in the same MR can be checked in parallel by a suitable architecture. Simulation
results reportedly show an average loss of 1.5%, 1.5%, 2.4% and 2.5% in RA-HE, RA-LC, LB-HE and LB-LC
for 32x32 block-level parallel ME when compared to the current HM4.0 design. The proposal provides different
trade-off points between saving logic and coding efficiency.
Slides not uploaded
To be considered dependent on CE9 MRG_PAR (JCTVC-F069) discussion.
Improvement, it is basically the same as JCTVC-G164.
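
The partitioning into square merge regions can be illustrated with a small sketch. It assumes that the signalled parallel depth translates into a power-of-two region size given as log2_mr_size and that regions are aligned to multiples of that size; both function names are hypothetical and neither is taken from G387 or G164.

```python
def merge_region_origin(pos, log2_mr_size):
    """Top-left corner of the square merge region containing pos=(x, y)."""
    x, y = pos
    return ((x >> log2_mr_size) << log2_mr_size,
            (y >> log2_mr_size) << log2_mr_size)


def same_merge_region(pos_a, pos_b, log2_mr_size):
    """True when both luma positions fall into the same merge region.

    A neighbour inside the current PU's own region would create the
    data dependency that the parallel merge proposals try to avoid.
    """
    return (pos_a[0] >> log2_mr_size == pos_b[0] >> log2_mr_size and
            pos_a[1] >> log2_mr_size == pos_b[1] >> log2_mr_size)
```

For a 32x32 region (log2_mr_size = 5), positions (33, 5) and (60, 31) share one region, whereas (31, 0) and (32, 0) do not.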

JCTVC-G407 Non-CE9: Cross-check of parallelized merge/skip mode for HEVC
(JCTVC-G387) [B. Li (USTC), J. Xu (Microsoft)]
Confirmed.

JCTVC-G165 Non-CE9/Non-CE13: new MVP positions for merge/skip modes and its
combination with replacing redundant MVPs [Y. Jeon, S. Park, B. Jeon
(LG), J.-L. Lin, Y.-W. Chen, Y.-W. Huang, S. Lei (MediaTek)]
This contribution proposes new MVP positions for skip and merge mode. Two new spatial MVP
candidate positions (A2 and B3) are introduced in addition to the spatial MVP positions (A0, A1, B0, B1,
and B2) of the current HM. The proposed order for spatial MVP candidates is {A2, B3, A1, B1, B0, A0,
B2} but there are some restrictions for adding the candidates to the MVP list in order to minimize the
complexity which can be caused by introducing the two new candidates. Simulation results revealed that
the proposed method achieves 0.1% gain for RA configurations and 0.2% gain for both LB and LP
configurations without any increase in encoding and decoding time.
Slides not uploaded
Improvement, adding two spatial merge candidates.
Increases complexity with small gain. Several experts express negative opinions. No action.

JCTVC-G860 Non-CE9/Non-CE13: Cross-check of JCTVC-G165 on new MVP positions
for merge/skip modes [V. Seregin (Qualcomm)] [late]
Confirmed.

JCTVC-G104 Non-CE9: cross-verification of LGE and MediaTek’s proposal
JCTVC-G165 on new MVP positions for merge/skip modes and its combination with
replacing redundant MVPs [M. Zhou (TI)]
The gain here is associated with complexity increase of the first pruning process of the merge/skip list
derivation, which should be studied.
Confirmed.

JCTVC-G181 Non-CE9: Merging candidate reordering [H. Takehara, S. Fukushima (JVC
Kenwood)]
In the HM4.0, motion information of the merging candidate that is located inside of the CU is not to be
used. In this proposal, instead of avoiding such a merging candidate, the merging candidate is moved to the
last position in the merge list after the derivation process for merging candidates. The simulation results
report that the proposed technique provides 0.1% BD-rate gain for random access and 0.1%/0.2% gain for
low delay B setting.
Merge redundancy check removal
Same idea also proposed in JCTVC-G681, JCTVC-G396 and JCTVC-G593.
Simplification on merge partition redundancy removal

JCTVC-G468 Non-CE9: cross-verification of JVC KENWOOD's proposal on Merge
candidate reordering (JCTVC-G181) [K. Sugimoto, A. Minezawa, S.
Sekiguchi (Mitsubishi)] [late]
Confirmed.

JCTVC-G681 Non-CE9: Simplified Merge candidate derivation [Y. Zheng, X. Wang, W.-J.
Chien, M. Karczewicz (Qualcomm)]
This contribution proposes a change to the rules currently used in determining merge candidates for a PU
under 2NxN, Nx2N, NxN, or AMP mode. In HM4.0, when a current PU under these partition modes is
not the first PU in a CU, the motion information of each of its merge candidates is compared with that of
a previous PU to avoid a situation that a number of PUs share the same motion information so that the
current prediction information can be classified into a mode with less partitions. For example, every PU
has the same motion information under a mode other than 2Nx2N. This contribution proposes to remove
such comparison. The proposed changes reduce the operations and also enable parallel merge candidate
generation.
Merge redundancy check removal
Removing the check for possibly redundant candidates may improve parallelism because PU1 no longer
depends on PU0.
Simplification on merge partition redundancy removal

JCTVC-G830 Non-CE9: Crosscheck for Qualcomm's simplified merge candidate
derivation in JCTVC-G681 [J.-L. Lin, Y.-W. Huang (MediaTek)] [late]
Confirmed.

JCTVC-G593 Non-CE13: Simplification of merge mode [O. Bici, J. Lainema, K. Ugur
(Nokia)]
This contribution comprises three parts aiming at reducing complexity of the merge process. In the first
part, it is proposed to reduce number of motion comparisons during the first pruning process by
comparing each spatial candidate with a limited number of spatial candidates and exempting the temporal
motion vector prediction candidate from the pruning process. In the second part, it is proposed to derive
redundancy information for the combined candidates during the first pruning process and exempting other
additional candidates from pruning. In the third part, number of contexts used for the merge index is
reduced to two and the contexts are assigned according to being a skip mode or inter-merge mode.
Combination of all the three parts provides the merge mode having a total of one full motion comparison
per PU where other motion comparisons are either removed or replaced by simpler operations. The
reported compression efficiency impact in terms of BD rate is 0.1%, 0.1%, 0.1%, 0.2% for RA_HE,
LB_HE, RA_LC and LB_LC, respectively.
Slides not uploaded
Source code bundled with the proposal.
Merge redundancy check removal
Additional modification of the first and second pruning as well as context reduction for CABAC.
Simplification
Further study (CE)

JCTVC-G940 Cross-check for JCTVC-G593: Non-CE13: Simplification of merge mode
[Semih Esenlik (Panasonic)] [late]
Confirmed.

JCTVC-G396 Non-CE9: swapping of merge candidate [C. Kim, Y. Jeon, B. Jeon (LG)]
In this contribution, two methods for reordering the merge candidates are presented. The first method is
applied only for 2Nx2N PUs of square shape. The first method (Proposed method 1) swaps A1 and B1
candidate order in the merge list if the proposed condition is satisfied. It is reportedly shown in the
experimental results that 0.1~0.2% BD rate reduction is achieved without increasing encoding and
decoding time with the proposed method 1. The second method (Proposed method 2) is applied for the
second PUs of 2NxN, 2NxnU, 2NxnD, Nx2N, nLx2N and nRx2N partitions of rectangular shape. The
proposed method 2 uses the MVP candidate which belongs to the first PU of those rectangular partitions
for creating the combined bi-pred. candidates even though this candidate shall not exist in the initial list
due to the avoiding check operation. It is reportedly shown in the experimental results that 0.1~0.2% BD
rate reduction is achieved without increasing encoding and decoding time with the proposed method 2. In
addition, the proposed methods are tested with the new anchor in which the avoiding check operation is
removed. It is reportedly shown from the simulation results that the proposed method 2 achieves 0.0~0.1% BD
rate saving without encoding/decoding time increase relative to the new anchor and the combination of
the proposed method 1 and the proposed method 2 achieves 0.1~0.2% BD rate saving. The encoding and
decoding time is almost the same as the anchor.
Slides not uploaded
Title was changed to “Non-CE9: reordering of merge candidate” after first registration.
It was mentioned that there is a dependency between first and second PU.
Improvement Method1
Simplification Method2

JCTVC-G484 Non-CE9: Crosscheck for LG's swapping of merge candidate in
JCTVC-G396 [J.-L. Lin, Y.-W. Huang (MediaTek)] [late]
Nobody was there to report. Document confirms.

JCTVC-G927 Non-CE9: Cross-check of JCTVC-G396 on reordering of merge candidate
[V. Seregin (Qualcomm)] [late]
Confirmed.

JCTVC-G944 Cross-check report on G396 [I.-K. Kim] [late]
Confirmed.

JCTVC-G542 Non-CE9/Non-CE13: Simplification on AMVP/Merge [T. Sugio, T.
Nishi (Panasonic)]
In this contribution, three simplification methods on AMVP/Merge were proposed. 1) On avoiding
merging candidate decision 2) On AMVP spatial scaling candidate 3) On zero motion vector merging
candidate. Experimental results reportedly showed no loss on average relative to the HM4.0 while
reducing the complexity of AMVP/Merge.
Slides not uploaded
Simplification 1 on merge partition redundancy removal
Simplification 2 on AMVP scaling for spatial candidates
Simplification 3 zero motion vector merge candidate

JCTVC-G891 Non-CE9: cross-verification of Panasonic's simplification on AMVP/Merge
(JCTVC-G542) [M. Zhou (TI)] [late]
The proposed simplifications 2 and 3 of JCTVC-G542 are helpful for reducing the complexity of
merge/skip and AMVP list derivation process. It is recommended to consider this proposal together with
other contributions in this category.
Nobody was there to report. Document confirms.

JCTVC-G878 Non-CE9: Crosscheck for Panasonic's proposals (JCTVC-G542) by
JVC-KENWOOD [Hiroya Nakamura, Shigeru Fukushima (JVC Kenwood)] [late]
Confirmed.

JCTVC-G182 Non-CE9: AMVP syntax for bi-prediction [H. Takehara, S. Fukushima
(JVC Kenwood)]
Changing the syntax order for ref_idx_l1. No impact on CAVLC BD rate, small impact on CABAC due to
context dependencies.
Simplification for parsing throughput.
No support for this

JCTVC-G431 Non-CE9: Cross-verification of JVC Kenwood proposal on AMVP syntax
for bi-prediction (JCTVC-G182) [T. Chujoh (Toshiba)]
Confirmed.

JCTVC-G195 Non-CE9/13: Averaged merge candidate [S. Shimada, K. Kazui, J. Koyama,
A. Nakagawa (Fujitsu)]
This contribution proposes a new type of merge candidate, namely averaged merge candidate, in the
derivation process for luma motion vectors for merge mode. An averaged merge candidate is derived by
averaging a pair of two original merge candidates and added into the merge candidates list with higher
priority than other candidates previously adopted in WD4.
The coding gains of the proposed scheme over HM4.0 are 0.2% (HE), 0.3% (LC) in RA configurations,
0.0% (HE), 0.0% (LC) in LDB configurations and 0.3% (HE), 0.6% (LC) in LDP configurations. When
the proposed scheme works without non-scaled bi-predictive merge candidate in WD4, the additional
coding gains are 0.2% (HE), 0.2% (LC) in RA configurations, 0.0% (HE), 0.0% (LC) in LDB
configurations and 0.3% (HE), 0.6% (LC) in LDP configurations. Measured computational complexities
for various configurations in encoding and decoding are 96% - 102% and 100% - 101%, respectively.
Slides not uploaded
Test1: Adding up to averaged candidates
Test2: Replacing non-scaled bi-predictive candidates by averaged ones.
It was remarked that additional shifts and adds are required for the averaging and that it is checked whether
the refidx of the two averaged candidates are equal.
It was assumed that removing the additional duplicate check may not have a big impact on coding
efficiency.
Removing all duplicate checks was proposed in JCTVC-G397.
Improvement
Even test 2 is adding some complexity (checking ref_idx). Furthermore, with the reduction of candidates
as from CE13 T10 the gain could be even lower.
No support expressed by other companies.
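
The averaging operation discussed above (adds, a shift and a refidx equality check) can be sketched for a single reference list as follows. The candidate layout (mv_x, mv_y, ref_idx), the floor rounding of the right shift and the choice to produce no candidate when the reference indices differ are assumptions made for illustration; G195 may define these details differently.

```python
def averaged_merge_candidate(cand_a, cand_b):
    """Average two merge candidates given as (mv_x, mv_y, ref_idx).

    Returns None when the reference indices differ (assumed behaviour).
    """
    if cand_a[2] != cand_b[2]:
        return None
    # One add and one arithmetic right shift per component; the shift
    # rounds towards minus infinity (assumed rounding).
    mv_x = (cand_a[0] + cand_b[0]) >> 1
    mv_y = (cand_a[1] + cand_b[1]) >> 1
    return (mv_x, mv_y, cand_a[2])
```

For example, averaging (4, -6) and (8, 2) with equal reference indices gives (6, -2).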

JCTVC-G467 Non-CE9/13: cross-verification of Fujitsu's proposal on Merge
(JCTVC-G195) [K. Sugimoto, A. Minezawa, S. Sekiguchi (Mitsubishi)] [late]
Confirmed.
Reported runtimes are obtained on one type of computer and they are around 100%.

JCTVC-G217 Non-CE9: Derivation process of reference indices for temporal merge
candidates [H. Nakamura, S. Fukushima (JVC Kenwood)]
In HM4.0, the reference indices of temporal merging candidates are derived by the majority decision of
the reference indices of three neighboring blocks.
This contribution proposes a simplified derivation process of reference indices for temporal merging
candidates.
The simulation results report that the proposed technique provides no coding loss for random access
settings and low delay B settings.
Slides not uploaded
Same as one simplification (test2) in JCTVC-G163.

JCTVC-G742 Cross-check report on derivation process of reference indices for temporal
merge candidates (JCTVC-G217) [I.-K. Kim (Samsung)]
Confirmed

JCTVC-G219 Non-CE9: Construction of MVP list without using scaling [H. Nakamura, S.
Fukushima (JVC Kenwood)]
In the HM4.0, when an MVP list is constructed, at most two scaling operations are performed.
This contribution proposes a method which reduces the number of scaling operations.
The simulation results report that the proposed technique provides neither gain nor loss for random access and
low delay B settings.
Slides not uploaded
Simplification of AMVP list construction.

JCTVC-G105 Non-CE9: cross-verification of JVC’s proposal JCTVC-G219 on
construction of MVP list without using scaling operation [M. Zhou (TI)]
Confirmed
On the encoder side, this technique does not reduce the number of MVP scaling operations; in the worst case
two scalings are still needed. On the decoder side, however, it reduces the worst-case number of scalings
from 2 to 1. The penalties are: increased latency, because the non-scaled and scaled MVP derivation processes
now run sequentially (in the current design the non-scaled MVP derivation can be hidden by the scaled MVP
derivation); a two-stage motion vector reconstruction on the decoder side, i.e. first using mvp_idx to compute
the final MVP and then adding the MVDs to obtain the motion vectors; and additional storage for buffering the
temporal distances for all the MVPs.

JCTVC-G223 Non-CE9: Division-free MV scaling [T.-D. Chuang, Y.-W. Chen, J.-L. Lin,
C.-Y. Chen, Y.-W. Huang, S. Lei (MediaTek)]
In HM-4.0 motion vector (MV) scaling for the derivation of spatial and temporal motion vector predictors
(MVPs), a division operation is required to derive the scaling factor. In hardware and many DSP-based
platforms, dividers are undesirable because of larger gate counts and more processing cycles. In this
contribution, a division-free MV scaling is proposed to replace the general divider by a look-up table and
simple arithmetic operations. Moreover, the effective scaling range is doubled to deal with reference
pictures of longer temporal distances. Simulation results reportedly show that the proposed method has no
bit rate increase in random access conditions and even 0.1% bit rate reduction in low delay conditions.
The proposed design is also implemented in Verilog and synthesized with TSMC 40nm process. The
synthesis results reportedly show that the gate count of the MV scaling module is reduced by 54-58%.
Proposal 1: Extending the clipping range in the scaling factor clipping
Proposal 2: Division free MV scaling by using a LUT
HEVC uses the same scaling as AVC. The HM4.0 scaling implementation that the proposal is
compared to may also be table-based. Additional slides were shown comparing this with a table-based
HM4.0 scheme. [to be uploaded]
JCTVC-F068 reports that scaling needs 3 cycles. That is not affected by the tested proposal.
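
For orientation, the AVC-style fixed-point scaling and a table-based replacement of its division can be sketched as follows. The arithmetic follows the commonly documented AVC/HEVC formulation (tx = (16384 + |td|/2) / td, factor = (tb * tx + 32) >> 6, in units of 1/256). The 255-entry table is simply the precomputed division result and is not the look-up table of G223; reading a clipping "value of 16" as a limit of 4096 in 1/256 units is likewise an assumption. All function and parameter names are illustrative.

```python
def clip3(lo, hi, v):
    """Clamp v to the inclusive range [lo, hi]."""
    return max(lo, min(hi, v))


def tx_by_division(td):
    """tx = (16384 + |td| / 2) / td with C-style truncation towards zero."""
    mag = (16384 + (abs(td) >> 1)) // abs(td)
    return mag if td > 0 else -mag


# Hypothetical table: one precomputed tx per non-zero td in [-128, 127].
TX_TABLE = {td: tx_by_division(td) for td in range(-128, 128) if td != 0}


def scale_mv(mv, tb, td, factor_limit=4096):
    """Scale one MV component by tb/td without a run-time division.

    factor_limit is the clipping bound in 1/256 units: 1024 gives the
    AVC-style range of 4, 4096 a range of 16 (assumed interpretation).
    """
    tb = clip3(-128, 127, tb)
    td = clip3(-128, 127, td)  # td == 0 is assumed not to occur
    tx = TX_TABLE[td]
    factor = clip3(-factor_limit, factor_limit - 1, (tb * tx + 32) >> 6)
    prod = factor * mv
    sign = -1 if prod < 0 else 1
    return sign * ((abs(prod) + 127) >> 8)
```

With tb = 127 and td = 1 the unclipped factor far exceeds either limit, so scale_mv(1, 127, 1) returns 16 with the wider range and 4 with factor_limit=1024, which shows the effect of the clipping range on long temporal distances.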

Adopt proposal 1 (see below). No support for proposal 2.

JCTVC-G523 Non-CE9: Cross-check of division-free MV scaling [P. Onno (Canon)] [late]
Division-free scaling is confirmed.
Clipping range extension was not verified.

JCTVC-G225 Non-CE9: Simplified AMVP derivation for Inter mode [J.-L. Lin, Y.-W.
Chen, Y.-W. Huang, S. Lei (MediaTek)]
No presentation was requested.

JCTVC-G341 Crosscheck of MediaTek's AMVP simplification (JCTVC-G225) [J. Park, S.
Park, B. Jeon (LG)]

JCTVC-G241 Non-CE9: On parallel derivation of the temporal predictor for Merge/Skip
modes [G. Laroche, T. Poirier, P. Onno (Canon)]
This contribution presents a simplification of motion vector derivation process for the Merge/Skip modes.
The aim of the proposed scheme is to reduce the number of cycles in hardware needed for this derivation
process for the worst case. The proposed scheme consists in scaling the temporal candidates in parallel to
the cascaded predictors and related pruning process. Based on the complexity analysis from Texas
Instruments in JCTVC-F088 and JCTVC-F068, it is reported that 3 cycles are saved for the Merge/Skip
mode derivation process (for the worst-case hardware implementation) with less than 0.1% BD-rate loss.
Slides not uploaded
Simplification of first pruning and last pruning process.
No support.

JCTVC-G098 Non-CE9: cross-verification of Canon’s proposal JCTVC-G241 on parallel
derivation of the temporal predictor for Merge/Skip modes [M. Zhou (TI)]
The proposed parallel MVP scaling will improve the throughput of the merge/skip MVP list derivation
process. The trade-off between coding efficiency and throughput improvement should be carefully
considered here.
The cross-checker is not sure whether this improvement of throughput is really needed and justifies the
loss.

JCTVC-G305 Non-CE9: Bi-prediction for low delay coding [Y. Suzuki, A. Fujibayashi
(NTT DOCOMO)]
In this contribution the additional option of bi-prediction for low delay conditions and key pictures of
random access conditions is proposed. In the proposed method, the motion vector difference for List 1 is
not signaled and is set to (0, 0) when the POCs of all reference pictures are smaller than that of the target
picture. The average Y BD-rate gains for low delay B condition and random access condition are 0.9%
and 0.2%, respectively.
An encoder-only change (setting the list 1 mvd to zero and signalling it) gives 0.3% and 0.1% gain on average for
LB and RA (available in the revised document).
There is no list 1 motion vector search for bi-prediction.
It was asked what the impact would be when the list 1 motion vector search for bi-prediction is enabled at
the encoder. The assumption is that the better the list 1 estimation is, the larger the loss from setting the
list 1 mvd to zero becomes.
Interesting candidate to be considered after CE9/13 conclusion is reached.
Improvement
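
The condition under which the List 1 motion vector difference is inferred can be sketched as follows. The function names and the representation of POCs as plain integers are illustrative assumptions; how the reference picture set is gathered in the HM is not shown.

```python
def l1_mvd_inferred_zero(cur_poc, ref_pocs):
    """True when every reference picture precedes the current picture.

    cur_poc: POC of the picture being coded.
    ref_pocs: POCs of all reference pictures in both lists.
    """
    return all(poc < cur_poc for poc in ref_pocs)


def list1_mvd(cur_poc, ref_pocs, parsed_mvd):
    """Return (0, 0) when the mvd is inferred, else the parsed value."""
    if l1_mvd_inferred_zero(cur_poc, ref_pocs):
        return (0, 0)  # nothing is read from the bitstream in this case
    return parsed_mvd
```

In a low delay configuration all references have smaller POCs, so the List 1 mvd is always inferred; in random access this applies only to pictures without future references, such as the key pictures mentioned in the abstract.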

JCTVC-G603 Non-CE9: Cross-check of Bi-prediction for low delay coding (JCTVC-G305)
[P. Helle, B. Bross (Fraunhofer HHI)] [late]
Confirmed. The gain is achieved by enforcing MV differences to zero (which would be the encoder-only
change) and then enforcing no signalling of List 1. Does this have any visual impact?
Further study in CE, also more investigation whether this can be done in a non-normative way with
similar results.

JCTVC-G343 Non-CE9: Improvement in temporal candidate of merge mode and AMVP
[N. Zhang, X. Fan, S. Ma, D. Zhao (Harbin Inst. Tech.)]
This contribution proposes an improved scheme in temporal candidate of merge mode and AMVP. In
order to utilize motion information from the below and right sides more effectively, the scheme considers
three additional temporal candidate checking positions relative to HM4.0 in some cases. Simulation
results show that the proposed scheme on average provides 0.1% bit rate saving for random access, and
0.2% bit rate saving for low delay with almost the same time cost. This proposal has been cross-checked
by MSRA and the document number is JCTVC-G406.
Solution A: conditional add
Solution B: always add (preferred)
Improvement, add an additional temporal candidate for AMVP and merge.

JCTVC-G406 Non-CE9: Improvement in temporal candidate of merge mode and AMVP
(JCTVC-G343) [B. Li (USTC), J. Xu (Microsoft)]
Nobody was there to report. Document confirms.

JCTVC-G397 Non-CE9/Non-CE13: Simplification of adding new merge candidates [B. Li
(USTC), J. Xu (Microsoft), H. Li (USTC)]
The parsing robustness issue was solved in JCTVC-F470 by a method similar to coding full index of
merge candidate. For some other syntax elements, the parsing dependency of the surrounding information
is also removed. One most important part of JCTVC-F470 is to compensate the loss introduced by coding
full merge index. Some new merge candidates are also added to the set of merge candidates, which was
adopted in the latest HM design. For each new added merge candidate, the duplication checking is also
performed. This contribution evaluates the benefits of each individual step and proposes to remove some
steps to simplify the design.
Simplification: disable combined merge candidates and disable combined merge candidate duplicate
checking
Very interesting data for the discussion on the combined merge candidates.

JCTVC-G426 Cross-check of Microsoft’s proposal on simplification of adding new merge
candidates (JCTVC-G397) by Huawei [H. Yang, H. Yu (Huawei)]
Nobody was there to report. Document confirms.

JCTVC-G416 CU-based Merge Candidate List Construction [H. Y. Kim (ETRI), K. Y.
Kim, S. M. Kim, G. H. Park (KHU), S.-C. Lim, J. Lee, J. S. Choi (ETRI)]
This contribution proposes a CU-based approach for merge candidate list construction, where a CU can
have at most one merge candidate list that can be constructed prior to encoding and decoding of the
internal PUs. It is indicated that the proposed approach provides simpler design, reduced complexity, and
improved parallelism compared to the PU-based one used in HM4. Also, it is reported that roughly 3~6%
encoding time reduction is obtained with a penalty of roughly 0.2~0.5% coding loss, depending on the test
configurations.
Simplification of the merge candidate list construction
This is applied to all CU sizes whereas there are proposals restricting parallelism to certain sizes.

JCTVC-G899 Cross-check of CU-based Merge Candidate List Construction [S. Oudin, B.
Bross (Fraunhofer HHI)] [late]
Confirmed.

JCTVC-G516 On Spatial MV Prediction [K. Sato (Sony)] [late]


Under the current HM specification, motion information of the top-right PU is one of the candidates for spatial
motion vector prediction. When LCU-level parallel processing is applied for HEVC encoding/decoding,
extraction of motion information of the top-right PU becomes a bottleneck. This contribution proposes not
using motion information of the top-right PU at LCU boundaries for motion vector coding. Average losses
with the proposed method are less than 0.2% for the Y/U/V components with RA/RA_Loco/LD/LD_Loco
conditions.
Simplification
Not considered (late, nobody requests)

JCTVC-G799 Non-CE9: Crosscheck for Sony's proposals (JCTVC-G516) by JVC
KENWOOD [Hiroya Nakamura (JVC Kenwood)] [late]
Coding loss is considered acceptable.
Confirmed

JCTVC-G541 Non-CE9: Simplified scaling calculation method for temporal/spatial MVP
of AMVP/Merge [T. Sugio, T. Nishi (Panasonic)]
In this contribution, a simplified scaling calculation for AMVP/Merge is proposed. In the proposed
method, the scaling factor is quantized to a power of 2 based on the distances among the current picture,
the reference picture and the co-located picture. Experimental results reportedly showed a BD-rate loss in the
range of 0.0-0.1%, with a reduced number of multipliers relative to HM4.0, when the proposed
method is applied to the temporal MVP in AMVP.
Test1: simplified scaling for temporal AMVP candidate
Test2: simplified scaling for temporal candidate
Test3: simplified scaling for spatial AMVP candidates
Combined results are not reported.
Simplification of the scaling based on temporal distance used in both AMVP and merge (temporal and
combined candidates).
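
The idea of replacing the scaling multiplication by a shift can be sketched as follows. This is an illustration only: the rounding rule (nearest power of two in the log domain) and the names pow2_shift and scale_mv_pow2 are assumptions and are not taken from JCTVC-G541; tb and td stand for the two POC distances whose ratio gives the scaling factor.

```python
import math


def pow2_shift(tb, td):
    """Approximate tb/td by sign * 2**shift (hypothetical rounding rule)."""
    if tb == 0 or td == 0:
        raise ValueError("POC distances must be non-zero")
    negative = (tb < 0) != (td < 0)
    shift = round(math.log2(abs(tb) / abs(td)))
    return negative, shift


def scale_mv_pow2(mv, tb, td):
    """Scale one MV component using shifts only, without a multiplier."""
    negative, shift = pow2_shift(tb, td)
    magnitude = abs(mv)
    if shift >= 0:
        magnitude <<= shift
    else:
        # Right shift with rounding to nearest.
        magnitude = (magnitude + (1 << (-shift - 1))) >> -shift
    return -magnitude if (mv < 0) != negative else magnitude


print(scale_mv_pow2(8, 2, 1))  # ratio 2 is exact: prints 16
print(scale_mv_pow2(8, 3, 1))  # ratio 3 is approximated by 4: prints 32
```

The second call shows where the reported small coding loss could come from: a distance ratio that is not a power of two is only approximated.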

JCTVC-G102 Non-CE9: cross-verification of Panasonic’s proposal JCTVC-G541
simplified scaling calculation method for temporal/spatial MVP of
AMVP/Merge [M. Zhou (TI)]
The proposed scaling calculation will likely reduce the cost of hardware implementation. The trade-off
between coding efficiency and implementation cost reduction should be evaluated, and its implications for
software implementations should be considered as well.

JCTVC-G551 Restriction on motion vector scaling for Merge and AMVP [I.-K. Kim, Y.
Park, N. Shlyakhov, J. H. Park (Samsung)]
In this contribution, a restriction method on motion vector scaling is proposed. When the POC difference
between the reference picture of the current PU and the reference picture of the candidate PU (collocated PU or
neighbor PU) is larger than a predetermined threshold, motion vector scaling is restricted: the motion vector
predictor is marked as unavailable before the scaling process is performed. This restriction is applied to
both spatial and temporal scaling. Average coding efficiency gains from this modification for each
configuration are 0.0% (RAHE), 0.0% (RALC), 0.1% (LDHE), 0.0% (LDLC), 0.1% (LPHE), 0.1%
(LPLC) with class F sequences. Encoding and decoding times are 100% and 101%, respectively. The coding
efficiency gain for Class F only is 0.13%.
Should be considered in the AHG21 discussion.
It is mentioned that something like this is used in AVC. Nobody knows whether this is useful in HEVC.
In fact, it would slightly increase complexity as it is necessary to check a condition. Further study.
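
The condition check mentioned above can be sketched as below. The function name and the threshold value used in the example call are hypothetical; the contribution only states that a predetermined threshold is used and that the check happens before scaling.

```python
def restricted_candidate(mv, poc_ref_cur, poc_ref_cand, threshold):
    """Return the candidate MV, or None if it is marked unavailable.

    The POC-difference check runs before any scaling is attempted.
    """
    if abs(poc_ref_cur - poc_ref_cand) > threshold:
        return None  # marked unavailable, so the scaling step is skipped
    return mv  # would be passed on to the normal scaling process


# Example with an assumed threshold of 4 POC units.
print(restricted_candidate((4, -2), 8, 0, 4))
print(restricted_candidate((4, -2), 8, 6, 4))
```

The extra comparison per candidate is the complexity increase noted in the discussion.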

JCTVC-G859 Crosscheck of JCTVC-G551 proposed by Samsung [X. Zheng] [late]
Nobody was present to report. The cross-check document confirms the results.

JCTVC-G552 Simplification of temporal motion vector (TMVP) candidate derivation for
Merge and AMVP [I.-K. Kim, N. Shlyakhov, J. H. Park (Samsung)]
In this contribution, two methods are proposed to simplify the TMVP derivation process of
Merge and AMVP. The first is to use the upper-left position instead of the center position. The
second is to use a zero reference index instead of the value that occurs most often
among the neighbours. With the upper-left position, no additional calculation is required to get the
position, since the upper-left position is the starting point of the PU. In addition,
using a zero reference index for the Merge TMVP candidate saves the steps for checking and calculating the
majority of occurrence. Checking which value is the majority is a complex process that involves several
comparisons and conditional branches. When both simplifications are combined, the average loss is
0.0% with 100% and 99% encoding and decoding time, respectively, compared to HM-4.0.
Simplification1 of temporal motion vector position
Simplification2 of temporal motion vector index derivation
Questionable if this is a simplification (except that it is not necessary to compute the center position).
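
To make the second simplification concrete, the sketch below contrasts a majority-based reference index derivation with the proposed constant. It is an assumption-laden illustration: the function names, the treatment of unavailable neighbours (None or negative values) and the tie-break towards the smallest index are hypothetical and do not reproduce the exact HM-4.0 rule.

```python
from collections import Counter


def majority_ref_idx(neighbour_ref_idx):
    """Most frequent reference index among the available neighbours.

    Ties are broken towards the smallest index (an assumption).
    """
    available = [r for r in neighbour_ref_idx if r is not None and r >= 0]
    if not available:
        return 0
    counts = Counter(available)
    best = max(counts.values())
    return min(r for r, c in counts.items() if c == best)


def simplified_ref_idx(neighbour_ref_idx):
    """Proposed simplification: always use reference index zero."""
    return 0


print(majority_ref_idx([1, 1, 0]))    # prints 1
print(simplified_ref_idx([1, 1, 0]))  # prints 0
```

The first function needs counting and comparisons, whereas the second needs no neighbour access at all.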

JCTVC-G227 Non-CE9: Cross-check for Samsung's proposals (JCTVC-G552) by JVC
KENWOOD [H. Nakamura, S. Fukushima (JVC Kenwood)]
Confirmed

JCTVC-G515 Cross-check of Samsung's TMVP simplification by Sony [K. Sato (Sony)]
[late]
Confirmed

JCTVC-G592 Non-CE9: Removal of reference index derivation for TMVP in merge mode
[O. Bici, J. Lainema, K. Ugur (Nokia)]
This was already discussed in several other documents JCTVC-Gxxx

JCTVC-G690 Non-CE9: Some possible motion vector coding related simplifications [Y. H.
Tan, C. Yeo, Z. Li (I2R)]
This contribution presents several modifications that are aimed at simplifying motion vector coding. The
first modification is related to the derivation of the L1 motion vector predictor (MVP) during
bi-directional prediction. L0 and L1 MVP candidates are derived pair-wise such that both sets of candidates
are derived from the same neighboring PU. This reportedly led to BD-rate losses of 0.0% and 0.1%
for the RA and LB configurations, respectively. The second modification restricts the reference index
to zero during the derivation of the reference index of the temporal Merge candidate. This reportedly led
to BD-rate losses of 0.0% and 0.1% for the RA and LB configurations, respectively. The third
proposed modification is to remove the context for coding the MVP index when CABAC is the entropy
coder, which reportedly resulted in an average of 0.0% BD-rate loss. The last two modifications are
related to the addition and pruning process of ‘combined’ Merge candidates. The modified process limits
the number of combined candidates to five and uses a simplified pruning process. When applied together,
these two modifications result in an average of 0.0% BD-rate loss in all configurations.
Simplification 1 related to AMVP_SEL03, AMVP_SEL04 CE9 tests
A concern was raised on the dependency between list 0 and list 1 mvp derivation.
To be discussed when CE9 tests are reviewed.
Simplification 2 also setting refidx to 0 for temporal merge candidate
Simplification 3 code mvp_idx in bypass mode
Simplification 4 reduce the possible combination of combined bi-predictive candidates to 5
Simplification 5 combined bi-predictive candidates duplicate check is replaced by checking the
candidates before combination.
This is also addressed in JCTVC-G593 part 2

JCTVC-G970 Cross verification of Some possible motion vector coding related
simplifications (JCTVC-G690) [M. Coban (Qualcomm)] [late]
Simplification 4 and 5 confirmed

JCTVC-G752 Non-CE9: Crosscheck of I2R's proposal on motion vector coding related
simplifications (JCTVC-G690) [X. Zheng (HiSilicon)] [late]
Simplification 3, confirmed.

JCTVC-G733 cross check for I2R AMVP simplification [Yue Yu, Krit Panusopone, Limin
Wang] [late]
Simplification 1 confirmed

JCTVC-G710 Non-CE9: The Parallel Friendly MVP Candidate Calculation for HEVC [Y.
Yu, K. Panusopone, L. Wang (Motorola Mobility)]
This contribution proposes a parallel-friendly MVP candidate calculation for the case where a CU consists of two PUs.
Simulation results show a loss of 0.3% for the random access condition and a loss of 0.1% for the
low delay condition compared to the original AMVP, while making it possible to process the two PUs in parallel.
Slides not uploaded
Results are reported for the simplification combined with the improvement.
Simplification of AMVP: replace the PU0 candidate for PU1 by one outside PU0

Improvement of AMVP by swapping indices, i.e. having different list orders for PU0 and PU1
No support for this

JCTVC-G712 Non-CE9: The Simplification of MVP for HEVC [Y. Yu, K. Panusopone, L.
Wang (Motorola Mobility)]
This contribution proposes a simplification of the MVP design for HEVC. Simulation results show
losses of 0.1% for LBHE and 0.3% for RAHE compared to the original AMVP, while the
complexity of the proposed method is reduced by at least 50%, and up to 66.7%, compared to the MVP
selection procedure of AMVP when a CU consists of two PUs.
Slides not uploaded
Simplification of AMVP where PU1 MVP is derived from PU0 MVP
Concern was raised on the additional scaling.
It cannot be combined with JCTVC-G710.
Losses of 1.1% for SteamLocomotive.
Does not look like a real simplification. No support for this

JCTVC-G740 Cross-check of Motorola's proposals on motion vector coding (G710, G712)
[Y. H. Tan, C. Yeo (I2R)] [late]
Confirmed

JCTVC-G221 Non-CE13: The maximum number of merging candidates in P-slice [H.
Nakamura, S. Fukushima, H. Takehara, M. Ueda (JVC Kenwood)]
In the HM 4.0, the maximum number of merging candidates is fixed to 5 as maxNumMergeCand in both
P-slice and B-slice. This contribution reports the results of investigation of the variable
maxNumMergeCand in low delay P settings. The simulation results report that 0.1% BD-rate gain is
obtained under the setting of maxNumMergeCand equal to 3.
Slides not uploaded
Not cross-checked.
Proposal assumes a slice header flag signaling the maximum number of merge candidates as proposed in
CE13 tests. These tests have to be considered first.
Simplification
This is enabled by adopting G091 T5…T8 from CE13.
Note: There would be a reasonable possibility to impose a restriction to 3 candidates in common test
conditions for LD P. It is however suggested not to do this now as it could affect investigations that target
increasing the number of candidates.

JCTVC-G224 Non-CE13: Multiple-scaled merging candidates [H. Nakamura, S.
Fukushima (JVC Kenwood)]
This proposal is for the improvement of the merge mode. In the HM4.0, for the merge mode, some spatial
merging candidates and a temporal merging candidate are added into a merging candidate list. After that,
if needed, combined bi-predictive merging candidates, a non-scaled bi-predictive merging candidate and
zero motion vector merging candidates are added. The non-scaled bi-predictive merging candidate is
added under a random access condition. Therefore, the multiple-scaled merging candidate by extending
the non-scaled merging candidate for low delay conditions is proposed in this contribution. The
simulation results report that the proposed techniques provide no gain for random access setting, 0.1-0.2%
BD-rate gain for low delay B setting, and 0.1-0.2% BD-rate gain for low delay P setting.
Slides not uploaded
Improvement of non-scaled combined merge candidates by adding scaling.

JCTVC-G544 Non-CE13: Cross check report of JCTVC-G224 by Panasonic [T. Sugio, T.
Nishi (Panasonic)] [late]
Confirmed.

JCTVC-G288 Non-CE13: Complexity Reduction of MVP List Construction [Y. Yu, T.
Hellman (Broadcom)]
Proponent not there to present.

JCTVC-G097 Non-CE13: cross-verification of Broadcom’s proposal JCTVC-G288
complexity reduction of MVP list construction [M. Zhou (TI)]

JCTVC-G683 Non-CE13: Simplification and improvement of additional merge candidate
[Y. Zheng, X. Wang, W.-J. Chien, M. Karczewicz (Qualcomm)]
This proposal presents several changes to simplify the process of generating additional merge candidates. In
the proposal, the scaled bi-pred candidate is removed, the additional merge candidate pruning operation is
limited to the combined bi-pred candidates only, and motion offset candidates are added. With fewer
operations, the proposed changes provide 0.3% bitrate saving on average under common test
conditions.
Simplification: Disable duplicate removal for zero MV candidates and remove the non-scaled bi-predictive merge
candidates (also tested in JCTVC-G397)
Improvement: Add motion offset candidates

JCTVC-G872 Non-CE13: crosscheck of Qualcomm’s proposal G683 on simplification and
improvement of additional merge candidate [Y. Jeon, B. Jeon (LG)] [late]
Confirmed, but reports slightly increased runtimes

JCTVC-G787 Non-CE13: Additional merge candidates with MV dependent offsets [T. Lee,
J. Chen, J. H. Park (Samsung), G. Laroche, P. Onno (Canon), J.-L. Lin, Y.-
W. Chen, Y.-W. Huang, S. Lei (MediaTek)]
This contribution presents additional Merge/AMVP candidates to fill empty positions in the
fixed-length candidate list. The additional Merge/AMVP candidates are produced by adding offsets to
the first existing candidate, where the offset value is chosen according to the first candidate’s MV value.
Relative to HM4.0+MRG_ENC_FIX, which is reported in JCTVC-G776, experimental results showed
-0.3% BR savings on average over the results in RA-HE, RA-LC, LB-HE, LB-LC, LP-HE, LP-LC.
Combination of CE13 tests. Was discussed again when the conclusion on CE9/13
tests was discussed (see notes elsewhere).
A concern was raised regarding losses in chroma and luma for class E sequences. This may be due to
removing zero motion vector merge candidates.
The fast encoding method can also be applied to other proposals.

Improvement on additional merge candidates

JCTVC-G926 Non-CE13: Cross-check of JCTVC-G787 on MV dependent offset [V.
Seregin (Qualcomm)] [late]
It should be assured that the merge index encoder estimation bugfix is included in anchor and tested
Confirmed.

JCTVC-G839 Non-CE13: cross-verification of Samsung, Canon, MediaTek proposal
JCTVC-G787 on additional merge candidates with MV dependent offsets [Y.
Jeon, B. Jeon (LG)] [late]
Confirmed

JCTVC-G946 Non-CE13: Cross-check report of JCTVC-G787 (MvDepOffset+Fast) [S.-C.
Lim, J. Lee, H. Y. Kim (ETRI), K. Y. Kim, S. M. Kim, G. H. Park (KHU)]
[late]

5.8.2 MV and mode coding

JCTVC-G209 Modified method for coding mvd in the CABAC mode [S.-T. Hsiang, S. Lei
(MediaTek)]
This contribution proposes a modified method for coding the absolute value of each component of vectors
mvd_l0 or mvd_l1 in the CABAC mode. The proposed method attempts to further utilize context
modeling for coding the EG1 prefix bins. Experimental results reportedly show Y BD-rate gains of 0.1%
and 0.0% for HE-RA and HE-LDB, respectively, under the common test conditions and Y BD-rate gains of
0.3% and 0.1% for HE-RA and HE-LDB, respectively, over QP = 32, 37, 42, and 47.
Increases complexity. No significant gain.
No interest expressed
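
For reference, the k-th order Exp-Golomb (EGk) binarization splits a value into unary-like prefix bins followed by fixed-length suffix bins; the prefix bins are the part the proposal would code with context models. The sketch below follows the usual EGk construction known from AVC. The function name egk_bins is hypothetical, and the mapping from an mvd component to the value being binarized is not shown.

```python
def egk_bins(value, k):
    """Split a non-negative value into EGk prefix and suffix bins."""
    if value < 0 or k < 0:
        raise ValueError("value and k must be non-negative")
    prefix = []
    # Each '1' in the prefix removes 2**k from the value and widens the suffix.
    while value >= (1 << k):
        prefix.append(1)
        value -= 1 << k
        k += 1
    prefix.append(0)  # terminating bin of the prefix
    # The suffix is the remaining value written with k bits, MSB first.
    suffix = [(value >> i) & 1 for i in range(k - 1, -1, -1)]
    return prefix, suffix


print(egk_bins(0, 1))  # ([0], [0])
print(egk_bins(5, 1))  # ([1, 0], [1, 1])
```

Small values produce a single prefix bin, so context modeling of the prefix would mainly matter for larger motion vector differences.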

JCTVC-G985 Crosscheck on JCTVC-G209 Modified method for coding mvd in the
CABAC mode [W. -J. Chien (Qualcomm)] [late]

JCTVC-G705 CABAC simplification for explicit signaling mode of AMVP [K. Misra, A.
Segall (Sharp)]
This document proposes to remove the CABAC context models for the motion vector predictor index,
mvp_idx_lX (mvp_idx_l0, mvp_idx_l1, mvp_idx_lc), and use CABAC bypass mode instead. It is asserted
that this change reduces the number of CABAC contexts in memory and eliminates the associated
CABAC context update step while having negligible impact on compression efficiency. For HM-4.0, high
efficiency common test conditions, the proposed change shows the following average BD bitrate impact:
(Without Class F sequences)
RA_HE Y:0.0% U:0.0% V:0.0%; LB_HE Y:0.0% U:0.1% V:0.1%; LP_HE Y:0.0% U:0.2% V:-0.1%.

(With Class F sequences)
RA_HE Y:0.0% U:0.0% V:0.0%; LB_HE Y:0.0% U:0.1% V:0.1%; LP_HE Y:0.0% U:0.1% V:-0.1%.
Slightly increases the number of bypass-mode bins (by 1% perhaps)
G690 is the same (presented in breakout)
More entropy coding (reduction of number of contexts by 2 according to software and 3 according to text)
Results may somehow be interrelated with the RDO in ME decision
No interest expressed

JCTVC-G996 Cross check of Sharp's JCTVC-G705 [G. v. d. Auwera (Qualcomm)] [late]

JCTVC-G785 Unified Pred_type coding in CABAC [Y. Piao, J. Min, J.H. Park (Samsung)]
This document proposes to unify pred_type coding of slice B and P in CABAC. Prediction mode and
partition mode coding in slice B is modified to be same as that of slice P to simplify the pred_type coding
and enhance the readability of specification. The average BD-rate by this unification is 0.01% in RA and
0.03% in LD high efficiency configuration. With bug fix reported in JCTVC-G655, the average BD-rate
is 0.xx% in RA and 0.xx% in LD.
No results yet on bug fix
In the software implementation, there are some restrictions on B slices. There are also some mismatches
between text and software for P slices, so this proposal would introduce the same bug for B slices as well.
G151 is related (it does it the other way round; the two proponents cross-checked each other), as is G718.
(The intention was that intra would be more probable in P than NxN and 2Nx2N, which is the only
difference.)
Disposition:
- We want only one solution
- Side activity (joint with WD and software coordinator, see result in JCTVC-G1042)

JCTVC-G1042 Harmonization of the prediction and partitioning mode binarization of P
and B slices [S. Oudin, B. Bross et al.] [late]
qq
Suggest to use P slice binarization (as per G785), but cleaning up various issues. One issue about
describing context derivation may need further study and cleanup.
Decision: Adopt.

JCTVC-G906 Cross check report of Samsung's Unified Pred_type coding in CABAC
(JCTVC-G785) [W. Zhang (ZTE)] [late]

5.8.3 Reference picture list management (subsumed under HL syntax)

5.9 High-level syntax and slice structure

5.9.1 High-level syntax and systems usage of bitstreams

JCTVC-G110 Implication of parallelized bitstreams on single core decoder architectures
[W. Wan, T. Hellman (Broadcom)]
There have been several methods proposed and adopted in previous JCT-VC meetings to enable
parallelism for multi-core architectures. It was reported to be currently unclear whether keeping these
multiple approaches is necessary in the final HEVC design as they may be redundant and/or could be
unified. This contribution suggests that single core decoder architectures may not benefit and in fact, may
be penalized by having to support some of these parallelism techniques. As an example of this, this
submission reports on the impact of several parallelism techniques on a single core hardware-based
design, in an effort to better understand the trade-offs between different techniques to
enable parallelism.
Discussion:
• Slices become difficult when the number of slices gets high – establish appropriate level limits,
  etc., as in AVC
• Entropy slices, similar to regular slices, require CABAC initialization – can be handled by a
  similar approach
• Wavefronts – similar re-initialization issue – also requires multiple sub-stream pointers –
  processing affected by maximum number and interleaving of sub-streams (which may mean that
  there should be bounds on these aspects)
• Tiles – impact depends on the variant:
  o tile_boundary_independence_idc
  o potential interaction with filtering across tile boundaries
  o Storage requirements for vertical tile boundaries – substantial penalty. It was remarked
    that G197 can help with this.
Among these issues, most were suggested to be minor (if we pay attention to the details of the
implications) except, depending on how it is handled, the vertical boundaries of tiles.

JCTVC-G149 Options for High-Level Syntax for Multistandard Scalability [S. Wenger
(Vidyo)]
(Submitted as an information document.)
Discussed are options for high level syntax for multistandard scalability, with a focus on the case of an
AVC or SVC base layer, and one or more spatial/SNR HEVC enhancement layers, and with RTP
transport/multiplexing. A few remarks are included with respect to other standards and other
transport/multiplexing schemes. It is the author’s belief that multistandard-scalability can be enabled with
only moderate increases in high level syntax complexity.
In AVC there are 6 remaining “reserved” (falling into two categories of allowed NAL unit ordering
behaviour) and 9 “unspecified” (at least 6 of which have been used by others).
The emphasized potential approach is to use external multiplexing (e.g. through RTP), although this may
require a relatively tight design coupling with specification of external interface points.
It was noted that the NAL unit type design for HEVC uses a different number of bits than AVC.

5.9.2 Supplemental information

JCTVC-G079 SEI message for display orientation information [J. Boyce, D. Hong, S.
Wenger (Vidyo)]
This contribution proposes an SEI message for describing display orientation information, to be included
in the HEVC design (and, by amendment, in AVC, although that is outside the scope of JCT-VC).
The proposed SEI message indicates to the renderer a request to rotate and/or flip the decoded
picture for proper display, after the normal decoding process. Because handheld video capturing devices
allow changing the picture capture orientation dynamically, using an SEI message allows dynamic
changes to the picture display orientation, temporally aligned with the compressed video data.
The same proposal is being made to MPEG as m21659, and to VCEG as T09-SG16-C-0690, for
consideration as an amendment to AVC. The concept was originally proposed in contribution JCTVC-
E280.
It was asked why the semantics are in units of degrees, and in whole-integer units in particular, and why
this is variable-length coded. The contributor indicated that the VLC coding was an error, and that there
was no particular need for whole-degree units.
A participant remarked that the persistence syntax should be checked.
Another participant expressed potential interest in three-dimensional coordinates rather than just rotation
(and flip).
It was noted that there was a “shall” in the proposed text, which was indicated to be an error.
Our plan of action is to wait to see what the parent bodies do with this in the AVC context and coordinate
the outcome.

JCTVC-G092 AHG22: High-level signaling of lossless coding mode in HEVC [M. Zhou
(TI)]
Efficient lossless coding is asserted to be required for real-world applications such as automotive vision,
video conferencing and long distance education. This contribution proposes signalling methods to enable
lossless coding at picture, region and LCU levels. Specifically, sps_lossless_coding_enabled_flag and
pps_lossless_coding_enabled_flag are defined in the SPS and PPS to signal whether a group of pictures is
encoded losslessly. If pps_lossless_coding_enabled_flag is not set in the PPS,
aps_lossless_coding_enabled_flag is defined in the APS to indicate whether there are regions in a picture that are
encoded losslessly; if so, the lossless coding region information is encoded in the APS. If
aps_lossless_coding_enabled_flag is not set in the APS, slice_lossless_coding_enabled_flag is defined in
slice_header() to signal whether a 1-bit lossless coding flag is present at the LCU level. This three-level
signaling method is asserted to provide a flexible way of signaling lossless coding for different use cases.
This proposal focuses only on high-level syntax for support of this functionality. (Obviously, that is only
desirable if the low-level syntax also supports this functionality.)
Further study of lossless coding has already been (and remains) encouraged, and we have had AHG22 to
investigate this topic.
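
The cascade of flags described in the proposal can be read as a simple decision chain, sketched below. This is one possible reading only: the function name, the returned labels, and the assumption that the SPS flag gates all lower levels are hypothetical and not confirmed by the contribution.

```python
def lossless_signalling_level(sps_flag, pps_flag, aps_flag, slice_flag):
    """Return the level at which lossless coding is signalled.

    Assumption: the SPS flag must be set for any lower-level flag to apply.
    """
    if not sps_flag:
        return "none"
    if pps_flag:
        return "picture"  # whole pictures referring to this PPS are lossless
    if aps_flag:
        return "region"   # lossless regions are described in the APS
    if slice_flag:
        return "lcu"      # a 1-bit flag is present for each LCU
    return "none"


print(lossless_signalling_level(True, False, True, False))
print(lossless_signalling_level(True, False, False, True))
```

Each lower-level flag is only consulted when the level above it is not set, which mirrors the conditional presence of the syntax elements in the description.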

JCTVC-G583 Reducing output delay for "bumping" process [J. Samuelsson, R. Sjöberg
(Ericsson)]
(Discussion chaired by J. Boyce.)
This document proposes to add a flag called output_process_flag to the slice header and to replace the
“bumping” process by an output process invoked based on the value of output_process_flag of
the current picture. The document claims that the use of output_process_flag reduces unnecessary picture
output delay. It is also claimed to be well suited for temporal scalability, in which sub-sequences may use
a smaller DPB size.
When the decoder receives a picture with the flag set, all pictures with lower POC values are ready to be
output.
Would replace the current bumping process. This may restrict the encoder’s possible picture coding
patterns and impose additional constraints including buffering capacity impacts. Perhaps a real-time
encoder would not know this information at the time of encoding. Possible error resiliency concerns if a
picture with this flag set was lost.
Suggestion to retain the existing bumping process but add this as a decoder hint, put a flag in the
sequence header to indicate the presence of this flag in the slice header. This would allow for quicker
output of frames at the decoder.
Suggestion to show example picture coding patterns where this would provide additional benefit over the
latency count limit adopted in JCTVC-G546. A participant pointed out that putting a non-normative
syntax element in the slice header would differ from past practice. Could be put in an SEI message, but is
a lot of overhead for a single bit flag. Could alternatively be put into a VUI-like part of the PPS.
For further study.
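
The proposed output rule can be sketched as follows. The function name and the representation of the pictures waiting for output as a plain list of POC values are assumptions for illustration; DPB storage, reference marking and the handling of the current picture itself are not modelled.

```python
def pictures_to_output(waiting_pocs, current_poc, output_process_flag):
    """POCs that may be output once the current picture is received."""
    if not output_process_flag:
        return []  # nothing is released by this picture
    # All waiting pictures with a lower POC are ready, in output order.
    return sorted(poc for poc in waiting_pocs if poc < current_poc)


print(pictures_to_output([8, 2, 4, 6], 5, True))   # [2, 4]
print(pictures_to_output([8, 2, 4, 6], 5, False))  # []
```

The sketch also makes the error resiliency concern visible: if the picture carrying the flag is lost, the waiting pictures are not released by this rule.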

5.9.3 Parameter sets and slice header

JCTVC-G325 AHG15: Picture size signaling [Y. Chen, Y. -K. Wang, M. Karczewicz
(Qualcomm)]
The current HEVC WD signals the decoded picture size, for both width and height, in luma samples. In
this document, it is proposed to signal the coded picture size in units of LCUs, and in addition to
signal the offset between the coded picture size and the decoded picture size, in units of SCUs.
Furthermore, this document raises a discussion on the value range of picture sizes.
Remarks:
• What about the cropping window? That would still be used on top of this, as in AVC.
• It was suggested that if the picture size is signalled in LCU and SCU units, those should be
  signalled before the parameters that depend on them.
• It was noted that the proposed signalling, which would require computations involving several
  variables to determine the width of the picture, seems undesirably obtuse. For now, let’s keep the
  current syntax element pic_width_in_luma_samples. Decision: However, the width and height
  should be coded using ue(v) rather than u(16). No range needs to be directly specified, as this
  would be a profile/level constraint.
• The current software requires the image to be an integer multiple of the SCU size. The cropping
  window is presumed to apply as in AVC to support other picture sizes.
• We note that this means that the actual width of an encoded picture to support a particular image
  width in luma samples (e.g., a picture 14 samples wide) becomes a function of the selected SCU
  size, which seems undesirable.
• Where should the boundary for picture extrapolation for motion compensation purposes be?
• The subject needs further thought.
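
Two points from the remarks above can be illustrated with a short sketch: the ue(v) Exp-Golomb code has no built-in upper bound (unlike u(16)), and the coded width needed for a given image width depends on the SCU size. The function names ue_bits and padded_width are hypothetical, and the SCU sizes in the example calls are chosen only for illustration.

```python
def ue_bits(code_num):
    """Exp-Golomb ue(v) code word for a non-negative integer, as a bit string."""
    if code_num < 0:
        raise ValueError("ue(v) codes non-negative values only")
    value = code_num + 1
    leading_zeros = value.bit_length() - 1
    return "0" * leading_zeros + format(value, "b")


def padded_width(width_in_luma_samples, scu_size):
    """Smallest multiple of the SCU size that covers the picture width."""
    return -(-width_in_luma_samples // scu_size) * scu_size


print(ue_bits(3))            # 00100
print(len(ue_bits(1920)))    # 21 bits
print(len(ue_bits(70000)))   # 33 bits, a value that u(16) cannot represent
print(padded_width(14, 8), padded_width(14, 32))  # 16 32
```

For typical picture widths ue(v) costs a few more bits than u(16), which is negligible in a parameter set, while leaving the value range to be constrained by profiles and levels.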

JCTVC-G330 AHG15: Syntax elements in adaptation parameter set [Y. Chen, Y. -K.
Wang, M. Karczewicz (Qualcomm)]
This document discusses what syntax elements should be included into the Adaptation Parameter Set
(APS). Currently, the APS may include ALF and SAO parameters. In this document, inclusion of other
syntax elements, namely 1) reference picture list construction related syntax elements; 2) weighted
prediction related syntax elements; 3) decoded picture buffer management related syntax elements; and 4)
quantization matrices table, in APS is discussed.
For the RPLS and DPB aspects, there are other contributions relating to this (e.g. AHG21 related).
For the quantization matrices, previous discussions have indicated that these should not change within a
picture.
One question is whether the APS can be sent out of band. Another is how much data will be in it, how
often it can change, and whether it can change from slice to slice within the same picture.
For loss resilience, a system design might want to repeat the APS data or send it out of band, but that gets
difficult if it has a lot of data in it.
In AVC, quantization matrix data is sent at the SPS or PPS level. In HEVC, there are more transforms
and bigger transforms, and thus more data to represent quantization matrices.
It appears that weighted prediction and reference picture list construction data should be treated similarly.

JCTVC-G332 Multiple Adaptation Parameter Sets Referring [M. Li, P. Wu (ZTE)]
In this document, a multiple adaptation parameter sets (APSs) referring approach is proposed. As
described, the APS tends to carry the coding parameters which are more likely to be changeable from
picture to picture, or even from slice to slice. This referring design is proposed to be at the slice layer. In a
slice layer encoding/decoding, multiple APS IDs can be made available, and each coding tool, such as
SAO or ALF, etc., is initialized by activating only once in one APS of the several referred ones. The
encoder would be allowed to code SAO and/or ALF parameters by directly utilizing whole or partial
information already presented in the existing APSs. This facilitates flexible configuration of the coding
tools and saves bits when re-using the former tool parameters while coding the current slice. The
proposed scheme can also serve both the case when parameters change from slice to slice, such as in
video coding tool with Weighted Prediction, and the case when parameters intend to be unchanged for the
whole sequence.
Quantization matrix could also end up in the APS (not there currently). It was mentioned that
JCTVC-G295 also discusses the idea of different APS IDs for different aspects. Contribution G658 was also
mentioned to be relevant.
At the moment we have three things in the APS. Each APS could hypothetically be populated with only
one of these things. It seems undesirable to have a scheme in which the size of the slice header would
keep growing as more things are put into the APS syntax.
The idea of parameter set updates was proposed in JCTVC-E309. It could be an interesting alternative.
For further study. Some form of increased flexibility for APS formatting and usage may be desirable.
A BoG (S. Wenger) was asked to consider APS-related contributions (G122, G220, G295, G330, G332,
G570, G658, G566).

JCTVC-G566 Syntax refinements for SAO and ALF [S. Esenlik, M. Narroschke, T. Wedi
(Panasonic)]
(Initial discussion in Track A.)
The Adaptation Parameter Set (APS) syntax structure was adopted during the 6th JCT-VC meeting in
Torino to be used in the conveyance of parameter sets of Adaptive Loop Filter (ALF) and Sample
Adaptive Offset (SAO). However, APS is a very recent syntax structure whose performance and usage
need to be improved and refined. In the current proposal, three issues related to APS and ALF/SAO syntax
are addressed. The proposed solutions provide on average 0.2% coding gain for low-bitrate applications,
improve parsing robustness and reduce latency of HEVC.
The problems stated in this contribution are
resolved by three minor modifications in the slice header syntax structure and one minor modification in
the alf_cu_control_param syntax structure.
(1) Introduction of new flags in slice header (ALF_flag and SAO_flag). ALF_flag eliminates dependency
of decoding APS and slice header. (2) ALF_CU_control_flag is coded by fixed code.
1500Byte slice: HE: xx/0.0/0.1/xx LC: xx/0.0/0.0xx
If higher Qp is used, 0.1% gain is observed.
Comments from BoG: It is worthwhile to have ALF_flag in the slice header (as the loss of this flag would affect
parsing). Support for not using CABAC on ALF_cu_control_flag. SAO_flag might be useful in the slice header
when the previous frame’s APS parameters are to be re-used, but the SAO flag in the slice header would then
overwrite the previous APS. As this relates to concepts of using APS parameters, it relates to the high-level syntax
(APS) discussion.
Decision: Adopt duplicate ALF_flag in slice header (value of duplicate flag must always match), discard
using CABAC for ALF_cu_control_flag.
Further study on duplicate SAO flag in slice header, as this may be conditional on decision on APS
concepts (re-use of previous SAO). May be discussed in track B.
(Further discussion in Track B.)
To Wenger BoG.

JCTVC-G122 VLC for high level syntax (ALF and SAO parameters) [V. Sze (TI)]
To Wenger BoG.

JCTVC-G606 Crosscheck - VLC for high level syntax (ALF and SAO parameters) (G122)
[T. Nguyen (Fraunhofer HHI)] [late]
To Wenger BoG.

JCTVC-G334 AHG15: On sequence parameter set and picture parameter set [Y. -K.
Wang, Y. Chen, Y. Zheng, W. -J. Chien (Qualcomm)]
This document includes some discussions on some SPS and PPS syntax elements, on their value ranges,
syntax element coding, and/or semantics.
Value ranges were discussed. It is clear that value ranges need to be specified, although it may be
appropriate for the range to be specified in the Profile/Level definitions rather than in the semantics
section in some cases.
Regarding the part of the proposal on a PPS-level flag to disable the temporal MV predictor, the
contributor indicated that it is not necessary to further consider this.

5.9.4 Tiles
BoG [M. Horowitz] to discuss tiles and wavefronts.

Page: 149 Date Saved: 2011-12-04


JCTVC-G183 Test result of low delay tile [K. Kazui, S. Shimada, J. Koyama, A. Nakagawa
(Fujitsu)]

JCTVC-G968 A Cross-check report for JCTVC-G183 on low delay tile [K. Misra, A. Segall
(Sharp)] [late]

JCTVC-G194 AHG4: Non-cross-tiles loop filtering for independent tiles [C.-Y. Tsai, C.-W.
Hsu, C.-Y. Chen, C.-M. Fu, Y.-W. Huang, S. Lei (MediaTek), A. Fuldseth
(Cisco)]
Same concept as disabling filtering across slice boundaries as adopted in Daegu in response to JCTVC-
D128.
Decision: Adopted (as recommended by BoG, see G1025).

JCTVC-G197 AHG4: Low latency CABAC initialization for dependent tiles [C.-W. Hsu,
C.-Y. Tsai, Y.-W. Huang, S. Lei (MediaTek)]
This proposal included not only CABAC initialization, but also requiring entry point signalling for
dependent tiles. (Entry points were already present for independent tiles.)
Decision: Adopted (as recommended by BoG, see G1025).

JCTVC-G961 Cross-check for JCTVC-G197 (AHG4: Low latency CABAC initialization
for dependent tiles) [J. Zhao, A. Segall (Sharp)] [late]

JCTVC-G315 AHG4: Unification of picture partitioning schemes [M. Coban, Y. -K. Wang,
M. Karczewicz, Y. Chen, I. S. Chong (Qualcomm)]

JCTVC-G317 AHG4: Dependency and loop filtering control over tile boundaries [Y. -K.
Wang, Y. Chen, I. S. Chong, M. Coban, M. Karczewicz (Qualcomm)]

JCTVC-G318 AHG4: Tile groups [Y. -K. Wang, Y. Chen, M. Coban, M. Karczewicz
(Qualcomm)]

JCTVC-G453 New definition for tile/slice boundary [K. Sugimoto, A. Minezawa, S.
Sekiguchi (Mitsubishi)]

JCTVC-G454 Parallel processing of ALF and SAO for tiles [K. Sugimoto, A. Minezawa, S.
Sekiguchi (Mitsubishi)]

JCTVC-G618 Line buffers problem in HEVC [Andrey Norkin, Rickard Sjöberg
(Ericsson)]
Note from track A: Consider context with G212 which was adopted and gets rid of line buffers except for
de-blocking – therefore G618 will be tested only in context of de-blocking.

JCTVC-G802 Cross-check of "AHG4: Non-cross-tiles loop filtering for independent tiles"
(JCTVC-G194) [M. Horowitz, S. Xu (eBrisk)]

5.9.5 Wavefront parallel processing

JCTVC-G199 AHG4: Wavefront tile parallel processing [C.-W. Hsu, C.-Y. Tsai, Y.-W.
Huang, S. Lei (MediaTek)]

JCTVC-G627 Cross-check of JCTVC-G199, "AHG4: Wavefront tile parallel processing"
[A. Fuldseth (Cisco)] [late]

JCTVC-G722 Harmonization of entry points for tiles and wavefront processing [A. Segall,
K. Misra (Sharp)] [late]

JCTVC-G815 Category-prefixed data batching for HEVC wavefront and PIPE/V2V/V2F
coding [G. J. Sullivan (Microsoft)]

5.9.6 NAL unit header

JCTVC-G331 AHG 15: On NAL unit types and slice types [Y. -K. Wang, Y. Chen
(Qualcomm)]
At the previous JCT-VC meeting, it was agreed to change nal_ref_idc (2 bits) to nal_ref_flag (1 bit), and
change nal_unit_type from 5 bits to 6 bits. Consequently, the total number of hypothetically possible
NAL unit types doubled from 32 to 64. This document proposes an allocation of the 64 values to different
categories of NAL unit types, and raises some NAL unit type related questions for discussion.
Make access unit delimiters mandatory? (without a decoding process) (Put POC in there?) Put it in SEI,
and make SEI the first NAL unit of the picture?
Also, it was proposed to add slice types 3 to 5, with similar semantics as slice types 5 to 7 in AVC. No
action taken – it doesn’t seem clear that this trick had substantial value for AVC, and it was informally
reported that some encoders may have violated the constraint (if the values cannot be relied on, the trick
is useless).
Some other aspects were discussed in the contribution, such as slice-level SEI.
For further study.
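
As an illustration of the header change recalled in the G331 discussion (nal_ref_flag of 1 bit and nal_unit_type of 6 bits), the following minimal sketch unpacks a single header byte. The leading forbidden_zero_bit, the MSB-first bit order, and the function name are assumptions made for this sketch (modelled on the AVC-style header); they are not taken from the WD text.

```python
def parse_nal_header_byte(b):
    """Split one header byte into (forbidden_zero_bit, nal_ref_flag, nal_unit_type).

    Assumed layout, most significant bit first:
    1 bit forbidden_zero_bit, 1 bit nal_ref_flag, 6 bits nal_unit_type.
    """
    if not 0 <= b <= 0xFF:
        raise ValueError("expected a single byte value")
    forbidden_zero_bit = (b >> 7) & 0x1
    nal_ref_flag = (b >> 6) & 0x1
    nal_unit_type = b & 0x3F  # low 6 bits
    return forbidden_zero_bit, nal_ref_flag, nal_unit_type


# Number of hypothetically possible type values for a 5-bit and a 6-bit field.
NUM_TYPES_5_BIT = 1 << 5
NUM_TYPES_6_BIT = 1 << 6
```

Widening the type field by one bit doubles the number of codable values, which is the 32 to 64 increase noted above.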

JCTVC-G336 AHG 17: Unified NAL unit header design for HEVC and its extensions [Y.
Chen, Y.-K. Wang, M. Karczewicz (Qualcomm)]
In AVC extensions, such as SVC and MVC, NAL unit header extensions were added. This reportedly
made AVC, SVC and MVC substantially different from each other in terms of NAL unit headers. In
HEVC, the current NAL unit header contains basic syntax elements similar to AVC NAL unit header,
except for the temporal_id that indicates the temporal layer of the NAL units when the bitstream is
temporally scalable. The contributor said that if we follow the design principles in AVC scalable or
multiview extension, the HEVC extensions may have similar NAL unit header extensions.
In this proposal, a NAL unit header is proposed which is suggested to be used for both the HEVC non-
scalable bitstreams as well as the scalable bitstreams conforming to potential scalable or multiview
extensions of HEVC. This NAL unit header differs from the current HEVC NAL unit header in the
following properties: 1) Fixed NAL unit header length for one whole coded video sequence while the
length can vary across different coded video sequences; 2) Representation of the scalability syntax
elements in the NAL unit header, and when a particular syntax element is not needed it is not present.
The scheme has a “NAL unit header parameter set” that determines the properties of the NAL unit
header.
A participant asked how the network would know which NUHPS is active.
It was commented that a “middle box” scanning a bitstream would need to be more intelligent and
capable of handling parameter sets and to be “stateful”, and that this seems undesirable. It was also
remarked that supporting flexible positions of syntax elements may be harder to handle than fixed-
position processing.
It was suggested to have a relationship to the concept proposed in JCTVC-E279.

JCTVC-G607 High-Level Syntax for Bitstream Extraction [R. Sjöberg, T. Rusert
(Ericsson)]
This contribution proposes changes to the NAL unit header and SPS, aiming at providing high-level
signalling to support bitstream extraction for both temporal scalability (as supported in the current WD)
and any other bitstream scalability (as defined in the future). Compared to the current WD, it is proposed
to remove the second byte from the NAL unit header, and instead introduce a NAL unit header extension
bit in the first byte. If the extension bit is equal to 0, the NAL unit header has only one byte. If the
extension bit is equal to 1, a byte-aligned SPS identifier is sent in the NAL unit header, which co-serves
as generic layer identifier and priority indicator.
Syntax for signaling of dependencies between scalable layers and layer properties such as temporal_id
and level_idc are introduced in the SPS. The proposed mechanism is claimed to be beneficial both in the
context of temporal scalability as supported in the current WD, and when HEVC is extended with new
scalability tools in the future. HM-based software implementation of the proposed signaling and bitstream
extraction are available as part of Ericsson’s response to MPEG’s Call for Proposals on 3D Video Coding
Technology.
It is claimed that setting the proposed extension bit to 0 in applications where no temporal (and no other)
scalability is required avoids a small but (asserted to be) unnecessary bit rate overhead that is present
according to the current WD. It is further claimed that if temporal scalability is supported in the bitstream,
the proposed mechanism enables encoders to provide a separate level_idc indication for each temporal
layer, and thus more accurate signaling of required decoder capabilities than in the current WD. It is also
claimed that when new mechanisms for scalability are introduced in the future, the proposed design can
be extended to provide signaling for both existing and new scalability mechanisms without changing the
NAL unit header. Finally, it is claimed that if temporal scalability is used (or other scalabilities in the
future), the proposed mechanism facilitates a bitstream extraction process that can associate both VCL
and non-VCL NAL units with scalable layers in a simple way.

It was noted that we have output_flag in the NAL unit header. It was remarked that this was to enable
uses such as indicating that a base layer is such low quality as to be not used for output.
The scheme includes sending an SPS ID in the NAL unit header.
The contribution raises some similar issues as with G336.
For further study.

5.9.7 Random access

JCTVC-G159 Random Access Detection and Notification [Hendry, S. Park, Y. Jeon, B.
Jeon (LG)]
The submitter indicated that “leading pictures” that follow a Clean Random Access (CRA) picture in
decoding order can still use reference pictures that preceded the CRA in both decoding order and output
order. It is suggested that when a random access event occurs, a decoder may suddenly be fed with an
encoded stream starting from a CRA NAL unit, and that the decoder should be told by syntax that this
affects the decoding of the leading pictures that follow the CRA. The current WD 4 of HEVC does not have any
mechanism for a decoder to detect or to be aware of a random access event. This contribution proposes two
mechanisms for random access detection and one mechanism to enable any application (e.g., a video player
application) to notify the HEVC decoder in syntax when random access occurs.
This is essentially a proposal to provide the same thing as an MPEG-2 “broken_link” flag.
It is proposed to put a flag for this in the NAL unit header.
Some possible types of indicators would be
• SEI
• Buffer description list
• NAL unit header flag
• Slice header
• System-level indicators (e.g., APIs) external to the bitstream
Some skepticism was expressed regarding the need for syntax support – some participants expressed the
view that a syntax indicator is not needed. It was remarked that we are generally especially reluctant to
use NAL unit header bits for such indicators.

JCTVC-G158 Undiscardable Leading Picture for CRA [Hendry, S. Park, B. Jeon (LG)]
In the current WD 4 of HEVC, the decoder flushes all reference pictures in the Decoded Picture Buffer (DPB)
prior to decoding the first key picture that follows a CRA picture in decoding order. This contribution
suggests that in some use cases some leading pictures used for reference, which are called Undiscardable
Leading Pictures (ULPs), should not be flushed prior to decoding the key picture and should be allowed to be
used as reference for inter prediction of pictures that follow the key picture, in order to improve coding
efficiency. The contribution proposes syntax and semantics of new elements to signal ULPs in the
header of a CRA slice. It is reported that, using special input sequences that contain a scene change before
the CRA picture, a modified HM-4.0 that implements the ULP concept gives gains of 0.2% Y, 0.2% U, 0.2% V for
RAHE and 0.3% Y, 0.1% U, 0.2% V for RALC.
Suggestion: This is already supported, by using a recovery point SEI message (already included in
HEVC, in principle), rather than using IDR or CRA.

JCTVC-G319 AHG15: Conforming bitstreams starting with CRA pictures [Y. Chen, Y. -
K. Wang, M. Karczewicz (Qualcomm)]
In the current HEVC design, CRA (Clean Random Access) pictures are identified by a new NAL unit
type. It is reportedly common that a device with a conforming decoder performs random access at a CRA
picture. However, a bitstream starting at a CRA picture is considered non-conforming, thus a conforming
decoder may not be able to properly handle such bitstreams.
In this proposal, it is proposed that a bitstream starting from a CRA picture can be conforming. Such a
conforming bitstream may or may not contain leading pictures associated with the CRA picture. A
leading picture associated with a CRA picture is a coded picture that follows the CRA picture in decoding
order but precedes the CRA picture in output order. The proposed normative changes include: 1) skipping
the decoding and output of the leading pictures associated with the starting CRA picture, when present; 2)
HRD modifications to guarantee that all bitstream conforming conditions are fulfilled by a conforming
bitstream starting with a CRA picture, regardless of whether the leading pictures associated with the CRA
picture are present.
It was remarked that it may be difficult to specify conformance of a bitstream in such a case, as there
would be implied dangling references to non-existing information. A bitstream needs to be testable for
conformance (without access to unavailable data).
The concept seems useful, if it is possible to specify the bitstream conformance clearly, and the decoder
behaviour as well. Further study is encouraged.
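
The leading-picture definition used in G319 (a picture that follows the CRA picture in decoding order but precedes it in output order) and the proposed skipping behaviour can be sketched as below. This is only an illustration under stated assumptions: the Picture record and the function name are hypothetical, POC is used as the output order, and POC values are assumed to be directly comparable across the whole list (no IDR reset or wrapping).

```python
from collections import namedtuple

# Hypothetical minimal picture record: poc gives output order, is_cra marks CRA pictures.
Picture = namedtuple("Picture", ["poc", "is_cra"])


def drop_leading_pictures(pictures):
    """Return the pictures to decode when the bitstream starts at a CRA picture.

    `pictures` is given in decoding order and its first entry must be a CRA picture.
    Pictures whose POC is lower than that of the starting CRA picture are its
    leading pictures; they are neither decoded nor output.
    """
    if not pictures or not pictures[0].is_cra:
        raise ValueError("bitstream must start with a CRA picture")
    cra_poc = pictures[0].poc
    return [p for p in pictures if p.poc >= cra_poc]
```

For example, with a decoding order of POC 8 (CRA), 4, 2, 6, 16, 12, the pictures with POC 4, 2 and 6 are leading pictures and are dropped, leaving POC 8, 16, 12.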

JCTVC-G533 On syntax for clean random access (CRA) pictures [Y. Park, H. Yang, C.
Kim (Samsung)]
This contribution discussed syntax for clean random access (CRA) pictures.
Some new material had been added in revisions of the original contribution.
The contributor assumed that the “leading pictures” are decoded and displayed, which is not the model of
what we refer to as CRA. As currently specified, decoder behaviour for random access is not specified (as
this functionality is considered out of the scope of the standard).
It was suggested that some of the concerns expressed in the contribution might be addressed by adding
some informative note in the standard to describe how a system can use CRA pictures (e.g. with some
external signal to indicate that random access is being performed or with some external discarding of
“leading pictures” from the bitstream before the decoder processes the remaining data).

JCTVC-G584 Temporal layer access pictures and CRA [Jonatan Samuelsson, Rickard
Sjöberg (Ericsson)]
The contribution explores the relationship between the current CRA design and temporal layering.
This contribution presents a proposal to change the signaling of Clean Random Access (CRA) pictures
and Temporal Layer Switching Points in what is referred to as Temporal Layer Access (TLA) pictures. It
is proposed to replace the CRA Network Abstraction Layer (NAL) unit type with a TLA NAL unit type.
The proposed TLA NAL unit type imposes constraints on the bitstream and does not have an impact on
the decoding process.
It is stated in the contribution that both random access information and temporal layer switching
information is of high value to a network node and thus should be available in NAL unit header,
independent of data outside that NAL unit header, specifically in the Sequence Parameter Set (SPS) and
Picture Parameter Set (PPS). It is further stated that a unified syntax and definition of TLA pictures
makes the standard text more readable and comprehensive.
It was remarked that a CRA must be an intra picture.

The proposal suggests, for temporal_id values greater than 0, to replace the current CRA design with a
temporal layer switching point indicator.
It was suggested that the current CRA design seems to not be useful for temporal layer 0. This seemed to
be generally agreed. So it may be interesting to explore an alternative specification for the syntax flag in
that case. Some misgivings were expressed about the specifics of the proposal, although the general idea
appears interesting and potentially useful.
It was noted that this proposal changes the temporal layer switching behaviour from what is currently
specified. This proposal disallows having a switching point picture that uses an earlier picture in the same
temporal layer as a reference, and it was questioned whether this is desirable or not.
Further study was encouraged (delegate to an AHG).

JCTVC-G834 Cross-check result of Samsung’s proposal JCTVC-G533 by LG Electronics
[Hendry, S. Park, B. Jeon (LG)] [late]
Not necessary to be presented – can be studied.

5.9.8 Decoded picture buffering and reference picture signaling

JCTVC-G157 Reference List Construction for Random Access Settings [Hendry, S. Park,
B. Jeon (LG)]
(This had not been reviewed in the BoG relating to DPB topics.)
In the 6th JCT-VC meeting, based on contribution JCTVC-F433 and JCTVC-F701, reference picture list
construction by using 3 higher quality and 1 nearest reference pictures has been adopted in the common
conditions for low delay settings. This contribution proposes to construct default reference picture lists
differently. When constructing RefPicListX, the proposed scheme suggests sorting reference pictures in
Decoded Picture Buffer by POC first and then by the picture-level QP value relative to the QP value of
current picture, instead of only by POC as it is done currently.
It was noted that the described behaviour is only correct for B pictures, and that the description is only a
matter of the default list order; the default can be changed by reference picture list modification syntax. If
this behaviour is desirable, it can be done explicitly by the encoder in this way.
It was remarked that this adds an extra sorting step and complication to the initialization.
It was noted that explicit mode reference picture marking can also be used to change the default list
initialization values.
It was noted that larger aspects of reference picture list construction are being considered, and remarked
that this may be an over-optimization relative to excessively emphasizing our common conditions
configurations, which are not part of the standard – rather, they are just a matter of how we are using the
standard in some example tests.
It was noted that the selection of QP values could end up being manipulated for purposes of reference
picture list construction, which seems like an unusual repurposing, and might result in sending more PPS
syntax structures so that this manipulation can be done.
The group found it interesting that gain was being reported due to using a different set and ordering of
reference pictures than what is our current common conditions. So there could be an opportunity here for
non-normative coding efficiency improvement of future common conditions (0.4% Y, 0.4% U, and
0.4% V for RA HE).
It was noted that G589 showed some gain (0.2%) relative to our current common conditions while using
much less picture storage. See notes elsewhere. This aspect was discussed again on Nov 29 (FB
chairing). Further study was encouraged.

JCTVC-G500 Cross-verification of LG's Reference List Construction for Random Access
Settings (JCTVC-G157) by Panasonic [C. S. Lim, S. M. T. Naing, V.
Wahadaniah (Panasonic)]

JCTVC-G166 AHG21: Explicit Reference Pictures Signaling with Output Latency Count
Scheme [Hendry, S. Park, B. Jeon (LG)]
Reviewed in BoG.

JCTVC-G832 Cross-check of "AHG21: Explicit Reference Pictures Signaling with Output
Latency Count Scheme" (JCTVC-G166) [Y. Park, I.-K. Kim, C. Kim
(Samsung)] [late]

JCTVC-G198 AHG21: Inter reference picture set prediction syntax and semantics. [T.K.
Tan, C.S. Boon (NTT Docomo)]
Document JCTVC-F493 proposed the explicit signaling of reference pictures needed for the inter
prediction of the current and future pictures, using buffer descriptions (reference picture sets). A
reference picture set is a set of ΔPOC values. ΔPOC values are the picture order count (POC) values of the
reference pictures relative to the current picture. Template reference picture sets are signaled in the picture parameter
set (PPS) and referred to by each slice.
This contribution proposes to further reduce the amount of bits necessary for signaling the reference
picture set by predicting the ΔPOC values using the ΔPOC values from a reference picture set already
present in the PPS.
Based on the latest draft of the reference picture set syntax from the ad hoc group on Reference picture
buffering and list construction (AHG21), the number of additional PPS signaling bits needed for the
random access (RA) and low delay (LD) common conditions are reported to be 288 and 201 bits,
respectively. Using the proposed inter reference picture set prediction method, the numbers of bits needed
are reported to be reduced to 144 and 106 bits, respectively. This represents a reduction of 50% and 47%,
respectively.
It was noted that there are multiple inputs that remain under consideration regarding the APS and details
of the RPS design.
It was suggested that this may be a degree of over-optimization within the context of a scheme that is not
yet a really settled area of the design. There are also multiple ideas on the table that are available for
compressing the number of bits needed for the RPSs. G643 was suggested as one example. However, it
was also suggested that this proposal has been well studied and implemented, has good text, etc., and
seems relatively mature. It was remarked that without this proposal, the current G1002 scheme would
have an obvious redundancy in relation to cyclic picture structure encoding.
Decision: Adopted (Part 1 “full inter-RPS prediction”).
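
As a small worked check of the G198 description, the sketch below expresses a reference picture set as ΔPOC values and recomputes the reported savings from the quoted bit counts. The function names and the example POC values are hypothetical; the sketch does not attempt to reproduce the proposed inter-RPS prediction syntax itself.

```python
def delta_pocs(current_poc, reference_pocs):
    """Express reference pictures as POC differences relative to the current picture."""
    return [ref_poc - current_poc for ref_poc in reference_pocs]


def percent_reduction(bits_before, bits_after):
    """Relative saving, in percent, when going from bits_before to bits_after."""
    return 100.0 * (bits_before - bits_after) / bits_before


# Bit counts as reported in the contribution summary.
ra_saving = percent_reduction(288, 144)  # random access (RA)
ld_saving = percent_reduction(201, 106)  # low delay (LD)
```

The RA figure is exactly 50%, and the LD figure rounds to 47%, which agrees with the percentages quoted in the summary.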

JCTVC-G314 AHG21: On DPB management [Y.-K. Wang, Y. Chen (Qualcomm)]
Reviewed in BoG.

JCTVC-G398 High-level Syntax: Marking process for non-TMVP pictures [B. Li (USTC),
J. Xu (Microsoft), H. Li (USTC)]
(Chaired by J. Boyce.)

In the current HM design, MVP and merge candidates may be incorrect when packet loss exists. The error
may influence motion vectors of all the following pictures. Disabling TMVP for some pictures (non-TMVP
pictures) may stop the error propagation for motion vectors. However, it cannot always be guaranteed that
the error can be stopped with non-TMVP pictures. This contribution presents a mechanism to make sure
that the error can be stopped after non-TMVP pictures.
enable_temporal_mvp_flag had previously been adopted into the PPS at the Torino meeting, but was not yet
included in the WD. The contribution proposes semantics changes for the use of that flag.
It was suggested to move enable_temporal_mvp_flag to the SPS, which would have a cost of 3% if turned off, but
may be worthwhile in error-prone conditions.
Decision: Adopted solution 2.
Will work with editors to incorporate both this adoption and the previous BoG adoption of the
enable_temporal_mvp_flag in the WD text and software.

JCTVC-G526 AHG21: Combined signaling for reference picture set [Y. Park, I.-K. Kim,
C. Kim (Samsung)]
Reviewed in BoG.

JCTVC-G546 On high-level syntax for maximum DPB size and frame latency [Y. Park, K.
P. Choi, C. Kim (Samsung)]
Chaired by J. Boyce.
JCTVC-E339 proposed to move max_dec_frame_buffering and num_reorder_frames from the optional
VUI to the mandatory SPS. JCTVC-F541 proposed to add max_latency_frames_plus1 or
max_latency_increase_plus1. This contribution proposes to move max_dec_frame_buffering to the SPS and to
add max_latency_frames_plus1 in the SPS, and proposes that num_reorder_frames be left in the VUI without
change.
If an encoder doesn’t send the VUI parameters, capabilities determination by decoder would be
negatively impacted.
Without max latency, output is delayed.
Suggestion to also move num_reorder_frames to SPS.
Decision: Adopt putting three syntax elements in the SPS: max_dec_frame_buffering, num_reorder_frames,
and max_latency_increase. (Also JCTVC-G779.)

JCTVC-G779 Proposed constraint on reordering latency (for further consideration of
JCTVC-F541) [G. J. Sullivan (Microsoft)]
Chaired by J. Boyce.
This contribution repeats the content of JCTVC-F541 to propose to add an SPS-level parameter in HEVC
that expresses a constraint on the maximum amount of reordering that can be applied to any frame in a
coded video sequence. By comparing the latency status of each frame in the DPB to the value of the
maximum latency constraint, a decoder can determine when the maximum latency limit has been reached,
and can immediately output any frame that has reached this limit. It is asserted that this can enable the
decoder to more rapidly identify frames that are ready for output than with the current syntax for a variety
of video encoding structures that includes typical cases. It is also asserted that directly expressing such a
limit on the amount of reordering latency allowed through the encoding-decoding process would be a
useful characteristic to be established for system-level negotiation and characterization purposes.
This proposal essentially just repeated the content of JCTVC-F541 of the previous meeting, as the
disposition of that contribution at the previous meeting was to essentially allow time for further study.

Suggestion to express value as POC difference rather than picture count difference.
Covered in JCTVC-G546.
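
The latency comparison described for G779 can be illustrated with the following toy model. It is a sketch under assumptions, not the proposed normative text: the class and method names are hypothetical, latency is counted as the number of frames decoded after a given frame, and, to preserve output order, waiting frames with smaller POC are output ahead of a frame that has reached the limit.

```python
class LatencyBoundedOutput:
    """Toy output queue driven by a maximum reordering latency."""

    def __init__(self, max_latency):
        self.max_latency = max_latency
        self.waiting = {}  # POC -> number of frames decoded after that frame

    def decode(self, poc):
        """Register a newly decoded frame and return the POCs that are output now."""
        # Every frame still waiting for output has one more frame decoded after it.
        for waiting_poc in self.waiting:
            self.waiting[waiting_poc] += 1
        self.waiting[poc] = 0
        output = []
        # While some frame has reached the limit, output frames in increasing POC order.
        while any(count >= self.max_latency for count in self.waiting.values()):
            first = min(self.waiting)
            del self.waiting[first]
            output.append(first)
        return output
```

With a maximum latency of 2 and a decoding order of POC 0, 2, 1, 4, 3, frame 0 is output as soon as frame 1 has been decoded, and frames 1 and 2 are output when frame 4 has been decoded, without waiting for any buffer-fullness condition.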

JCTVC-G548 AHG21: Construction and modification of predefined reference picture sets
and reference picture lists [V. Wahadaniah, C. S. Lim, S. M. T. Naing
(Panasonic)]
Reviewed in BoG.

JCTVC-G635 Coding with a unified reference picture list [M. Naccari (BBC), G. Van
Wallendael (Ghent University), M. Mrak, D. Flynn (BBC)]
The unified reference picture list (LU - List Unified) was presented in contribution JCTVC-F549 with the
aim of providing a simpler and more flexible structure to map the reference picture used during the inter
coding process in the HM codec. The main idea behind the LU is to simplify mapping of reference
pictures by using only a single reference list whereby reference frame pairs are stored. A reference pair
consists of two reference frames (in the case of bi-directional prediction) or one reference frame and a
null element (in the case of uni-directional prediction). It was asserted that the usage of the LU reduces
the bitstream parsing and enables adding/removing some combinations of references in a more flexible
fashion than the current HM design using two reference lists. In this context, this contribution addresses
the reference list indexes usage in the current HM 4.0 codec and describes an implementation of the LU
scheme based on the default HM reference settings. It is reported that the experimental results obtained
for this implementation show that the LU scheme can handle usual HM conditions, while providing a
space for more flexible selection of reference frames.
It was asked what the impact of the scheme on coding efficiency is. Some loss in compression was
reported. It was suggested that this was due to the fact that the different method of coding of the reference
picture indexes was not included in the R-D decision-making process. (At the previous meeting, some
gain had been shown when using a similar but slightly different scheme, in a usage that included the
scheme within the R-D optimization.)
It was remarked that this may require a difficult coupling of the encoder’s decision-making process of
joint selection of the two reference pictures to use for references. For example, if there are 10 pictures in
each list, then there would be 120 entries needed in the combined list. Initializing, reordering, and
managing such a large list might get difficult. The overhead for reference picture list reordering might be
large.
It was remarked that the only clear benefit would seem to be simplification of the parsing of the reference
indexes at the PU level. Further study would be needed to identify and clarify whether a significant
benefit can be shown for this concept.
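
The figure of 120 entries mentioned in the G635 discussion is consistent with counting every bi-predictive pair plus one uni-predictive entry per picture in each list. The sketch below enumerates such a list; the function name and the tuple representation (with None standing for the null element) are assumptions made for illustration, and the way the count was derived in the meeting is inferred rather than stated in the notes.

```python
from itertools import product


def unified_list(list0, list1):
    """Enumerate LU-style entries for two reference lists.

    Bi-predictive entries are (ref0, ref1) pairs; uni-predictive entries pair
    one reference picture with a null element, represented here by None.
    """
    bi_entries = list(product(list0, list1))
    uni_entries = [(ref, None) for ref in list0] + [(None, ref) for ref in list1]
    return bi_entries + uni_entries


# Ten pictures in each list: 10 * 10 pairs plus 10 + 10 uni-predictive entries.
entries = unified_list(range(10), range(10, 20))
```

The list length grows with the product of the two list sizes, which is the source of the concern about initializing and reordering a large combined list.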

JCTVC-G991 Cross-check of contribution JCTVC-G635: Coding with a unified reference
picture list [J. Jung (Orange Labs)] [late]

JCTVC-G549 Syntax rearrangement for list combination [Y. Park, S. Jeong, C. Kim
(Samsung)]
A list combination (LC) scheme was proposed for uni-prediction at B-slices to improve coding efficiency.
The current syntax bit ref_pic_list_combination_flag seems a bit redundant.
Proposes to put a default combined list length in the PPS and modify the slice-level syntax and remove
that bit.
There were some differences in the proposed design aspects in the newer version of the proposal.

Another way to deal with that bit was suggested – further study was suggested.

JCTVC-G717 Improvements on reference picture buffering and list construction [Y. Yu,
K. Panusopone, X. Fang, L. Wang (Motorola Mobility)]
This contribution proposes changes of reference picture construction of combined list and an explicit way
for signalling collocated pictures according to the value of delta POC. The proposed scheme was reported
to be more efficient to build the combined list and signal the collocated picture.
The proposal assumes the RPS (G021) style of buffer control.
It is proposed to allow any picture within list 0 or list 1 (or perhaps within the RPS list) to be specified to
be the “collocated picture” (using an index syntax element in the RPS syntax or in the slice header).
Currently the collocated picture is always the first picture in list 0 or the first picture in list 1.
Some tests were done to see that the proposed technique works; however, there was no compression
benefit shown overall. Further study would be needed to determine whether there may be a significant
benefit for this concept.
Also proposed was a change of the default order of combined reference picture lists, based on pair-wise
minimization of POC distance. Test results were not provided, so the work seemed somewhat
preliminary, and further study would be needed to determine whether it has value.

JCTVC-G637 AHG21: Long-term pictures and pruning of reference picture sets [Rickard
Sjöberg, Jonatan Samuelsson (Ericsson)]
Reviewed in BoG.

JCTVC-G643 AHG21: On reference picture list construction and reference picture
marking [M. M. Hannuksela, S. M. Gopalakrishna (Nokia)]
Reviewed in BoG.

JCTVC-G668 POC type 1 [M. M. Hannuksela (Nokia)]
This contribution proposes POC type 1 as specified in H.264/AVC, which can be summarized in a
simplified form to use frame_num in the slice header to index a POC pattern provided in the sequence
parameter set. In order to support POC type 1, it is proposed to include the frame_num syntax element
into the slice header and the gaps_in_frame_num_value_allowed syntax element into the sequence
parameter set when the use of POC type 1 is indicated in the sequence parameter set.
It is asserted that POC type 1 offers the opportunity to encode POC values of regular GOP structures
more efficiently compared to POC type 0. Moreover, it is asserted that any POC type coding may be used
with the reference picture lists and marking proposals in JCTVC-F493 and its later revisions and JCTVC-
G643.
The decoder reference picture marking in AVC and its “non-existing pictures” are not part of this
proposal.
It was remarked that POC type 1 requires frame_num, which is a syntax element that is not needed
anymore with POC type 0 as proposed in JCTVC-F493, and that the need for this data reduces the
efficiency benefit of using POC type 1.
The frame_num would not be present when POC type 0 is used.
It was suggested that it may be desirable to test how many bits this would actually save. Such further
study is encouraged, although some skepticism was expressed regarding whether having this POC type
supported is desirable.
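
The idea of using frame_num to index a POC pattern carried in the sequence parameter set can be sketched as follows. This is a simplified illustration modelled on the H.264/AVC POC type 1 derivation for reference frames only: it assumes the frame counter has already been unwrapped, and it omits offset_for_non_ref_pic and the per-slice delta_pic_order_cnt values. The function name is hypothetical.

```python
def poc_type1_expected(abs_frame_num, offsets_for_ref_frame):
    """Expected POC of a reference frame under a cyclic POC pattern.

    abs_frame_num: unwrapped frame counter; 0 for the IDR picture, 1 for the next frame.
    offsets_for_ref_frame: POC increments making up one cycle of the pattern.
    """
    if abs_frame_num <= 0 or not offsets_for_ref_frame:
        return 0
    cycle_length = len(offsets_for_ref_frame)
    cycle_count, index_in_cycle = divmod(abs_frame_num - 1, cycle_length)
    # Whole cycles already completed, plus the offsets consumed in the current cycle.
    expected = cycle_count * sum(offsets_for_ref_frame)
    expected += sum(offsets_for_ref_frame[: index_in_cycle + 1])
    return expected
```

With a two-entry pattern of offsets 4 and 2, successive frames get expected POC values 0, 4, 6, 10, 12, so no POC bits need to be sent per slice as long as the pattern holds; the cost is that frame_num has to be present, as remarked in the discussion.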

JCTVC-G713 AHG21: Long term picture referencing using wrapped POC [K. Misra, S.
Deshpande, A. Segall (Sharp)]
Reviewed in BoG.

JCTVC-G714 AHG21: On picture referencing [K. Misra, S. Deshpande, A. Segall (Sharp)]
Reviewed in BoG.

JCTVC-G788 AHG21: Proposal on Decoded Picture Buffer Description Syntax Relating to
JCTVC-F493 [G. J. Sullivan, Y. Wu, J. Xu (Microsoft), B. Li (USTC)]
Reviewed in BoG.

JCTVC-G1036 Common conditions for reference picture marking and list construction
proposals [Y.-K. Wang, M. M. Hannuksela, T.K. Tan (editors)] [late] [miss]
qq

5.9.9 HRD issues

JCTVC-G188 Enhancement on operation of coded picture buffer [K. Kazui, S. Shimada, J.
Koyama, A. Nakagawa (Fujitsu)]
This contribution proposes to enhance the operation of the coded picture buffer (CPB), following up on
prior proposals C021, D053.
The specification of CPB operation in WD4, inherited provisionally from the AVC specification, only
defines picture-based timing of bitstream operation, where the decoder does not process the picture until
it has completely arrived.
The proposed operation is a sub-picture based bitstream removal from CPB in order to realize
interoperable very low delay coding – less than 1 picture period.
The degree of realism of achieving a delay of less than 1 picture period was discussed, along with some
discussion of typical and achievable delays in videotelephony and local display applications.
It was remarked that the instantaneous decoding time assumption of the current HRD is a clearly
unrealistic assumption.
The proponent suggested, as an example, an HRD model in which the HRD decoder would operate in
units of LCU rows instead of whole pictures.
It was remarked that CABAC is not a “low-latency” entropy coder. Slice-level termination of CABAC
was proposed as a way to define the HRD (using the end_of_slice_flag position). For HRD purposes, the
cabac_zero_word and RBSP trailing bits might also be counted.
Specific syntax and text specification was proposed.
The concept was of interest to a number of participants. The idea seems coupled with the concept of
having a decoder that does not operate instantaneously, and the desire is to avoid phenomena such as the
case where nearly all of the bits for a picture are for the last LCU in the picture.
At this point the group did not seem to have studied the subject well enough to have a clear consensus on
the value of the concept or the specifics of the proposed text. Further study was encouraged.



5.10 Quantization

5.10.1 QP prediction / delta QP coding

JCTVC-G213 Non-CE4: Performance comparison between temporal QP prediction and
temporal MV prediction [H. Aoki, K. Chono (NEC)]
TBA
Temporal MV prediction does not really seem analogous to QP prediction. Temporal QP prediction
seems to have loss resilience issues.

JCTVC-G936 Cross verification of NEC's proposal JCTVC-G213 on TMVP [Y. Lin, J.
Zheng (HiSilicon)] [late]

JCTVC-G353 Non-CE4: QP prediction using spatial QP [H. Nakamura, M. Nishitani, S.
Fukushima, T. Kumakura (JVC Kenwood)]
In HM4.0, the predictive QP used for delta QP signaling is always the QP of the left neighboring
quantization group. In this contribution, it is proposed to calculate the predictive QP based on the QP of previously
coded neighboring quantization groups. It is reported that 0.3%, 0.2%, 0.3%, 0.3%, 0.4%, and 0.5%
luminance BD-rate reductions were obtained by the proposed method for AI HE, AI LC, RA HE, RA LC,
LD HE, and LD LC configurations, respectively, when tested with the QP setting scheme used in CE4.
Relates to CE4 1.3.c (G067). For AI, 1.3.c did a little better. Regarding complexity, this proposal seems
somewhat simpler.

JCTVC-G772 Cross-check result of JCTVC-G353 [K Sato (Sony)] [late]

JCTVC-G357 Non-CE4: Improvement of Intra prediction based QP prediction [J. Xu, A.
Tabatabai, K. Sato (Sony)]
Proposed improvement of 1.3.c (G067) with special treatment of 45 degree case, benefit relative to HM in
CE4 test conditions, obtained 0.5% for AI HE (G067 had about 0.4%).

JCTVC-G739 Non-CE4: Cross-verification report for improvement of intra prediction
based QP prediction (JCTVC-G357) [H. Aoki, K. Chono (NEC)] [late]

JCTVC-G362 Non-CE4: Efficient binarization of delta_QP in CABAC with signaling of
Max and Min QP [J. Xu, K. Kondo, K. Sato, A. Tabatabai (Sony)]
Related to 1.2.a (min and max QP signalling, G462), 0.1% better than G462 as tested.

JCTVC-G956 Non-CE4: Cross-verification of Sony's proposal JCTVC-G362 by Nokia [J.
Kang, A. Hallapuro, J. Lainema (Nokia)] [late]



JCTVC-G850 Non CE4: Fine granularity QP offset [X. Li, X. Guo, S. Lei (MediaTek), X.
Wang, R. Joshi, G. Auwera, M. Karczewicz (Qualcomm)]
Relates to CE4 Subtest 2, discussed together with those proposals.

JCTVC-G1040 non-CE4: Crosscheck of updated fine granularity QP offset (JCTVC-G850)
by Intel [Y. Chiu, W. Zhang (Intel)] [late] [miss]

JCTVC-G979 Consideration on the Hybrid Structure of Channel, Scene, and Object based
3D Audio Systems [Jeongil Seo, Kyeongok Kang] [late]

JCTVC-G451 Qp offset for intra block [K. Sugimoto, A. Minezawa, S. Sekiguchi
(Mitsubishi)]
This contribution proposes to add a syntax element signaling a QP offset value for intra blocks at the slice header.
Syntax and semantics of the proposed offset are provided. Two particular examples were described:
• Vertical intra column refresh for very-low-delay operation
• Generally using smaller QP with intra for visual quality optimization
The offset value was proposed to be added to the slice header.
It was remarked that it might be desirable to have a flag at the SPS level to indicate the presence or
absence of the offset at the slice header level.
It was noted that it is necessary to understand how this affects the prediction of subsequent CU QPs. This
does not seem to have been accounted for in the contribution. Further study was encouraged.

JCTVC-G509 Support of ChromaQPOffset in HEVC [S. Liu (MediaTek), K. Sato (Sony)]
In AVC there are a chroma QP offset and a second chroma QP offset, sent at the PPS level with a range of
+/−12.
It is proposed to use the same scheme as in AVC.
In principle, this seems to fall into the category of something that we already have intended to be in our
design, but may not have integrated into the text yet. (For monochrome, the syntax elements should not
be sent.) This was agreed.
Some test results were shown (some similar results are in G401).

JCTVC-G907 Crosscheck of JCTVC-G509 - Support of ChromaQPOffset in HEVC [E. S.
Ryu (InterDigital)] [late]

JCTVC-G759 Cross-check report for ChromaQPOffset [H. Sasai, T. Nishi (Panasonic)]
[late]



JCTVC-G550 TU_size based dQP [C. Kim, Y. Park, K.P. Choi, J.H. Park (Samsung)]
Suggests establishing QP offsets that apply depending on TU size. For example, a large block size
might use a smaller QP to get good fidelity in smooth or well-predicted regions. Visual quality was not
tested. No change of PSNR was observed. Lambda was not modified when QP was not modified.
Related to G721 and G893 and G451. Seems interesting. For further study. Plan to establish AHG and/or
CE.

JCTVC-G925 Cross-check of JCTVC-G550 on TU_size based dQP [V. Seregin
(Qualcomm)] [late]

JCTVC-G893 Unified QP signaling [X. Fang (Motorola)] [late]
Suggests ability to change QP at TU branch or leaf level as well as at CU branch or leaf level. Relates to
CE4 subtest 1.1.a G721, but with some modifications to constrain the usage. For further study.

JCTVC-G1028 Non-CE4: Rate control friendly spatial QP prediction [H. Aoki, K. Chono
(NEC), M. Kobayashi, M. Shima (Canon), M. Coban, M. Karczewicz
(Qualcomm), K. Sato (Sony)] [late]
When compared against prediction from the preceding QP in scan order, this scheme would achieve more
gain than when compared against HM 4. Further study in a CE.

JCTVC-G1029 Non-CE4: Crosscheck for JCTVC-G1028 by JVC KENWOOD [H.
Nakamura (JVC Kenwood)] [late]

5.10.2 Quantization matrices

JCTVC-G094 Non-CE4: Carriage of large block size quantization matrices with up-
sampling [M. Zhou, V. Sze (TI)]
Upsampling of quant matrices (somewhat discussed elsewhere). Detailed presentation did not seem
necessary at this time.

JCTVC-G506 Non-CE4: Crosscheck report of carriage of large block size quantization
matrices with up-sampling proposed by TI (JCTVC-G094) [M. Shima
(Canon)]

JCTVC-G152 Method and syntax for quantization matrices representation [X. Zhang, S.
Liu, S. Lei (MediaTek)]
Detailed presentation did not seem necessary at this time.

JCTVC-G529 Non-CE4: Cross-Check for MediaTek quantization matrices representation
(JCTVC-G152) [J. Zheng (HiSilicon)] [late]



JCTVC-G711 Non-CE4: Cross-verification of MediaTek's proposal JCTVC-G152 on
method and syntax for quantization matrices representation [M. Zhou (TI)]
[late]

JCTVC-G295 Non-CE4 Subtest3 : Extension of Adaptation Parameter Sets syntax for
Quantization matrix [J. Tanaka, Y. Morigami, T. Suzuki (Sony)]
Detailed presentation did not seem necessary at this time.

JCTVC-G352 Parameterization of Default Quantization Matrices [E. Maani, M. Haque, A.
Tabatabai (Sony)]
Default quantization matrices have to be stored in the decoder for the case where no explicit visual
scaling matrices are transmitted in the bit stream. In the AVC case, these matrices were explicitly defined
and stored in the decoder, since the transform sizes included only 8x8 and 4x4 transforms. However, in HEVC,
it was suggested that with additional 16x16, 32x32, and non-square transforms the memory size to store
these matrices may be excessive. This document presents a method to parameterize the default matrices
such that they can be efficiently stored in the decoder memory. When these matrices are needed they can
be derived using simple operations. Using this technique can reportedly save up to 8 KB of ROM
compared to explicit representation of the matrices (reducing the storage of values from 8160 elements of
8 bits each to 96 elements of 15 bits each).
The four suggested parameters can be stored as fixed values or sent in syntax.
One participant remarked that using a sub-sampled matrix might be preferable if it would avoid the need
to use the full matrix while using it (regardless of how the matrix is generated or stored). Recomputing
the matrix on the fly does not seem appropriate.
For further study.
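
The reported ROM saving can be checked with simple arithmetic from the element counts and bit widths quoted above. The helper name below is hypothetical; only the four numbers come from the contribution summary.

```python
def storage_bytes(num_elements, bits_per_element):
    """Storage needed for a table of fixed-width elements, in bytes."""
    return num_elements * bits_per_element / 8


explicit_bytes = storage_bytes(8160, 8)      # explicit matrices
parameter_bytes = storage_bytes(96, 15)      # parameterized representation
saving_bytes = explicit_bytes - parameter_bytes
saving_kib = saving_bytes / 1024
```

This gives 8160 bytes against 180 bytes, i.e. a saving of 7980 bytes (a little under 8 KB), which is consistent with the "up to 8 KB" figure.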

JCTVC-G530 Non-CE4: Layered quantization matrices compression [Y. Wang
(Tsinghua), J. Zheng, X. Zheng (HiSilicon), Y. He (Tsinghua)]
Detailed presentation did not seem necessary at this time.

JCTVC-G730 Non-CE4: Cross check of layered quantization matrices compression in
JCTVC-G530 [T. Suzuki (Sony)] [late]

JCTVC-G992 Cross-check of JCTVC-G530 – Layered quantization matrices compression
[X. Zhang, S. Liu (MediaTek)] [late]

JCTVC-G578 Non-CE4: Quantization matrix compression and signaling [R. Joshi, J. Sole,
M. Karczewicz (Qualcomm)]
Detailed presentation did not seem necessary at this time.

JCTVC-G826 Non-CE4: Cross check of quantization matrix coding in JCTVC-G578 [T.
Suzuki (Sony)] [late]



JCTVC-G658 Quantization matrices in fragmented APS [Y. Chen, Y.-K. Wang, R. Joshi,
M. Karczewicz (Qualcomm)]
Trying to minimize the quantity of data in the APS.

JCTVC-G880 HVS Model based Default Quantization Matrices [M. Haque, A. Tabatabai,
Y. Morigami (Sony)] [late]
This document presents a set of default Quantization Matrices designed by using a Human Visual System
(HVS) Model for HEVC. A list of these matrices is provided in the appendix. The contributor reported
that the proposed matrices provided some subjective quality benefit (in informal subjective testing).
Decision: Adopt these as the starting point default values for 16x16 and 32x32 and use AVC defaults for
the other cases.

JCTVC-G1026 Using Multiple APSs for Quantization Matrix Parameters Signaling [Ming
Li, Ping Wu (ZTE), Junichi Tanaka, Yoshitaka Morigami, Teruhiko Suzuki,
Kazushi Sato (Sony)] [late]
This late contribution was provided as a step toward dealing with the issue of conditional update of APSs.
It was agreed that this should be studied as part of future AHG activity.

5.10.3 Dequantization

JCTVC-G154 De-quantization without Rounding Offset [X. Zhang, S. Liu, S. Lei
(MediaTek)]
This contribution proposes to remove the round offset from the dequantization process. By doing this, one
addition operation can be saved for each coefficient and in many cases, one shift operation can be saved
for each coefficient as well. Experimental results report identical BD-rates under common test condition.
Very small impacts on bit rate and PSNR (up to 0.x% bit rate increase and/or up to 0.04 dB PSNR
degradation at >50dB PSNR quality) are reported for some very high bit rate (small QP) cases.
The reconstruction rule in the current software is (piQCoef[n] * scale + iAdd) >> iShift, with
iAdd = 1 << (iShift−1).
If the scale is a power of 2, adding iAdd or not adding it are mathematically the same.
In the common conditions, there is only one very small QP.
Analyzing various very small QP values, the worst identified case (qp=9) had somewhat more than 0.7%
loss.
It was asked whether QP=1 was tested. This wasn’t known.
It was noted that on some implementation architectures, the add comes for free.
For further study.
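
The effect of dropping the rounding offset can be explored with a small sketch of the reconstruction rule quoted above. The function names and the tested level range are assumptions made for illustration; the scale and shift values are arbitrary examples, not values taken from the HM software.

```python
def dequant(level, scale, shift, use_offset=True):
    """(level * scale + iAdd) >> iShift, with iAdd = 1 << (iShift - 1) or 0."""
    add = (1 << (shift - 1)) if use_offset else 0
    return (level * scale + add) >> shift


def max_abs_diff(scale, shift, levels=range(-64, 65)):
    """Largest reconstruction difference between the two rules over some levels."""
    return max(abs(dequant(lv, scale, shift, True) - dequant(lv, scale, shift, False))
               for lv in levels)
```

In this sketch the two rules agree exactly when the scale is a multiple of 1 << shift (for example scale 64 with shift 6), since the product then has no fractional bits to round. For other combinations the results differ by at most one reconstruction step.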

JCTVC-G816 Cross verification of JCTVC-G154, De-quantization without Rounding
Offset [R. Cohen (MERL)]



JCTVC-G555 One-addition dequantization [N. Shlyakhov, I.-K. Kim, J. H. Park
(Samsung)]
A generic set of integer multipliers {5, 6, 7, 8, 9} is proposed for dequantization. Multiplication by each
value of this set may be performed with no more than 2 shifts and 1 addition or subtraction. This set of
multipliers increases bit-depth of dequantized transform coefficient by no more than 4 bits. It is reported
that current QP control schemes of common test conditions are compliant with the proposed quantization
modification.
Closely related to G154.
Proposes to change the period of 6 in the QP increments to be a period of 5 and reduce the range of QP
values.
Proposed to change the chroma QP offset.
Eliminates the multiply in the dequantization (but requires customized shifts).
It was remarked that this doesn’t seem to work with quantization weighting matrices.
It was remarked that in software this would not have a benefit.
It was also remarked that, in general, there are other areas where there is more opportunity for meaningful
complexity reduction.
For further study.
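
The claim that each multiplier in {5, 6, 7, 8, 9} needs no more than two shifts and one addition or subtraction can be illustrated as below. The decompositions shown are one possible choice and are not necessarily those used in the contribution; the function name is hypothetical.

```python
def mul_shift_add(x, m):
    """Multiply x by m in {5, 6, 7, 8, 9} using at most 2 shifts and 1 add/subtract."""
    if m == 5:
        return (x << 2) + x          # 4x + x
    if m == 6:
        return (x << 2) + (x << 1)   # 4x + 2x
    if m == 7:
        return (x << 3) - x          # 8x - x
    if m == 8:
        return x << 3                # 8x
    if m == 9:
        return (x << 3) + x          # 8x + x
    raise ValueError("multiplier not in the proposed set")
```

Since the largest multiplier is 9, which is below 16, the product needs at most 4 more bits than the input, in line with the bit-depth statement above.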

JCTVC-G720 Dequantization with symmetric reconstruction points [L. Kerofsky, A.
Segall, K. Misra (Sharp)]
This contribution notes that the dequantization formula currently used in HM 4 has the property that the
reconstructed values are not always symmetrically distributed about zero. A modification to the
dequantization formula is proposed which creates reconstructed values symmetrically distributed about
zero under all conditions. The modified dequantization process operates on the absolute value of levels
and then applies the sign rather than operating on signed levels. For quantization parameters of the
common test conditions, the dequantization results are unchanged for any level or block size.
The dequantization formula presently used in HM is based on JCTVC-E243. This formula is
summarized below:

Definitions
B = source bit width (8 or 10 bit in the experiments described below)
DB = B-8 (internal bit-depth increase with 8-bit input)
N = transform size
M = log2(N)
Q = f(QP%6), where f(x) = {26214,23302,20560,18396,16384,14564}, x=0,…,5
IQ = g(QP%6), where g(x) = {40,45,51,57,64,72}, x=0,…,5

coeffQ = ((level*IQ << (QP/6)) + offset)>>(M-1+DB), offset = 1<<(M-2+DB)

It is asserted that the worst case dequant has a 17 bit signed multiply range.



It was remarked that this may imply that the quantized level values also have a 17 bit range, which seems
troubling. The symmetric rounding approach does not solve that issue.
We may want a clipping at the input of the dequant operation (see discussion in G719).
We would seem to want such a scheme if we have adaptive offset addition.
It was noted that imposing symmetry at the input of the first stage of the inverse transform does not mean
that we will have symmetric behaviour at the output of the final inverse transform.
Proposes to instead use
coeffQ = Sgn(level)*(((|level|*IQ << (QP/6)) + offset)>>(M-1+DB)), offset = 1<<(M-2+DB)
Our understanding is that symmetric reconstruction is already what is proposed in the adaptive offset
schemes.
It doesn’t seem clear that we would need this change if we do not use an adaptive offset scheme.
For further study.
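
The asymmetry can be reproduced with a short sketch of the two formulas, using the g(x) table and the shift and offset definitions given above. The function names are hypothetical, the right shift is assumed to be an arithmetic (floor) shift, and the QP values 22, 27, 32 and 37 mentioned in the note after the code are assumed to be the common test condition QPs.

```python
IQ_TABLE = [40, 45, 51, 57, 64, 72]  # g(QP % 6) from the definitions above


def dequant_hm(level, qp, log2_size, bit_depth=8):
    """Signed-level formula: ((level*IQ << (QP/6)) + offset) >> (M-1+DB)."""
    db = bit_depth - 8
    shift = log2_size - 1 + db
    offset = 1 << (log2_size - 2 + db)
    return (((level * IQ_TABLE[qp % 6]) << (qp // 6)) + offset) >> shift


def dequant_symmetric(level, qp, log2_size, bit_depth=8):
    """Proposed form: apply the formula to |level|, then restore the sign."""
    sign = -1 if level < 0 else 1
    return sign * dequant_hm(abs(level), qp, log2_size, bit_depth)
```

For a 4x4 block at QP 1 with 8-bit video, levels +1 and -1 reconstruct to 23 and -22 with the signed formula, but to 23 and -23 with the symmetric one. At QPs 22, 27, 32 and 37 the scaled value is a multiple of the divisor for all block sizes from 4x4 to 32x32 in this sketch, so both forms give the same result.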

JCTVC-G723 Support for finer QP granularity in HEVC [K. Panusopone, A. Luthra, Xue
Fang, J. H. Kim, L. Wang (Motorola Mobility)]
Closely related to G721, does not need further detailed review.

5.11 Alternative coding modes

5.11.1 Lossless coding and I_PCM

JCTVC-G093 AHG22: Sample-based angular prediction (SAP) for HEVC lossless coding
[M. Zhou (TI)]
Efficient lossless coding is required for real-world applications such as automotive vision, video
conferencing and long-distance education. This contribution proposes to use sample-based angular intra
prediction (SAP) in lossless coding mode for better coding efficiency. The proposed sample-based
prediction is exactly the same as the HM4.0 block-based angular prediction in terms of prediction angles and
sample interpolation, requires no syntax or semantics changes, but differs in decoding process in terms of
reference sample selection. In the proposed method a sample to be predicted uses its direct neighboring
samples for better intra prediction accuracy. Compared to the HM4.0 anchor lossless method which
bypasses transform, quantization, de-blocking filter, SAO and ALF, the proposed method provides an
average gain of 6.71% in AI-HE, 7.83% in AI-LC, 2.29% in RA-HE, 2.64% in RA-LC, 1.57% in LB-HE
and 1.85% in LB-LC. For class F sequences only, the average gain is 8.95% in AI-HE, 8.88% in AI-LC,
5.64% in RA-HE, 5.61% in RA-LC, 4.58% in LB-HE and 4.61% in LB-LC. SAP is fully parallelized on
the encoder side, and can be executed at a speed of one row or one column per cycle on the decoder side.
Presentation not uploaded
The anchor is a modified HM with transform bypass, quant. bypass as described above, entropy coding is
used as is, i.e. the contexts are becoming spatial. Directional prediction “as is” is included here for intra
coding.
Goal is to use HM tool “as is” without major re-design.
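
The difference in reference sample selection can be illustrated for the simplest case, pure vertical prediction, where no interpolation is involved. This is only a sketch of the general idea under that assumption, with hypothetical function names; it does not reproduce the angular modes or the proposed decoding process.

```python
def residual_block_based(block, top_ref):
    """Every row is predicted from the reference row above the block."""
    return [[s - t for s, t in zip(row, top_ref)] for row in block]


def residual_sample_based(block, top_ref):
    """Each sample is predicted from the sample directly above it.

    In lossless coding the reconstructed samples equal the originals,
    so the row above is always available to the decoder.
    """
    residual, above = [], top_ref
    for row in block:
        residual.append([s - a for s, a in zip(row, above)])
        above = row
    return residual


def reconstruct_sample_based(residual, top_ref):
    """Decoder side: rebuild the block row by row from the residual."""
    block, above = [], top_ref
    for res_row in residual:
        row = [d + a for d, a in zip(res_row, above)]
        block.append(row)
        above = row
    return block
```

For content with a vertical gradient the sample-based residual stays small in every row, whereas the block-based residual grows with the distance from the reference row.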

JCTVC-G115 AHG22: Cross-verification of TI’s Sample-based angular prediction (SAP)
for HEVC lossless coding (JCTVC-G93) [K. Chono, H. Aoki (NEC)]



JCTVC-G118 Potential enhancement of signaling of I_PCM blocks [K. Chono, H. Aoki
(NEC)]
This contribution presents the concept-level idea of a method for transmitting successive I_PCM blocks.
The method counts the number of successive I_PCM blocks in the Z-scan order in the same layer of the
CTB and signals that number minus 1 at the PU header of the first I_PCM block. PCM sample data of the
subsequent I_PCM blocks immediately follow that of the first I_PCM block, that is, signaling of the PU
headers of the subsequent I_PCM blocks is skipped. Thereby the method enhances the throughput of
transmitting PCM sample data of successive I_PCM blocks while reducing some bits of side-information.
Currently, an IPCM block consists of variable-length header data and fixed-length PCM payload.
Suggestion to bundle several headers and several payloads together, to increase the length of concatenated
fixed-length data. Maximum number of concatenated IPCM blocks is 4.
Main advantage is for more deterministic buffer behaviour.
Better understanding of application needed – further study (AHG – lossless coding? Or other?)
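
The run signaling idea can be sketched as follows. The sketch assumes a flat list of blocks already in Z-scan order and ignores the constraint that bundled blocks belong to the same layer of the CTB; the function name and the output format are hypothetical.

```python
MAX_RUN = 4  # maximum number of concatenated I_PCM blocks, as stated above


def pcm_run_headers(is_pcm):
    """Return (index of first block, run length minus 1) for each I_PCM run.

    is_pcm lists, in Z-scan order, whether each block is an I_PCM block.
    Only the first block of a run carries a header; the rest are implied.
    """
    headers = []
    i = 0
    while i < len(is_pcm):
        if not is_pcm[i]:
            i += 1
            continue
        run = 1
        while run < MAX_RUN and i + run < len(is_pcm) and is_pcm[i + run]:
            run += 1
        headers.append((i, run - 1))
        i += run
    return headers
```

A run longer than four blocks is split, so five consecutive I_PCM blocks produce one header with the value 3 followed by one with the value 0.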

JCTVC-G268 AHG22: Lossless Transforms for Lossless Coding [W. Dai, M. Krishnan, J.
Topiwala, P.Topiwala (FastVDO)]
This submission is based on proposal G266, “Lossless Core Transforms for HEVC,” and focuses on the
lossless transforms themselves. This work is submitted in reference to AHG22 on Lossless Coding. A
coding framework has been created in that AHG for development purposes, which at this time includes
prediction, and entropy coding, but bypasses transform and quantization. It is the purpose of this proposal
to supply lossless transforms, which can assist in data decorrelation, and potentially improve the
performance of the lossless coding framework. Such experiments will be conducted and reported in the
next meeting cycle.
Explains how to construct lossless transforms by lifting.
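
As background on lifting, the classic two-point integer transform below (often called the S-transform) shows why a lifting structure is exactly invertible despite the rounding in each step. It is a textbook example chosen for illustration and is not the transform proposed in the contribution.

```python
def forward_s(a, b):
    """Two-point lifting transform: returns (rounded-down mean, difference)."""
    d = a - b            # predict step
    s = b + (d >> 1)     # update step, equals floor((a + b) / 2)
    return s, d


def inverse_s(s, d):
    """Undo the lifting steps in reverse order, recovering (a, b) exactly."""
    b = s - (d >> 1)
    a = d + b
    return a, b
```

Each step adds a rounded function of one signal to the other, so the inverse simply subtracts the same rounded quantity, and no information is lost.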

JCTVC-G664 AHG22: A lossless coding solution for HEVC [W. Gao, M. Jiang, Y. He, J.
Song, H. Yu (Huawei Technologies)]
This contribution proposes a lossless coding solution that only involves a few modifications to the current
HEVC WD. To achieve lossless coding in both intra and inter coding operations, the transform and
quantization modules are bypassed. Due to the nature of lossless coding, the existing intra predictions are
extended to pixel based prediction (DPCM), taking into account that there is no transform applied to
prediction residuals. Furthermore, since the statistical properties of prediction residuals are quite different
from those of transform coefficients, the CABAC coding of intra prediction residuals is also modified
accordingly.
No additional flag is introduced in this proposal; the lossless coding mode is signaled by QP_Y = 0. As a
result, no change to the HEVC syntax specification is needed, and the lossless mode can be applied to the
entire picture or to individual CUs conveniently.
Combination of CABAC and Golomb-Rice is used.
Anchor is HM with QP=0 which is still lossy. Compared to that, the proposed method saves roughly 9.5%
for AI and roughly 7% for LD B.
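
The notes do not describe how CABAC and Golomb-Rice coding are combined in the proposal. As general background only, a plain Golomb-Rice code for signed residuals can be sketched as below; the sign mapping, the bit-string representation and all function names are assumptions made for illustration.

```python
def map_signed(v):
    """Map a signed residual to a non-negative integer (0, -1, 1, -2, ... -> 0, 1, 2, 3, ...)."""
    return 2 * v if v >= 0 else -2 * v - 1


def unmap_signed(u):
    """Inverse of map_signed."""
    return u // 2 if u % 2 == 0 else -((u + 1) // 2)


def rice_encode(u, k):
    """Unary quotient ('1' * q followed by '0') plus a k-bit binary remainder."""
    bits = "1" * (u >> k) + "0"
    if k:
        bits += format(u & ((1 << k) - 1), "0{}b".format(k))
    return bits


def rice_decode(bits, k):
    """Decode a single codeword produced by rice_encode."""
    q = bits.index("0")
    r = int(bits[q + 1:q + 1 + k], 2) if k else 0
    return (q << k) | r
```

Small residual magnitudes, which dominate after sample-based prediction, receive short codewords, and the parameter k trades unary length against remainder length.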

JCTVC-G885 Cross-check of JCTVC-G664 [B. Li (USTC), J. Xu (Microsoft)] [late]

Continue study in AHG.



It is unclear whether lossless coding would be supported in a profile/level. If it can be done “for free” it may
be attractive, but this may be in contradiction with bitrate limitations that are usually defined in levels.

5.11.2 Variable resolution coding

JCTVC-G264 AHG18: Adaptive Resolution Coding (ARC) [T. Davies (Cisco), P. Topiwala
(FastVDO)]
This contribution reports the results of further investigations into resolution adaption. Adaptive
Resolution Coding (ARC) is described, in which resolution is selected dynamically by means of a
threshold test, and the compression performance investigated in a rapid bit-rate reduction/high QP
scenario against simply increasing QP with and without pre-filtering. Mean luma BD-Rate gains range
from 6.9-14.3% in Class A and 3.5-11.5% in Class B but performance depends greatly on picture content
and configuration. Individual luma BD-rate gains can range from negligible up to 30%. Typically chroma
has lower objective quality by a fixed penalty of around 0.5dB but this is asserted to have little subjective
impact. The contribution includes modified Working Draft text.
Resolution can be changed in both directions (up or down)
Heuristic criterion based on PSNR
Results for high QP
PSNR is computed at high resolution
Could the same be achieved by pre-filtering? It is said that some investigations on this were performed,
but same performance was never achieved
Upsampling and downsampling filters would need to be normative (for the results, down- and upsampling
filters were aligned) – in contrast to that, in scalable coding only upsampling needs to be defined
An application scenario is transmission with large variation of bandwidth
A worst case would be when resolution changes every frame, which would require a more complex
decoder – would be necessary to restrict frequency of changes
Would the decoder need to keep both resolutions all the time? Can down- and upsampling be performed
on the fly in any case of frame structure?
One expert mentions that upsampling and downsampling filters should be consistent with possible
scalable extension. Put studying relation with scalable coding in mandates of AHG.
From Tue. 29th discussion:
Normative (in-loop) down- and upsampling is only necessary at the switching points (which should be
rare). Therefore, it may not be necessary in the context of ARC to define sophisticated
decimation/interpolation filters.
One expert mentions that reduced DPB memory could also be an advantage, where only lower resolution
references are stored, but prediction would be performed at high resolution.

JCTVC-G597 AHG 18: Cross-check of Adaptive Resolution Coding (ARC) (JCTVC-G264)
[Glenn Van Wallendael, Jan De Cock, Rik Van de Walle (Ghent Univ.),
David Flynn (BBC)] [late]



JCTVC-G329 AHG18: Comments on the Implementations of Resolution Adaption on
HEVC [M. Li, P. Wu (ZTE)]
This document presents the analyses of signaling and potential decoding rules for motion compensated
prediction (MCP) if resolution adaption (RA) is integrated into the current HEVC design. The designs in
the existing H.263 and MPEG-4 Part 2 standards, especially with two modes involving resolution
changes, i.e., reduced resolution update (RRU) and dynamic resolution conversion (DRC), are studied as
the references for identifying the areas where the standard texts and the coding algorithms will require
specific changes to enable the resolution adaption as a video coding feature. Therefore, firstly, some key
check items on MCP are listed, and the changes in the syntax and coding rules brought by RRU and DRC
to H.263 and MPEG-4 Part 2 are analyzed, respectively. Then some heuristic principles are described
based on the common features distilled from the designs of RRU and DRC in order to achieve compact
HEVC syntax design for RA. Finally, following these principles, all the key issues regarding MCP with
RA integration into HEVC are elaborated with some recommendations.
Presented Tuesday 29th afternoon
Contribution meant as information document. It is suggested to change the resolution of the reference
pictures (not of residual pictures as in the RRU of H.263) – similar to G264. No presenter was available
Friday 2235 hours and Monday 28th 2025 hours.

JCTVC-G715 AHG18/21: Absolute signaling for resolution switching [K. Misra, S.
Deshpande, L. Kerofsky, A. Segall (Sharp)] [late]
This document describes a technique which enables signaling of reference picture resolutions to be
maintained in the decoded picture buffer.
Presented Monday 28th evening
Relates to DPB signaling – which resolution of the reference picture should be stored in DPB: Original,
subsampled or both resolutions.
Does not relate to how the resolution switching is managed at the coding layer.

JCTVC-G862 Advanced Resampling Filters For HEVC Applications [W. Dai, M.
Krishnan, P. Topiwala (FastVDO)]
Resampling filters are commonly used in a variety of image processing tasks. They are intimately related
to two- and multi-channel filter banks, as well as to the well known Laplacian Pyramid. They arise in
video coding applications in several instances: (a) spatial scalability, (b) resolution adaptation, and (c)
color sampling, as part of 4:4:4 coding. All three application spaces are currently being explored in the
HEVC project. We propose a single, general design methodology, and specific sampling filter instances,
which can reportedly meet all of these demanding applications simultaneously. This proposal builds on
previous work proposed at the Torino meeting.

Revisit: Conclusions on ARC were not possible Mon 28th evening as the relevant experts were not
present. Continue AHG
• Investigate use of simple filters (e.g. bilinear) for the down- and upsampling switching
• Investigate subjective quality, as frequent switching may be annoying
• Investigate necessity of keeping both resolutions in buffer
• Relationship with alternatives: pre / post processing
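
To give an idea of what "simple filters" could mean here, a one-dimensional 2:1 averaging downsampler and a 1:2 linear-interpolation upsampler are sketched below. Sample phase alignment, border handling and the two-dimensional case are simplified or ignored, and nothing in this sketch is taken from the ARC contribution.

```python
def downsample2(row):
    """2:1 downsampling by rounded averaging of sample pairs (even length assumed)."""
    return [(row[i] + row[i + 1] + 1) >> 1 for i in range(0, len(row), 2)]


def upsample2(row):
    """1:2 upsampling: keep each sample and insert the rounded mean with its right neighbour."""
    out = []
    for i, s in enumerate(row):
        nxt = row[min(i + 1, len(row) - 1)]   # repeat the last sample at the border
        out.append(s)
        out.append((s + nxt + 1) >> 1)
    return out
```

Applying both in sequence to [10, 20, 30, 50] gives [15, 40] and then [15, 28, 40, 40], which shows the loss of detail that the subjective investigation would need to assess.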



Discussion about the BD gains reported:
• For some sequences, the effective resolution is lower than pixel density, such that gains by
encoding at lower resolution are found even at high rates
• Comparing PSNR values at different resolutions is problematic
?

5.11.3 Transform skip

JCTVC-G575 Transform skip mode [M. Mrak, A. Gabriellini, D. Flynn (BBC)]
This contribution is related to the work of Ad Hoc Group 19: Transform Skipping. Starting from the initial
Transform Skip Mode (TSM) proposal (JCTVC-F077), this contribution reports a number of
harmonization and evaluation aspects of TSM. The harmonization aspects include interaction with RQT,
NSQT, scans, and signaling. Also, a detailed study of skip transform throughput is provided and a new
implementation of transform skipping is presented achieving the same compression performance as the
original implementation while providing stricter guarantees in terms of data not overflowing 16-bit
registers for worst case input sets. The tests evaluate the compression performance of TSM at different
levels of complexity. In all tested configurations gains are the highest for class F, showing BD-rate gains
of up to 4.6% for the luma component.
Possibility that only one transform dimension is skipped (in that case, only 1D scan), or skipped entirely
All block sizes including non-square are considered. G577 puts some restrictions e.g. only 4x4 blocks
Large increase in encoding time (all possible modes in RQT are used)
Average gains: Without class F 0.3/0.6/0.7% for RA/LD B/LD P, with class F 0.6/1.3/1.6%
Current implementation does not give gain for intra
Highest gain for class F (screen content)
When skipping is imposed on 4x4 only (as in G577), gain is reduced (0.1/0.2/0.3 without class F), 2D
transform used on 55%, other 45% are some version of skipping (only 8% 2D skips)
Somehow the 1D skip cases would be like NSQT with 1xN and Nx1, however the situation is different in
that there is no dependency and all 1D transforms can be performed in parallel.
Entropy coding is unchanged (same contexts as for conventional 2D transform)
Scale factors adjusted for quantization
Results for flat quantization; would it be possible to apply the new default 2D quantization matrix? This
was not investigated and requires further study

JCTVC-G894 Cross check for G575 [K. Panusopone] [late]

JCTVC-G908 Cross-check of BBC’s transform skip mode (JCTVC-G575) [J. Sole
(Qualcomm)] [late]

JCTVC-G933 Cross-check for skipped transform from BBC (JCTVC-G575) by Samsung
[I.-K. Kim, E. Alshina, J.H. Park (Samsung)] [late]



JCTVC-G577 Transform skipping dependant on block parameters [Glenn Van
Wallendael, Jan De Cock, Rik Van de Walle (Ghent Univ.), Marta Mrak
(BBC)]
In this contribution a mechanism for selection of transform skip mode (TSM) depending on current block
parameters is presented. TSM was introduced in JCTVC-F077 and it enables skipping transform on a row
and/or a column of motion compensated residuals. The transform skip mode choice is signalled to the
decoder where inverse transforms of rows/columns are performed or skipped.
In this contribution, TSM is enabled or disabled depending on the size or the QP of the TU. Results are
presented when only applying TSM on TU sizes 4x4. Additionally TSM is restricted to QPs under a
threshold. These two scenarios offer an improved BD-rate performance and a lower encoder's execution
time compared to TSM that is applied on all block sizes or QPs. To support such more customized
selection of TSM, SPS signaling for TSM on larger blocks is proposed.
Was included in presentation of G575

JCTVC-G971 Crosscheck of BBC's Transform Skip proposal [H. Yang, X. Zheng, H. Yu]
[late]

JCTVC-G663 AHG19: Modification to HE transform coefficient coding for transform skip
mode [R. Joshi, J. Sole, X. Wang, M. Karczewicz (Qualcomm)]
Transform skip mode was proposed in JCTVC-F077. Instead of always applying a 2-D transform to the
prediction residual, transform in the row or column direction can be skipped. JCTVC-F077 did not
propose changes to transform coefficient coding. In this contribution, we propose changes to the coding
of the last non-zero transform coefficient and coding of significance flags. Compared to HM 4.0, BD-
rates of -0.6% (without class F) and -1.4% (with class F) are observed. The proposed modifications to HE
transform coefficient coding for transform skip modes contribute BD-rates of -0.1% (without class F) and
-0.2% (with class F).
Unlike the original version of G575, entropy coding is changed, which could be an additional burden.
In G575 the entropy coder is agnostic about the transform mode, therefore context adaptation is not really
working.
Additional gain over the other method without class F 0.1% over all cases, with class F 0.1/0.2/0.2

JCTVC-G941 AHG19: Cross-check of JCTVC-G663, modifications to HE transform
coefficient coding for transform skip mode [R. Cohen (MERL)] [late]

JCTVC-G945 Cross-check of JCTVC-G663, modifications to HE transform coefficient
coding for transform skip mode [Andrea Gabriellini, Marta Mrak (BBC)] [late]

JCTVC-G586 Parallelizable context for significance coding of large transform blocks [J.
Kang, J. Lainema, A. Hallapuro, K. Ugur (Nokia)]

JCTVC-G1003 Integration of proposals on transform skipping [M. Mrak, A. Gabriellini,
D. Flynn (BBC), G. Van Wallendael (Ghent University), R. Joshi, J. Sole, X.
Wang, M. Karczewicz (Qualcomm)] [late]
This contribution presents integration of 3 proposals from AHG19 on transform skipping. Transform
skipping by omitting the transform on rows and/or columns is defined by four modes (including 2D
transform). The proposals from AHG19 addressed entropy coding tools to support transform skipping,
modified quantization to support different signal levels when the transform is skipped and proposed
configurations for coding with transform skipping dependent on block parameters. The proposals have
been integrated and the results for the unified solution are tested with transform skipping on 4x4 blocks.
Combination of G575, G577, G663: only 4x4 blocks
Gain without class F: 0.4/0.7/0.9%
Should there be a flag for disabling in SPS?
Investigate in new CE on transform skipping:
- Is it possible to achieve gain with as few changes as possible? (e.g. only 4x4 blocks, only
changing the transform to variants 2D, 1D-H, 1D-V, NO, without changing quantization, entropy
coding etc.)
- The result of the CE should give information that allows judgement whether the amount of necessary
changes is justified by the gain
- Harmonization with quant matrices?
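The four variants named above (2D, 1D-H, 1D-V, NO) can be sketched on a 4x4 block as follows. This is only an illustration: the enum and function names are hypothetical, an unnormalized Walsh-Hadamard butterfly stands in for the actual core transform, and "1D-H" is assumed to mean that only the transform along rows is applied.

```cpp
#include <array>

using Block4 = std::array<std::array<int, 4>, 4>;

// Full 2-D transform, 1-D horizontal only, 1-D vertical only, and no transform at all.
enum class TsMode { Transform2D, Horizontal1D, Vertical1D, None };

// Stand-in 1-D transform: unnormalized 4-point Walsh-Hadamard butterfly
// (an assumption for illustration, not the HEVC core transform).
static void wht4(int& a, int& b, int& c, int& d) {
    const int s0 = a + b, s1 = c + d, d0 = a - b, d1 = c - d;
    a = s0 + s1; b = d0 + d1; c = s0 - s1; d = d0 - d1;
}

// Applies the row and/or column stage depending on the selected mode.
Block4 applyTransformSkip(Block4 blk, TsMode mode) {
    const bool rows = (mode == TsMode::Transform2D || mode == TsMode::Horizontal1D);
    const bool cols = (mode == TsMode::Transform2D || mode == TsMode::Vertical1D);
    if (rows)
        for (auto& r : blk) wht4(r[0], r[1], r[2], r[3]);
    if (cols)
        for (int x = 0; x < 4; ++x) wht4(blk[0][x], blk[1][x], blk[2][x], blk[3][x]);
    return blk;
}
```

In this form the "minimum changes" variant of the CE question corresponds to switching only the two stage flags, leaving quantization and entropy coding untouched.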

JCTVC-G1007 Crosscheck of JCTVC-G1003 [T. Davies] [late]

5.11.4 Other

JCTVC-G669 Delayed Duplicate I-Frame for Video Conferencing [R. Srinivasan, C.
Ghone, M. Mody, M. Zhou (TI)]
This contribution presents a technique to reduce the peak data bandwidth (and the resulting packet
loss) caused by I/IDR-Frames on a constant-bandwidth network in a video-conferencing use case or other
real-time multimedia exchange use cases. The Delayed Duplicate I-Frame (DDI-Frame) is an error
resiliency tool to enable reliable transmission of bursty IDR/I-frame traffic on a constant bit-rate
network. A DDI-Frame is identical to an I/IDR-Frame except that an additional P-Frame is generated from
the same input frame. The DDI bit-stream is not sent in real time, unlike the P-Frame, which is sent in
real time to the decoder end. The bit-stream will have a later reference (after N frames) to the
DDI-Frame. This relaxes the transmission time requirement by a factor of N for a given DDI-Frame and
smooths out the bit-rate peaks that essentially cause latency. With this modification, it is claimed
that the peak bit-rate requirement due to I-Frames is reduced by a factor of more than 3 with a marginal
increase in computation and bit-rate. The measured DDI-frame bit-rate overhead is around 3% for an
I-frame period of 100 frames in simulation; the real overhead should be much smaller because the
I-frame period used in the application is 3600 frames (or every 2 minutes). The proposed change applies
to the low-delay LC and HE cases and is not applicable to the Random-Access and All-Intra cases.
DDI is like a long-term reference frame in DPB buffer
Should be discussed in context of error resilience activities and picture buffering
Relation with “pictures not output for display” in AVC? Relation with SP/SI?
Is there a normative decoder behaviour (beyond the DPB)? Would be better to leave it as non-normative
“concealment” when a loss occurs
Further study

JCTVC-G795 Crosscheck of TI’s JCTVC-G669 on delay dependent intra frame for video
conferencing [J. Min, Y. Piao, J.H. Park (Samsung)] [late]

5.12 Entropy coding

5.12.1 CAVLC

JCTVC-G247 Non-CE5: CAVLC counters normalization per LCU [E. François, S. Pautet,
C. Gisquet (Canon)]
No need to review

JCTVC-G405 Non-CE5: Cross-check of CAVLC counters normalization per LCU
(JCTVC-G247) [B. Li (USTC), J. Xu (Microsoft)]

JCTVC-G312 Non-Square Partition Mode Grouping for CAVLC [T. Yamamoto (Sharp)]
No need to review

JCTVC-G743 Cross-check report on Non-Square Partition Mode Grouping for CAVLC
(JCTVC-G312) [I.-K. Kim (Samsung)]

JCTVC-G355 Non-CE5: joint coding of splitting flag and inter modes [W. Zhang, P. Wu
(ZTE)]
No need to review

JCTVC-G741 Cross-check report on joint coding of splitting flag and inter modes
(JCTVC-G355) [I.-K. Kim (Samsung)] [late]

JCTVC-G905 Crosscheck report of ZTE's joint coding of splitting flag and inter modes
(JCTVC-G355) [X. Wang (Qualcomm)] [late]

JCTVC-G365 Non-CE5: Redefined contexts for last nonzero coefficient coding of 4x4 TU
in CAVLC [J. Xu, A. Tabatabai (Sony)]
No need to review

JCTVC-G626 Cross-check of JCTVC-G365, "Nonzero coefficient coding in CAVLC" [A.
Fuldseth (Cisco)] [late]

JCTVC-G692 Intra Table for CAVLC [S.-H. Kim, A. Segall] [late]
No need to review

5.12.2 CABAC

JCTVC-G829 Context modeling of split flag for CABAC [W. -J. Chien, M. Karczewicz]
Is discussed in context of CE1 BoG – done (see under G1022)

JCTVC-G849 Non-CE1: Crosscheck for Qualcomm's context modeling of split flag for
CABAC in JCTVC-G829 [T.-D. Chuang, Y.-W. Huang (MediaTek)] [late]

JCTVC-G326 Non-CE1.b: On the exponential memory decay probability update [J. Sole,
M. Karczewicz (Qualcomm)]
Modifications to the memory decay function for probability update in CABAC as presented in JCTVC-
F254 are proposed in order to have a symmetric estimator that disallows highly skewed distributions.
Further modifications include the replacement of the range multiplication by a series of shifts, thus
largely reducing the size of the LPS table. The BD-rates for AI-HE, RA-HE and LB-HE configurations
are 0.50%, -0.61%, and -0.55%, respectively.
Relates to G764
Much smaller LPS table (factor 288)
The table to be added would increase the current HEVC table size by factor 1.5
However, it is necessary to perform 5 shifts and 3 additions (could be replaced by multiply and one shift).
The added complexity would not justify adoption.
Combination with multi-parameter update would be possible (as in CE1b) and would increase performance,
but would likely increase table size by a factor of 5.
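For orientation, a shift-based exponential-decay probability update and a two-rate ("multi-parameter") combination can be sketched as below. This is a generic illustration of the estimator family under discussion, not the exact update of G326 or F254; the 15-bit precision, the function and type names, and the shift values 4 and 7 are assumptions.

```cpp
#include <cstdint>

constexpr int kProbBits = 15;               // probabilities stored as 15-bit fixed point (assumed)
constexpr int32_t kOne  = 1 << kProbBits;   // represents probability 1.0

// Single exponential-decay estimator of P(bin == 1):
// p moves toward the observed bin by 1/2^shift of the remaining distance.
int32_t updateProb(int32_t p, int bin, int shift) {
    return bin ? p + ((kOne - p) >> shift) : p - (p >> shift);
}

// Two-rate variant: average a fast-adapting and a slow-adapting estimator.
struct TwoRateProb {
    int32_t fast = kOne / 2;
    int32_t slow = kOne / 2;
    void update(int bin) {
        fast = updateProb(fast, bin, 4);   // illustrative shift values
        slow = updateProb(slow, bin, 7);
    }
    int32_t estimate() const { return (fast + slow) >> 1; }
};
```

The update itself needs only shifts and additions; the table-size question raised in the notes concerns how the resulting probability is mapped to an LPS range, which is not shown here.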

JCTVC-G974 Cross-check for JCTVC-G326: On the exponential memory decay
probability update [Gergely Korodi, Jinwen Zan, Dake He] [late]

JCTVC-G413 Modified probability update and table removal for multi-parameter CABAC
update (F254) [C. Rosewarne, M. Maeda (Canon)]
This contribution presents a method of updating probability estimates of a context model that enables the
substitution of the look-up table required for determining an offset for the range update with a function.
The intention of removing the look-up table is to reduce area for hardware implementations. The intention
of the modification to the probability estimate updating method is to remove one operation from the
substituted function, which we assert provides a timing benefit for hardware implementation. When
implemented on the multi-parameter version of JCTVC-F254 integrated in HM-4.0, the proposed
technique has a 0.01% increase in AI-HE, a 0.02% increase in RA-HE, a 0.00% effect on LB-HE and a
0.01% increase in LP-HE.
Restrict range of probability update operation.
Bitrate reduction 0.9%, 0.9%, 0.8% and 0.4% for AI, RA, LDB and LDP
Similar to G764 without table lookup and probability update.
Probability of bin 0 is kept in memory (not of LPS)

JCTVC-G805 Cross-check of Canon’s modified probability update and table removal for
multi-parameter CABAC update (JCTVC-G413) [J. Sole (Qualcomm)] [late]

JCTVC-G547 Counter-based probability model update with adapted arithmetic coding
engine [J. Stegemann, H. Kirchhoffer, D. Marpe, T. Wiegand (Fraunhofer
HHI)]
This contribution presents a counter-based probability model update algorithm for use with an
adapted version of the M Coder of HEVC. The proposed method is closely related to Samsung’s Multi-
Parameter probability update proposal (F254) in conjunction with a concise design of the arithmetic
coding engine but retaining the R-D performance improvements of employing sophisticated probability
estimation. Gains in coding efficiency range from 0.5% to 0.8%.
Numbers are for single-parameter and multi-parameter update (as CE1b)
Comment: Single estimator would be more interesting in terms of complexity tradeoff.
Comment: Improved initialization may reduce the compression gain
Establish/continue CE1b from G764, G326, G413, G547.

JCTVC-G470 Non-CE1: Cross-check on HHI's proposal (JCTVC-G547) [K. Sugimoto, A.
Minezawa, S. Sekiguchi (Mitsubishi)] [late]

JCTVC-G440 Non-CE1: Modified probability model update for complexity reduction [A.
Tanizawa, T. Shiodera, T. Yamakage (Toshiba)]
This contribution presents a technique to reduce complexity for CABAC probability model update
process. In this contribution, the coding efficiency of probability model update for several syntax
elements was evaluated, and a recommended combination is proposed.
Experimental results with probability model update disabled for six syntax elements show less than
0.1% BD-rate performance change for the four (IO/RA/LB/LP) HE conditions.
One comment: Invoking probability process depending on check of the given syntax element may be even
more complex than always doing it.

JCTVC-G833 Non-CE1: Adaptive initialization for CABAC with fixed probability contexts
[L. Guo, J. Sole, X. Wang, M. Karczewicz (Qualcomm)]
This contribution describes an adaptive initialization method for CABAC with fixed probability contexts.
At the beginning of each slice, a context can be initialized in one of two ways: 1) initialized using the pre-
defined (m,n) value (i.e., original HM method), or 2) initialized using a state value selected by the
encoder. For each context, the encoder signals the selection of initialization method to the decoder as well
as the new state value (if the second method is selected). No probability/state update will be performed in
the encoding/decoding process, and thus the proposed method is friendly to possible parallel-processing
applications. Experiments in HE compared to CABAC with context update show a BD rate change of
0.3%, 2.4%, 2.8%, 2.6% for AI, RA, LD and LDP, respectively. If combined with a new set of (m,n) that
are more suitable for fixed probability contexts, the BD rate change is 0.1%, 1.0%, 1.3%, 1.3% for AI,
RA, LD and LDP, respectively.
Explicit signaling of probability state (derived by decoder) instead of update.
Losses compared to HE (fully adaptive) configs, but gains compared to “old” LC with CAVLC
Study in CE.

JCTVC-G909 Non-CE1: Cross-check of JCTVC-G833 on adaptive initialization for
CABAC [C. Yeo (I2R)] [late]

JCTVC-G848 Non-CE1: Crosscheck for Qualcomm's context reduction for CABAC in
JCTVC-G718 [T.-D. Chuang, Y.-W. Huang (MediaTek)] [late]

JCTVC-G593 Non-CE13: Simplification of merge mode [O. Bici, J. Lainema, K. Ugur
(Nokia)]
Has one aspect of entropy coding similar to G718: Context reduction for merge index coding (other
aspects were presented in MV coding category). Two contexts for bin 0, other 3 bins are bypass coded.
Gives 0.1% BR reduction in case of LD P.
It is reported that another version was tested with only one context for bin 0 (remaining bins bypass
coded), which showed no gain (was not cross-checked); this would be the preferred version. CE or BoG
(as G718); G718 uses one context for all 4 bins.
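The split between context-coded and bypass-coded bins discussed here can be illustrated with a truncated-unary binarization of the merge index. This is a sketch under assumptions: the structure and function names are hypothetical, a maximum of 4 bins (5 merge candidates) is assumed, and the selection between the two contexts for bin 0 is not modelled.

```cpp
#include <cassert>
#include <string>

struct MergeIdxBins {
    std::string bins;        // '1'/'0' bin string, first bin first
    int contextCoded = 0;    // number of bins that would use a context model
    int bypassCoded  = 0;    // number of bins that would be bypass coded
};

// Truncated-unary binarization with at most maxBins bins.
// Only bin 0 is counted as context coded; all later bins are counted as bypass.
MergeIdxBins binarizeMergeIdx(int mergeIdx, int maxBins = 4) {
    assert(mergeIdx >= 0 && mergeIdx <= maxBins);
    MergeIdxBins out;
    for (int i = 0; i < maxBins; ++i) {
        const bool one = (i < mergeIdx);
        out.bins += one ? '1' : '0';
        if (i == 0) ++out.contextCoded; else ++out.bypassCoded;
        if (!one) break;     // a 0 terminates the unary code
    }
    return out;
}
```

Under this scheme at most one bin per merge index needs a context lookup and update, regardless of the index value.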

JCTVC-G940 Cross-check for JCTVC-G593: Non-CE13: Simplification of merge mode
[Semih Esenlik (Panasonic)] [late]

JCTVC-G324 Modified LPS range and state transition tables for BAC [J. Sole, M.
Karczewicz (Qualcomm)]
The proposal modifies the binary arithmetic coder to allow a slower probability adaptation process and
the coding of more skewed binary distributions than HM4.0. The entries of the range LPS table are
changed and the number of probability states increased from 64 to 128. BD-rate for AI-HE, RA-HE and
LB-HE configuration is -0.34%, -0.26%, and -0.05%, respectively. When the range LPS table size is
divided by 2, the BD-rates are -0.26%, -0.21%, and 0.07% for AI-HE, RA-HE and LB-HE, respectively.
A third variant using bit-shift operations that reduces the table size by 25% provides BD-rates of -0.36%,
-0.28%, and -0.13%.
Results are better for larger resolutions
Interest expressed: Versions of full and half table size
Investigate in CE
With proper adjusted initialization, results might even be better (include in CE plan)

JCTVC-G984 Cross check result of Modified LPS range and state transition tables for
BAC by Qualcomm (G324) [C. Rosewarne, M. Maeda (Canon)] [late]

JCTVC-G999 Verification of G324 on CABAC state machine [F. Bossen] [late]

JCTVC-G492 Maximum VLC Limits in CABAC Escape Coding [K. Sharman, J. Gamei,
N. Saunders, P. Silcock (Sony)]
This contribution presents two alternative sets of g_auiGoRicePrefixLen and g_auiGoRiceRange values,
which are used for CABAC escape codes. Possible inconsistencies and omissions in and between WD4
text and HM4.0 source code were also presented.
There is certainly some inconsistency.
Decision: adopt the following version (relative to the current software):
const UInt g_auiGoRicePrefixLen[4] = {8, 10, 10, 8}; // differs from the 4.1 software in the 3rd element
const UInt g_auiGoRiceRange[4] = {7, 20, 42, 70}; // already in the 4.1 software
G700 was pointing out the same issue.
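One way to see how the two adopted tables relate: for Rice parameter k, the largest quotient that can occur is (range[k] + 1) >> k, and with the adopted values this equals the prefix length limit for every k. The sketch below checks that relation and builds a simplified codeword. It is an illustration, not the HM routine; `maxQuotient` and `goRiceCodeword` are hypothetical names, and the escape code that follows an out-of-range symbol is not shown.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Values from the decision above.
const unsigned g_auiGoRicePrefixLen[4] = {8, 10, 10, 8};
const unsigned g_auiGoRiceRange[4]     = {7, 20, 42, 70};

// Largest prefix (quotient) that can occur for Rice parameter k, given the range table.
unsigned maxQuotient(unsigned k) {
    assert(k < 4);
    return (g_auiGoRiceRange[k] + 1) >> k;
}

// Simplified Golomb-Rice codeword: unary prefix (terminating 0 omitted at the maximum
// length) followed by k suffix bits. Symbols above the range are clipped to range+1,
// which signals that an escape code (not shown) follows.
std::string goRiceCodeword(unsigned symbol, unsigned k, bool& escape) {
    assert(k < 4);
    escape = symbol > g_auiGoRiceRange[k];
    const unsigned cw = std::min(symbol, g_auiGoRiceRange[k] + 1);
    const unsigned q  = cw >> k;
    std::string bits(q, '1');
    if (q < g_auiGoRicePrefixLen[k]) bits += '0';
    for (int b = static_cast<int>(k) - 1; b >= 0; --b)
        bits += ((cw >> b) & 1u) ? '1' : '0';
    return bits;
}
```

A prefix length entry larger than `maxQuotient(k)` would still decode correctly in this sketch but would cost one unnecessary terminating bit on the escape codeword.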

JCTVC-G493 CABAC Stream Termination [K. Sharman, J. Gamei, N. Saunders, P.
Silcock (Sony)]
A technique to terminate a CABAC stream is presented, and it is stated that the method produces a loss of
on average 1.5 bits. A second alternative method is also presented which is claimed to produce an average
loss of 1 bit. Applications of these techniques have been suggested to include termination of the CABAC
stream prior to IPCM data, and termination of the stream for row-per-slice. In the working-draft HEVC
design (JCTVC-F803), use of the CABAC stream termination method causes 8 bits to be flushed to the
stream. The technique is illustrated with an example where intra frames are terminated after each LCU,
allowing the coefficient bypass data (sign bits/escape codes) to be placed into the bit-stream in a raw
format.
Seems to target some very specific cases (including IPCM where saving some bits seems to be marginal)
Some other syntax elements are in fact coded that consume additional bits
In common conditions, saving should be unnoticeable

JCTVC-G494 CABAC Packet-based Stream [K. Sharman, J. Gamei, N. Saunders, P.
Silcock (Sony)]
A method of arranging CABAC data and bypass data into packets is presented, where the bypass data
comprises equal-probability bits. This method is stated to allow the insertion of bypass data into the bit-
stream in raw binary form, which then allows the bypass data to be read and decoded in parallel with
CABAC data. The reported cost of this method is an average of 8 bits per slice with no effect on software
encoding and decoding times.
Interesting, somewhat similar idea to PIPE, but how is multiplexing/demultiplexing done? Making this
normative would be undesirable as it would to some extent prescribe the way the decoder is
implemented; the encoder might be complicated.
No action.

JCTVC-G963 Cross verification of CABAC packet based stream (JCTVC-G494) [M.
Coban (Qualcomm)] [late]

JCTVC-G501 CABAC with Arithmetic Context Variables [K. Sharman, J. Gamei, N.
Saunders, P. Silcock (Sony)]
This proposal reports on a technique to remove the look-up tables within the CABAC engine, replacing
them with arithmetic operations. Claims are made that this could potentially reduce hardware size and
complexity even if the Context Variables used are all increased from 7 to 8-bit. Results are reported for
two different variants of the system, showing average {Y, U, V} BD-rate changes of {0.0%, 1.1%, 1.1%}
and {0.0%, -0.9%, -1.0%} respectively; encoder and decoder processing times were 101% and 100%
respectively for both systems.
Two methods: One multiplier-based, one adder-based.
Increase of complexity of CABAC only? Some runtime numbers are given, but in software, the update is
still implemented as table.
General hint: Whenever a change is made to the core of the CABAC engine, a more in-depth study should
be performed on complexity impact (software, hardware).
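As a rough illustration of what replacing the LPS range table with arithmetic can mean, a multiplier-based sub-range computation is sketched below. This is not the formula of G501: the 15-bit probability format, the function name and the lower clamp of 2 are assumptions; only the 9-bit range convention (256..510 after renormalization) is taken from H.264/AVC CABAC.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Multiplier-based LPS sub-range: rLPS ~= range * pLPS, with pLPS in 15-bit fixed point.
// The result is clamped to a small minimum so the LPS interval never collapses to zero.
uint32_t lpsRange(uint32_t range, uint32_t pLps15) {
    assert(range >= 256 && range <= 510);     // renormalized 9-bit range
    assert(pLps15 <= (1u << 14));             // the LPS probability never exceeds 0.5
    return std::max<uint32_t>((range * pLps15) >> 15, 2u);
}
```

In software this trades a table load for a multiply and a shift; the hardware impact is the open question raised in the notes.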

JCTVC-G504 CABAC with Combined Context Variables [K. Sharman, J. Gamei, N.
Saunders, P. Silcock (Sony)]
This proposal presents an adaptation of the CABAC entropy coder that is claimed to be able to decode
multiple coefficients per clock and therefore capable of decoding even the sustained worst case for 4:4:4
data.
The reported method encodes or decodes all of the required CABAC context variables (CV) for each
coefficient in a single cycle using a quaternary arithmetic coding function; the bypass (sign and escape
codes) are not explicitly discussed in this report, although a method whereby the bypass data can be
decoded in parallel to the CV CABAC data is assumed, such as claimed in JCTVC-G494 “CABAC
Packet-based Stream”.
The reported method encodes or decodes coefficients in a single pass of the data. Problems regarding the
speculation required to maintain the throughput are discussed, and it is indicated that with changes to the
CV selection system, speculation for large blocks (16x16, 32x32) can be completely removed. For small
blocks (4x4 and 8x8), two methods of CV selection are presented that reduce speculation to 2 levels and 3
levels respectively (that is, a system with 2 levels of speculation would prepare 2 possible values per
coefficient, requiring less than 2x the amount of logic than that of a system with no speculation). Results
are presented that indicate average BD-rate changes of +0.1%, -1.1% and -1.2% for Y, U and V
respectively for the 3x speculation system, and average BD-rate changes of +0.2%, -1.0% and -1.1% for
Y, U and V respectively for the 2x speculation system. In both systems, the context variables have not
been re-optimized/adjusted from the values in HM4.0.
In addition, results are presented for an unoptimized packet-based system with 2 levels of speculation that
indicates average BD-rate changes of 0.2%, -1.0% and -1.1% for Y, U and V respectively; the software
times have increased by 3% and 0.5% for the encoder and decoder, with increases believed to be caused
by the deferred ACV update mechanism and the neighbourhood-based method to select context variables
ACV1 and ACV2, both designed to reduce the speculation requirements in hardware.
Further study in AHG on throughput of coefficient entropy coding (still to be decided whether we will
have this)?

JCTVC-G934 Non CE1: cross-check for CABAC probability up-date modification from
Sony (G501, 504) by Samsung [A. Alshin, E. Alshina, J.H. Park (Samsung)]
[late]

JCTVC-G570 Byte alignment overhead reduction [S. Esenlik, M. Narroschke, T. Wedi
(Panasonic)]
In the current HM4.0, byte alignment is used in various places to serve several purposes such
as Emulation Prevention Byte (EPB) insertion and alignment of the bitstream in the encoder and decoder.
The proposal introduces a coding method for syntax elements such that the byte alignment
condition is achieved automatically and the need for an explicit byte alignment function is eliminated. As
a result, a 0.1% bit rate reduction is achieved on average.
No support for adoption.

JCTVC-G972 Cross-verification of Panasonic’s proposal JCTVC-G570 on byte alignment
overhead reduction [O. Bici, J. Lainema, K. Ugur (Nokia)] [late]
Cross checker could see some advantage, but is neutral in terms of adoption

JCTVC-G716 On CABAC Init IDC [K. Misra, A. Segall (Sharp)] [late]
This contribution proposes modification of the cabac_init_idc indicator that is associated with CABAC
initialization. In HEVC, cabac_init_idc is carried over from the H.264/AVC design but not well defined.
Specifically, the indicator appears in the HEVC WD but not in the HM software. This document
proposes to define the meaning of this indicator. Compared to H.264/AVC, the proposed approach
results in fewer tables available for CABAC initialization (reduced from seven to three). Additionally,
the proposed approach allows different slice types to use the same table. Results show that the method
provides improvements in coding efficiency. It is asserted that the proposal provides clarification to the
HEVC design, simplification relative to H.264/AVC and coding efficiency improvement.
Definitely needs cleanup
Include in side activity on 8-bit initialization.

JCTVC-G983 Cross-check Sharp's contribution On CABAC Init IDC (JCTVC-G716) by
Qualcomm [L. Guo] [late]

JCTVC-G837 Non-CE1: 8-bit Initialization for CABAC [L. Guo, R. Joshi, J. Sole, X.
Wang, M. Karczewicz (Qualcomm)]
This contribution describes an 8-bit initialization method for CABAC. The 8-bit m (slope) and 8-bit n
(intersection) are replaced by a single 8-bit InitIdx for each context. SlopeTable (16 elements) and
IntersecTable (16 elements) are introduced to convert the 8-bit InitIdx to a CABAC probability state in
the initialization stage. Two sets of table values are presented in this contribution. For both of these two
sets, the table look-up operations can be implemented using formula calculation and thus the table storage
is not necessary. The average BD-rate reduction is 0.0%/0.3%/0.5%/0.3% for AI/RA/LD/LDP (HE) for
the first set of tables, and 0.0%/0.2%/0.4%/0.3% for AI/RA/LD/LDP (HE) for the second set of tables.
Current method of initialization is certainly not good (e.g. range of n values useless)
Training was performed on class D (where the gain is a little bit larger)
8 bit initialization is seen as relevant
Another method was used in CE1 where the mapping is not linear (more staircase)
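To make the (m,n) versus 8-bit discussion concrete, the sketch below shows a slope/offset initialization in the style of H.264/AVC and a hypothetical 8-bit variant that splits one byte into a 4-bit slope index and a 4-bit offset index. The two formulas standing in for the 16-entry SlopeTable and IntersecTable are illustrative assumptions, not the table values of G837, and the type and function names are hypothetical.

```cpp
#include <cstdint>

struct CtxState { int pStateIdx; int valMps; };

static int clip3(int lo, int hi, int v) { return v < lo ? lo : (v > hi ? hi : v); }

// H.264/AVC-style initialization from a slope m and offset n at a given slice QP.
CtxState initFromSlopeOffset(int m, int n, int sliceQp) {
    const int qp = clip3(0, 51, sliceQp);
    // m * qp may be negative; >> is assumed to be an arithmetic (sign-preserving) shift.
    const int pre = clip3(1, 126, ((m * qp) >> 4) + n);
    return (pre <= 63) ? CtxState{63 - pre, 0} : CtxState{pre - 64, 1};
}

// Hypothetical 8-bit variant: the high nibble selects a slope, the low nibble an offset.
// The linear formulas below are placeholders for the two 16-entry tables.
CtxState initFrom8Bit(uint8_t initIdx, int sliceQp) {
    const int slopeIdx  = initIdx >> 4;
    const int offsetIdx = initIdx & 15;
    const int m = slopeIdx * 5 - 45;   // illustrative slope mapping
    const int n = offsetIdx * 8 - 16;  // illustrative offset mapping
    return initFromSlopeOffset(m, n, sliceQp);
}
```

Because both nibble mappings are simple formulas here, no table storage is needed, which mirrors the remark in the abstract that the look-ups can be replaced by calculation.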

JCTVC-G913 Cross-check for Qualcomm’s proposal on 8-bit initialization for CABAC
(JCTVC-G837) [Y. Piao, J. Min, J.H. Park (Samsung)] [late]

JCTVC-G155 Non-CE1: On CABAC context initialization [C. Yeo, Y. H. Tan, Z. Li (I2R)]
This contribution presents an implementation of the 8-bit CABAC context initialization proposed by HHI
in JCTVC-F268. In this proposal, the initial CABAC context variable is directly computed using an 8 bit
initialization value instead of first computing a PIPE index and its internal sub-state before converting
that to a CABAC state. Coding results on HM-4.0 reportedly show an average Luma BD-Rate of 0.0% for
each of the AI-HE, RA-HE and LB-HE configurations.
The presentation deck also contains (slide 7) some issues observed with unused contexts in WD (editors
to check)
Conclusions:
Further investigate / continue CE?
8-bit context initialization is something we definitely will have in the CD, and should have it already in
the next WD
Proper initialization could also be relevant in context of other decisions such as context reduction and
probability.
Side activity (B. Bross, proponents of G837, G155, G716): Report back with a proposal on a reasonable
method for 8-bit initialization (representation and init values)
- No consensus
- Both methods (G633 and G837/G155) would fulfill the purpose
- In the CE (G633), a piecewise linear approximation was tested. This method has been available for
study for a sufficient amount of time, whereas the other is a new contribution
- The method investigated in the CE was originally designed to support both PIPE and CABAC, the
original code is not in best shape; now a cleaned-up version exists which has not yet been inspected.
- Conclusion: The cleaned-up version of G633 will be inspected by the proponents of G837/G155. If it
is bit-wise exact, it will be adopted; otherwise the method of G837/G155 will be adopted. The CE will be
continued to further improve or replace the adopted method; in the context of the CE, the algorithms
for training of the respective representations (including the method of WD/HM) will be exchanged.
- Topic of G716 shall stay part of this effort.
It was confirmed (Mon 28 morning) that the cleaned-up version exactly reproduces the results that were
reported in G633 in CE1c.
Decision: Adopt the cleaned-up version of G633, still to be done: Cleanup of WD text, to be checked by
the investigators of software. Will be uploaded as revision of G633.
Investigation on improvement of 8-bit initialization will continue in CE1. In that context, inclusion of
G716 will also be investigated.

JCTVC-G867 Non-CE1: Crosscheck of I2R's CABAC Context Initialization (JCTVC-G155)
by Qualcomm [L. Guo (Qualcomm)] [late]

5.12.3 Other

JCTVC-G064 Results of the PA-Coder [H. Zhu]
The proposal reports the design, fast decoding algorithm, probability estimation and coding results
of the arithmetic coder based on the probability aggregation.
Presentation not uploaded.
The Word file hardly allows the method to be understood and has some empty chapters.
0.2% Y bit rate increase for all HE test cases.
Contribution noted.

JCTVC-G452 SABAC - Scalable-complexity Adaptive Binary Arithmetic Coding [K.
Sugimoto, A. Minezawa, S. Sekiguchi (Mitsubishi)]
This contribution proposes a mechanism to realize scalable complexity entropy coding architecture by
signaling flags at slice header to disable some parts of functionalities of CABAC process.
SABAC enable and fast mode flags: When set “on”, context derivation and probability updates are
disabled.
This method would just help a realtime encoder which is not powerful enough.
Comments: The main problem is the complexity at the decoder side. Turning updates off does not help
entirely, as this may not be the main bottleneck. Such a method could end up in designing two different
engines.
No action

JCTVC-G659 Non CE1: Study of Entropy Coding Methods Complexity [M. Karczewicz,
I.S. Chong, X. Wang, R. Joshi (Qualcomm)]
This document summarizes the key issues in hardware implementation of CABAC and CAVLC
decoding. Throughput numbers obtained for these methods for H.264/AVC are quoted. Statistics of
number of symbols and bins decoded per frame for the current test conditions are given.
Presentation not uploaded.
Informative contribution. Claimed that there is still reason to reduce throughput in CABAC

JCTVC-G568 HM 4.0 entropy coding complexity study and software improvements [M.
Viitanen, J. Vanne, T. D. Hämäläinen (TUT), J. Lainema, K. Ugur (Nokia)]
This contribution presents a software complexity analysis of CAVLC and CABAC entropy decoders. The
contribution also identifies some obsolete code as well as possibilities to avoid division operations in HM
software and proposes to clean up those in the next HM release. It is reported that the HM4 CABAC
decoding is on average 53%, 28%, 8%, and 10% more complex than CAVLC decoding in the studied
software environment for AI_LC, RA_LC, LB_LC, and LP_LC configurations, respectively. It is further
reported that the relative burden of CABAC decoding gets more significant when operating at higher
bitrates.
(was already partially considered during CE1 review)
Presentation not uploaded.
Gap between CABAC and CAVLC (based on cycle count of HM) tends to become larger towards higher
rates (CABAC roughly 50% more cycles than CAVLC at QP22).

JCTVC-G569 Single entropy coder for HEVC with a high throughput binarization mode
[J. Lainema, K. Ugur, A. Hallapuro (Nokia)]
This contribution presents a single entropy coding architecture for HEVC. The proposed architecture
contains current HM 4.0 CABAC engine in its entirety. In addition it contains a high throughput
binarization (HTB) mode for transform coefficient coding that can be enabled when maximum throughput
is desirable. The high throughput binarization is identical to the current HM 4.0 CAVLC coefficient
coding where CAVLC codewords are fed to the bypass coding of CABAC. The intention of this approach
is to significantly reduce the worst-case complexity of CABAC, making it more suitable for low
complexity use cases. It is reported that the proposed method reduces the number of context adaptively
coded bins by 61 % under the common test conditions, and by 88 % in the low QP range (QP 2 to 17)
where the complexity of CABAC is reportedly more problematic. It is further reported that the proposed
approach can improve objective coding efficiency of low complexity configurations by -1.1 % (AI_LC),
-2.1 % (RA_LC), -2.7 % (LB_LC) and -3.1 % (LP_LC), while high efficiency results stay unaffected
when utilizing the existing high efficiency CABAC binarization.
Main problem in throughput: Coefficients
Suggest to have two different binarizations: High efficiency mode and high throughput mode
In high throughput mode, it is suggested to use CABAC binarization for motion/mode/cbf and CAVLC
binarization for coefficients
Uses only 30% of previous CAVLC syntax elements, and 15% of code tables.
How much faster did the software run for extreme low QP? Not analysed
Amount of CABAC bypass bins at lower QP?

JCTVC-G1008 Cross-check of JCTVC-G569 [T. Davies] [late]
(1008 is cross-check but without thorough study of the software)
Conclusions:
- There is no doubt (as said in the context of CE1) that reducing throughput is an issue
- The problem can be specifically targeted to transform coefficient coding
- The solution suggested here is not necessarily the most desirable design, as in principle it still means
implementing “roughly two” entropy coders (not “1.2”)
- Establish/continue CE and AHG for study of throughput particularly of TC coding
- Define a measure for measuring throughput – measure of bins/pixel may be one, but disputable
- Establish a HM5 software branch out of G569

5.13 Transform coefficient coding
G517 (late) discusses coefficient scans and should also be included in the discussion in this area.

5.13.1 Luma and chroma ordering

JCTVC-G112 Changing luma/chroma coefficient interleaving from CU to TU level [T.
Hellman, Y. Yu (Broadcom)]

JCTVC-G645 Crosscheck of JCTVC-G112 – Changing Luma/Chroma Coefficient
Interleaving from CU to TU level [M. Budagavi (TI)] [late]

JCTVC-G381 Nearest placement of Y/Cb/Cr transform coefficients locating at same spatial
position [Y. Shibahara, K. Uchibayashi, T. Nishi (Panasonic)]

JCTVC-G904 Cross Check of G381 [Ankur Saxena, Felix Fernandes (Samsung)] [late]

General

The main part of G112 and G381 is the same, relating to the order of syntax elements. The proposal was
asserted to provide a substantial hardware decoder complexity savings.
Decision: Adopted the common part of G112 and G381.
An aspect particular to G381 was noted, regarding the placement of chroma relative to luma in the
minimum transform size case. In this aspect, G381 was suggested as preferable.
Decision: Adopted this aspect of G381.
An aspect particular to G112 was noted, regarding interleaving of chroma and luma in IPCM mode.
Decision: Define MaxIPCMcuSize and require it to always be at most 32x32 (to provide text consistent
with ability to disable IPCM).
It was noted that G118 (avoiding sending other stuff between luma and chroma IPCM) has a relationship
with this, but seems non-conflicting in spirit.
Another aspect particular to G112 was behaviour with MinTrafoSize equal to MaxTrafoSize in regard to
chroma. Two possible solutions were suggested.
Decision: Establish MinChromaTrafoSize = (chroma_format == 4:4:4) ? MinTrafoSize :
Max( MinTrafoSize − 1, 4x4 ).
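The rule in this decision can be written out as a small function. The sketch assumes that transform sizes are handled as log2 values, so that "4x4" becomes 2 and "MinTrafoSize − 1" means one halving step; the enum and function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>

enum class ChromaFormat { Yuv420, Yuv422, Yuv444 };

// Minimum chroma transform size (log2) derived from the minimum luma transform size (log2).
// For 4:4:4 chroma uses the luma minimum; otherwise one step smaller, but never below 4x4.
int minChromaTrafoSizeLog2(ChromaFormat fmt, int log2MinTrafoSize) {
    assert(log2MinTrafoSize >= 2);   // 4x4 is the smallest transform considered
    return (fmt == ChromaFormat::Yuv444) ? log2MinTrafoSize
                                         : std::max(log2MinTrafoSize - 1, 2);
}
```

For example, with an 8x8 luma minimum in 4:2:0 the chroma minimum becomes 4x4, and with a 4x4 luma minimum it stays at 4x4.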

5.13.2 Significance map coding

JCTVC-G308 Complexity reduction of significant map coding [T. Ikai (Sharp)]

JCTVC-G751 Cross-check report for Sharp's proposal (JCTVC-G308) on Complexity
reduction of significant map coding [H. Sasai, T. Nishi (Panasonic)] [late]

JCTVC-G366 Non-CE11: Context reduction of significance map coding with CABAC [C.
Auyeung, J. Xu (Sony)]

JCTVC-G127 Cross-check of Sony’s proposal on context reduction for significance map
(G366) [V. Sze (TI)]

JCTVC-G644 Multi-level Significant Maps for Large Transform Units [N. Nguyen, T. Ji,
D. He, G. Martin-Cocher, L. Song (RIM)]
The objective is to reduce the number of significant coefficient bins to be decoded for large transforms.
Average bin count benefit was shown (3-4%), especially on larger block sizes (14% AI, 12% RA, 30%
LB).
Part of this contribution depends on G323.
When combined with G323, some coding efficiency benefit was reported (0.1−0.3%) and it was reported
that more use of the larger block sizes occurs.
The technique was also reported to work with other types of scans (although it fits best structurally with
the sub-block scan scheme).
The scheme seems particularly "clean" in conjunction with the 4x4 sub-block scan of G323.
Response is quite positive, at least when used in conjunction with G323.
Decision: Adopt (whichever variant, depending on G323 decision).
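The bin-count effect of a two-level significance map can be illustrated with a small sketch. It is not the G644 syntax: it assumes one flag per 4x4 sub-block, with the per-coefficient flags sent only for sub-blocks that contain a non-zero coefficient, and it ignores last-position handling and any inference rules. The function name is hypothetical.

```python
def count_sig_bins(coeffs, sub=4):
    """Return (single_level_bins, two_level_bins) for a square coefficient block.

    coeffs is a list of rows; sub is the assumed sub-block size.
    """
    n = len(coeffs)
    single_level = n * n  # one significance flag per position
    two_level = 0
    for by in range(0, n, sub):
        for bx in range(0, n, sub):
            two_level += 1  # one flag for the sub-block itself
            has_nonzero = any(
                coeffs[y][x] != 0
                for y in range(by, by + sub)
                for x in range(bx, bx + sub)
            )
            if has_nonzero:
                # Per-coefficient flags are only needed in non-empty sub-blocks.
                two_level += sub * sub
    return single_level, two_level
```

For a 16x16 block with a single non-zero coefficient this gives 256 flags for the single-level map against 32 for the two-level map. A fully dense block costs slightly more with two levels (272), which matches the observation that the benefit depends on block size and content.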

JCTVC-G1001 Cross-check of RIM's multi-level significant maps for large transform units
(JCTVC-G644) [J. Sole] [late]

JCTVC-G657 Encoding and decoding significant coefficient flags for small Transform
Units using partition sets [G. Korodi, J. Zan, D. He (RIM)]

JCTVC-G796 Cross-check for RIM’s proposal on small TU significance map coding
(JCTVC-G657) [Y. Piao, E. Alshina, J. Min, J.H. Park (Samsung)]

JCTVC-G725 Non-CE11: Simplification of significant coding for CABAC [V. Kung, K.
Panusopone (Motorola Mobility)]

JCTVC-G876 Cross-check of Motorola's proposal (JCTVC-G725) - Simplification of
significant map coding for CABAC [H. Yang, H. Yu (Huawei)] [late]

JCTVC-G768 Reduced contexts for significance map coding of large transform in CABAC
[Y. Piao, J. Min, J. H. Park (Samsung)]

JCTVC-G642 Cross-check of Samsung context reduction in significance map coding [J.
Zan, D. He] [late]

JCTVC-G781 Reduced chroma contexts for significance map coding in CABAC [Y. Piao,
J. Min, E. Alshina, J. H. Park (Samsung)]

JCTVC-G917 On significance map coding for CABAC [V. Sze (TI)] [late]

JCTVC-G981 Crosscheck of JCTVC-G917 on significance map coding in CABAC from TI
[C. Auyeung (Sony)] [late]

JCTVC-G986 Fast algorithm and some comments on the significance map coding [H. Zhu]
[late]
Information document (late) – contributor not available – not presented.

JCTVC-G1015 A combined proposal from JCTVC-G366, JCTVC-G657, and JCTVC-G768
on context reduction of significance map coding with CABAC [C.
Auyeung, J. Xu (Sony), G. Korodi, J. Zan, D. He (RIM), Y. Piao, E. Alshina,
J. Min, J. Park (Samsung)] [late]

Context selection

[include summary from V. Sze]


For 16x16 and 32x32, currently there is diagonal wavefront-ish scanning, and 44 contexts for luma, and
44 for chroma.
Common theme: Sharing contexts for higher frequencies for 4x4 and 8x8. Generally there is no coding
loss from these various proposals.
A new context initialization (Track A) has approximately 0.3% benefit, from improving the probability
estimates of the values.
Testing with the new context initialization may be necessary.

Comment: We hope that these have been tested with low QP. Low QP is generally more sensitive to
context reduction and less sensitive to initialization.
The CE did not test significance map proposals.
It was remarked that context selection logic may typically be more important than quantity of contexts.
It was remarked that the interaction with the level coding is also important.
G1015 is a new combined proposal that has been tested on low QP. It was remarked that G1015 removes
about 40 contexts for the significance map (about half of the current total for that purpose, which is
roughly 10% of the global total). It was suggested that this is a substantial amount, so it seems clear that
some action is needed.
Regarding impact on software, it seems to change only one function. Text is available for G1015.
In the interest of reaching a solution rapidly, adoption of G1015 was favoured.
Decision: Adopt G1015 (appropriate text was provided in -v3 of the document and seems generally
acceptable).
The other significance map proposals are for further study in a CE (G308, G725, G448, G781).

Context selection

G917 proposes a simplification of context selection for high-frequency positions in order to increase
parallelism. This was asserted to enable up to 5 bins per cycle of parallelism.
Some skepticism was expressed about the need for this; since it was a late document, it was suggested to
defer its consideration to further study.

5.13.3 Last coefficient flags (x and y)

JCTVC-G201 Non-CE1: Codeword reordering for last_significant_coeff_x and
last_significant_coeff_y [T.-D. Chuang, C.-Y. Chen, Y.-W. Huang, S. Lei
(MediaTek)]

JCTVC-G761 Cross-check report for MediaTek's proposal (JCTVC-G201) [H. Sasai, T.
Nishi (Panasonic)] [late]

JCTVC-G239 Non-CE11: Modified method for coding the positions of last significant
coefficients in the CABAC mode [S.-T. Hsiang, S. Lei (MediaTek)]

JCTVC-G947 Crosscheck for Modified method for coding the positions of last significant
coefficients in the CABAC mode (JCTVC-G239) [G. Clare, F. Henry
(Orange FT)] [late]

JCTVC-G370 Binarization of last position for higher throughput [C. Auyeung (Sony)]

JCTVC-G900 Cross-check of binarisation of last position for higher throughput (G370) [V.
Drugeon, M. Narroschke (Panasonic)] [late]

JCTVC-G520 Non-CE11: Modified Context Derivation for last coefficient flag [H. Sasai, T.
Nishi (Panasonic)]

JCTVC-G942 Non-CE11: Cross-check of Modified Context Derivation for last coefficient
flag (JCTVC-G520) [T. Tsukuba, T. Ikai (Sharp)] [late]

JCTVC-G554 Grouping of bypass bins for last position coding of transform coefficients
[I.-K. Kim, V. Seregin, J. H. Park (Samsung)]

JCTVC-G704 Last position coding for CABAC [W.-J. Chien, J. Sole, M. Karczewicz
(Qualcomm)]

JCTVC-G873 Cross-check of JCTVC-G704 on last position coding for CABAC from
Qualcomm [Edouard François (Canon)] [late]

5.13.4 Coefficient level coding

JCTVC-G301 Test Results On Context simplification for coefficients entropy coding [X.
Che, W. Ding, Y. Shi (Beijing Univ. Tech.)]

JCTVC-G128 Cross-check of proposal on context simplification for coefficients entropy
coding G301 (F148) [V. Sze (TI)]

JCTVC-G989 Cross-check of JCTVC-G301 on the context simplification for coefficients
coding by Huawei [H. Yang (Huawei)] [late]

JCTVC-G448 Non-CE11: Context reduction for coding transform coefficients [S.-T.
Hsiang, S. Lei (MediaTek)]

JCTVC-G997 Non-CE11: Cross-check of JCTVC-G448 on context reduction for coding
transform coefficients [V. Seregin, J. Sole (Qualcomm)] [late]

JCTVC-G522 Non-CE11: Context reduction for coefficient level [K. Terada, H. Sasai, T.
Nishi (Panasonic)]

JCTVC-G126 Cross-check of Panasonic’s proposal on context reduction for coefficient
levels (G522) [V. Sze (TI)] [late]

JCTVC-G700 On coeff_abs_level_minus3 coding [J. Lou, L. Wang (Motorola Mobility)]


This proposes fixing the inconsistency also noted in G492. See notes for G492.
Additionally, a new updating rule was proposed as a regularity cleanup (and hardware implementation
simplification) for the current cRiceParam table.
If the table is implemented as combination logic, it has fewer cases to check.
The cross-checker commented that it would be beneficial to test this in the low QP range (e.g. using the QP
range previously used for transform CEs). Decision: Adopt this aspect.
The adoption was initially conditioned on the result of this low QP range test degrading luma BD rate by
less than 0.1% as a global average. A revision of G700 (-v9) was uploaded, which reported that the result
is actually an improvement on average (albeit small).
There is also a third element of the proposal, which was suggested to not need to be presented.

JCTVC-G854 Crosscheck – On coeff_abs_level_minus3 coding (G700) [T. Nguyen
(Fraunhofer HHI)] [late]

JCTVC-G783 Context number reduction for level coding in CABAC [Y. Piao, J. Min, J.H.
Park (Samsung)]

JCTVC-G125 Cross-check of proposal on context reduction for coefficient levels (G783) [V.
Sze (TI)] [late]

JCTVC-G718 Context reduction for CABAC [W.-J. Chien, J. Sole, M. Karczewicz
(Qualcomm)]
(First discussed in Track A.)
This proposal reduces the number of contexts in CABAC. The number is reduced by 56 contexts.
Experimental results reportedly show 0.00%, -0.01% and 0.12% BD-rate changes in high efficiency intra-
only, random access and low-delay test conditions respectively.
1st part relates to using same coding for P and B (see under G785)

2nd part relates to context number reduction (151 to 95)
Some relate to transform coefficient coding and CBF – how does it behave at low QP?
Some imbalance for the color components – clarify.
Very interesting, but major package of changes where it must be ensured that it is stable – investigation in
CE or before in BoG (if one is installed in context of transform coefficient coding in track B; aspects of
transform context modification should also be presented there).
The CBF and coefficient coding parts of the contribution were discussed in Track B. See the discussion in
the CBF section.

General

[include summary from V. Sze]


Regarding low QP testing, all but G522 have reportedly been tested for low QP. The contributor of G522
indicated that they have the results for low QP and will provide them. Cross checks for low QP may be
missing in some cases.
G121 was in CE11, and seems like a step in the right direction. It is generally compatible with the others.
Again, it seems clear that the benefit is substantial and some action will be needed.
Decision: Adopt G121.
Further study of the others in a CE.
It was suggested that the first part of G783 is simple, beneficial and non-conflicting with most others.
Decision: Adopted (the chroma-only part of G783 – pending availability of appropriate text – revisit that).

5.13.5 Scans

JCTVC-G226 Non-CE11: Extending MDCS to 16x16 and 32x32 TUs [C.-W. Hsu, X. Zhao,
X. Guo, Y.-W. Huang, S. Lei (MediaTek)]
Intended to improve coding efficiency by adding two more scans; measured impact was approximately
−0.1%. This gain seems insufficient.

JCTVC-G187 Non-CE11: Cross-verification of MediaTek's proposal (JCTVC-G226) by
JVC KENWOOD [S. Fukushima (JVC Kenwood)]

JCTVC-G285 Non-CE11: Methods for Solving the Parsing Issue of MDCS [X. Zhao, X.
Guo, C.-W. Hsu, T.-D. Chuang, Y.-W. Huang, S. Lei (MediaTek)]
Scans depend on intra mode; the suggestion is to remove that dependency. The loss (0.2%) is similar to
that of a CE proposal on the subject. The approach was asserted to be simpler than the CE proposal, which
was not adopted. However, the group did not think that the design had a significant problem that needed
to be solved.

JCTVC-G531 Cross verification of MediaTek’s proposed methods for solving the parsing
issue of MDCS (JCTVC-G285) [Y. Chiu, L. Xu (Intel)] [late]

JCTVC-G323 Non-CE11: Diagonal sub-block scan for HE residual coding [J. Sole, R.
Joshi, M. Karczewicz (Qualcomm)]
Relates to G644 and G958. Suggests to break 32x32 and 16x16 into sub-blocks for scanning of
significance maps and coefficients. Modifies context selection to avoid a neighbour bottleneck.
Suggests to revert part of the design to pre-Torino behaviour.
It was remarked that the proposal would negatively affect the maximum parallelism for the coding of the
significance map. Some other participants expressed skepticism about whether the maximum parallelism
limitation was a significant issue in actual practical use.
There seems to be a question over which issue is more important – the sub-block handling or the
significance map coding parallelism.
There are other contributions that relate to this (G644 and G958).
Another aspect, proposed in G320, involves interleaving of 5 classes of data at the 16x16 level, which is a
small change that is subordinate to the decision on the sub-block scanning.
Decision: Adopt G323 / G320 (text was provided during the meeting and considered acceptable).
Flags for sub-blocks of G644 are to be similarly interleaved.
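The sub-block idea can be sketched as a scan generator. This is an illustration only: the notes do not give the exact scan direction or whether the order is forward or reverse, so the diagonal direction used here, the use of the same diagonal pattern for both levels, and the function names are assumptions.

```python
def diag_scan(n):
    """Positions (x, y) of an n x n block, visited along anti-diagonals."""
    order = []
    for d in range(2 * n - 1):
        # Walk each anti-diagonal from its bottom-left end to its top-right end.
        for y in range(min(d, n - 1), -1, -1):
            x = d - y
            if x < n:
                order.append((x, y))
    return order


def sub_block_scan(n, sub=4):
    """Scan an n x n block sub-block by sub-block, diagonally at both levels."""
    order = []
    for bx, by in diag_scan(n // sub):   # order of the sub-blocks
        for x, y in diag_scan(sub):      # order inside one sub-block
            order.append((bx * sub + x, by * sub + y))
    return order
```

For a 16x16 block, the first 16 positions returned by sub_block_scan(16) all lie in the top-left 4x4 sub-block, so each group of 16 coefficients is confined to one sub-block instead of being spread along a long diagonal.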

JCTVC-G810 Crosscheck of JCTVC-G323 on sub-block scan for residual coding from
Qualcomm [C. Auyeung (Sony)] [late]

JCTVC-G957 Non-CE11: Cross-check result of Diagonal sub-block scan for HE residual
coding by Qualcomm (G323) [C. Rosewarne, M. Maeda] [late]

JCTVC-G1021 Non-CE11: Cross check result of JCTVC-G323 proposed by Qualcomm [J.
Kim] [late]

JCTVC-G491 Hybrid Horizontal/Vertical with Diagonal Mode Dependent Coefficient Scan
[K. Sharman, J. Gamei, N. Saunders, P. Silcock (Sony)]
Not necessary for presentation.

JCTVC-G943 Non-CE7: Crosscheck for Sony's MDCS in Proposal JCTVC-G491 [X.
Zhao, X. Guo (MediaTek)] [late]

JCTVC-G958 Non-CE11: Modified context selection for significant coefficient flags with
diagonal sub-block scan [C. Rosewarne, M. Maeda] [late]
Depends on G323.
In G323, there are two locations where the current flag depends on the previous one. G323 addresses this
by excluding the previous flag from the context selection. Here this is dealt with by making a guess about
the probable value of the missing neighbour flag.

The reported gain was considered too small to justify the added complication.

JCTVC-G976 Cross-check of Canon modified context selection for significant coefficient
flags with diagonal sub-block scan (JCTVC-G958) [J. Sole] [late]

5.13.6 NSQT (scans and contexts)


G517 is also related.

JCTVC-G123 Non-CE11: Simplified Coefficient Scans for NSQT [V. Sze (TI)]

JCTVC-G758 Non-CE11: Cross-check report for TI's proposal (JCTVC-G123) on
Simplified Coefficient Scans for NSQT [H. Sasai, T. Nishi (Panasonic)] [late]

JCTVC-G724 Non-CE11: Entropy coding for non-square TU blocks [Vivian Kung, Krit
Panusopone (Motorola Mobility)]

JCTVC-G875 Cross-check of Motorola's proposal (JCTVC-G724) - entropy coding for
non-square blocks [H. Yang, H. Yu (Huawei)] [late]

General

Three aspects in G724 and G750:


• Scan
• Last
• Significance map

Suggestion to harmonize G323 and G1015 – how to do last significant coefficient coding.
Side work to do this in BoG (J. Sole) and revisit.

5.13.7 Sign coding

JCTVC-G271 Sign Data Hiding [G. Clare (Orange Labs), F. Henry (Orange Labs)]
Concept previously proposed in JCTVC-A114 and in JCTVC-E428.

Proposes to "hide" one sign bit into the parity of the sum of the transform coefficient levels between the
first and last non-zero coefficient.
Average 0.6−0.7% benefit reported.
Question: What if RDOQ is disabled? The encoder would need to be somewhat more complex if it wants
to use this technique effectively.
Syntax to send four thresholds (one each for luma and chroma, inter and intra).
Question: Is this sensitive to TU depth?
Question: What is the decoder complexity impact? It was suggested not to be difficult to support.
Question: Are there any visual artifacts?
Further study was highly encouraged (CE).
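The decoder-side inference behind sign data hiding can be sketched as below. Several details are assumptions made for illustration, since the notes do not state them: that the hidden sign belongs to the first non-zero coefficient in scan order, that an even sum means a positive sign, and that hiding applies only when the distance between the first and last non-zero coefficients reaches a threshold (the value 4 is a placeholder, not one of the signalled thresholds). The function name is hypothetical.

```python
def infer_hidden_sign(abs_levels, threshold=4):
    """Infer one hidden sign from the parity of the sum of absolute levels.

    abs_levels: absolute coefficient levels in scan order.
    Returns +1 or -1, or None when no sign is hidden for this block.
    """
    nonzero = [i for i, a in enumerate(abs_levels) if a != 0]
    if not nonzero or nonzero[-1] - nonzero[0] < threshold:
        return None  # too few coefficients: the sign would be sent explicitly
    first, last = nonzero[0], nonzero[-1]
    parity = sum(abs_levels[first:last + 1]) & 1
    return 1 if parity == 0 else -1  # assumed convention: even means positive
```

The encoder has the harder job: when the parity does not match the sign it wants to convey, it has to change one level by 1, ideally where the rate-distortion cost is smallest. This is consistent with the remark above that the encoder needs to be somewhat more complex to use the technique effectively.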

JCTVC-G889 Crosscheck of G271 on Sign Data Hiding [Andrea Gabriellini, Marta Mrak
(BBC)] [late]

JCTVC-G372 Coding order of sign and level minus 3 with CABAC [C. Auyeung, T. Suzuki
(Sony)]
Suggests to send the sign flag after the level_minus3 instead of before it, for improved throughput in a
hardware implementation. This approach was suggested to require less memory and have lower latency,
at least in some implementations.
In hardware it was suggested that the proposed approach may be preferable to enable earlier availability
of some of the transform coefficient values from the decoding process.
Another participant suggested that in another implementation (software-based), the change would not be
beneficial – that since the (16) sign bits can be stored with less memory than the level values, it is
preferable to store the sign bits than to store the level values. Also, the level_minus3 would have many
more bins to decode than the sign.
Support for adoption was only expressed by the proponent, and further study was recommended to
determine whether there is a significant problem with the current approach.

JCTVC-G744 Cross-check report on Coding order of sign and level minus 3 with CABAC
(JCTVC-G372) [I.-K. Kim (Samsung)]

5.13.8 CBF
See also notes for JCTVC-G718.

JCTVC-G444 Proposed fix on cbf flag signaling [A. Minezawa, K. Sugimoto, S. Sekiguchi
(Mitsubishi)]

Proposes to modify condition for inferring cbf flag. Identifies one additional case where cbf for luma can
be inferred.
Tried multiple alternate configurations for min/max CU and TU size settings.

Applies to common conditions and other conditions as well.

Cross-checked in G760.
Details of proposal were not easily understood. Proponents were asked to discuss the proposal with WD
editors and report back.
Revisit
This was discussed again on Nov. 29. B. Bross confirmed the correctness of the proposal, which would
not worsen the quality of the text.
Decision: Adopt.

JCTVC-G760 Cross-check report for Mitsubishi's proposal (JCTVC-G444) on Proposed
fix on cbf flag signaling [H. Sasai, T. Nishi (Panasonic)] [late]

Discussion

G718 proposes to share the CBF contexts for the U and V components in order to reduce the number of
CABAC contexts (saving 5).
The result is a small (less than 1%) improvement in coding efficiency of V component.
Decision: Adopt this aspect of G718.

5.13.9 CAVLC
Not reviewed due to decision to remove CAVLC.

JCTVC-G537 Table reduction and Improvement of last position coding in CAVLC [C.
Kim, Y. Park, J.H. Park (Samsung)]

JCTVC-G838 Cross-check of CAVLC last position coding (JCTVC-G537) [T. Yamamoto
(Sharp)] [late]

JCTVC-G545 Improvement of level coding in CAVLC [C. Kim, Y. Park, J.H.
Park (Samsung)]

JCTVC-G801 Cross-check for Samsung's proposal (JCTVC-G545) [J. Xu (Sony)] [late]

JCTVC-G685 Selective Run-Level Coding for CAVLC [S.-H. Kim, A. Segall (Sharp)]

5.13.10 Other (not yet addressed)

JCTVC-G202 Non-CE2: Modified NSQT coefficient scan for CAVLC [C.-W. Hsu, Y.-W.
Huang, S. Lei (MediaTek)]

JCTVC-G747 Non-CE2: Crosscheck of MediaTek's proposal on NSQT coefficient scan for
CAVLC (JCTVC-G202) [X. Zheng (HiSilicon)] [late]

JCTVC-G750 Non-CE2: Harmonization of HE residual coding and NSQT [J. Sole, X.
Wang, M. Karczewicz (Qualcomm), J. Kim, B. Jeon (LGE)]

JCTVC-G596 Crosscheck report of Harmonization of HE residual coding and NSQT by
Qualcomm and LGE (JCTVC-G750) [J. Kim, M. Kim] [late]

General
G201, G370, G554, part of G520
Group together the bypass bins as done in some other places
Decision: Adopted.
Remark: MSBs first for last_significant_coeff_x, _y, please
Decision: Adopted.
G239, G520, G704
Change of binarization to reduce bin count, reduction of number of contexts, simplification
of context selection
G239 was reported to do both, but contrib does not report the average (max 5 CABAC
coded bins for something rather than 16, no LUT for CABAC bins, adds 2*4 contexts,
0.1% improvement in coding eff)
G704 similar (drops 2 contexts)
G520 same binarization as somewhere else in the standard, reduction in number of
contexts (loss of efficiency 0.06-0.08%, up to 0.23% loss if the number of contexts is
reduced – asserted to have no loss if number of contexts is not reduced, but that doesn’t
seem to be described).
Decision: Adopt G704 (with adjustments above and below).
Remark: Harmonize unary code convention – should use 1's followed by a 0. Decision:
Agreed.
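The agreed unary convention (1's followed by a 0) can be sketched as follows. The optional truncation at a maximum value, where the terminating 0 is dropped, is an assumption added for illustration and is not stated in the remark; the function name is hypothetical.

```python
def unary_bins(value, c_max=None):
    """Binarize a non-negative integer as `value` ones followed by a zero.

    If c_max is given, the code is truncated: the terminating zero is
    omitted when value == c_max, since the decoder can infer it.
    """
    if value < 0 or (c_max is not None and value > c_max):
        raise ValueError("value out of range")
    bins = [1] * value
    if c_max is None or value < c_max:
        bins.append(0)  # terminating zero
    return bins
```

For example, the value 3 becomes the bins 1, 1, 1, 0 without truncation, and 1, 1, 1 when 3 is the maximum value.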

5.14 Intra prediction and intra mode coding (ref CE6)

5.14.1 Intra prediction

5.14.1.1 Predicting chroma from other colour channels


Review requested by BoG on chroma intra prediction (A. Tabatabai).
JCTVC-G119 also discusses this topic; see the section on that contribution.

JCTVC-G173 Cross-channel intra chroma residual prediction [Y. Chiu, Y. Han, L. Xu, W.
Zhang, H. Jiang (Intel)]

JCTVC-G676 Non-CE6: Crosscheck for Intel's Intra Chroma Prediction in JCTVC-G173
[M. Guo, X. Guo (MediaTek)] [late]

JCTVC-G244 Luma-based chroma prediction – Model correction [C. Gisquet, E. François
(Canon)]

JCTVC-G646 Crosscheck of JCTVC-G244 – Luma-based chroma prediction - Model
correction [M. Budagavi (TI)] [late]

JCTVC-G270 Non-CE6a: Cross-checking of JCTVC-G244 – Luma-based chroma
prediction - Model correction [P. Bordes, P. Salmon (Technicolor)] [late]

JCTVC-G346 Chroma intra prediction based on residual luma samples [K. Kawamura, T.
Yoshino, H. Kato, S. Naito (KDDI)]

JCTVC-G069 Cross-verification of KDDI's proposal on intra coding (JCTVC-G346)
[Masaaki Matsumura, Shohei Matsuo, Seishi Takamura, Hirohisa Jozawa
(NTT)]

JCTVC-G911 Cross verification of KDDI’s Chroma intra prediction based on residual
luma samples (JCTVC-G346) [Y. Chiu, W. Zhang, L. Xu, Y. Han (Intel)]
[late]

JCTVC-G245 Non-CE6a: Use of chroma phase in LM mode [E. François, C. Gisquet, S.
Pautet (Canon)]

JCTVC-G511 Performance Evaluation of Luma-based Chroma Intra Prediction [K. Sato
(Sony)]

JCTVC-G1009 A joint contribution on the coding tools of residual prediction for intra
chroma prediction [Y. Chiu, Y. Han, L. Xu, W. Zhang, H. Jiang (Intel), K.
Kawamura, T. Yoshino, H. Kato, S. Naito (KDDI)] [late]

JCTVC-G1024 Report of combining two coding tools for Chroma intra prediction (G173
and G358) [Xingyu Zhang, Oscar Au, Xing Wen, Yi-Jen Chiu, Yu Han,
Lidong Xu, Wenhao Zhang] [late]

JCTVC-G419 Inconsistency of intra LM mode between HM and WD [J. Lee, S.-C. Lim, H.
Y. Kim, J. S. Choi (ETRI)]
Partly on padding, and partly on prediction of chroma from luma.

JCTVC-G886 Cross-verification of ETRI's Inconsistency of intra LM mode between HM
and WD (JCTVC-G419) [T. Lee, J. Chen, J. H. Park] [late]

JCTVC-G1037 Non-CE6: Crosscheck of JCTVC-G419 with alpha7bit bugfix [K. Sato
(Sony)] [late]

JCTVC-G358 New modes for chroma intra prediction [X. Zhang, O. C. Au, J. Dai, F. Zou,
C. Pang, X. Wen (HKUST)]

5.14.1.2 Other chroma intra prediction


Review requested by BoG on chroma intra prediction (A. Tabatabai).

JCTVC-G273 Crosscheck for JCTVC-G358 new modes for chroma intra prediction [J.
Dong (InterDigital)] [late]

JCTVC-G1030 Crosscheck for Canon's, Intel's, Mitsubishi's, NHK's integration of several
chroma coding tools in JCTVC-G955 [T.-D. Chuang, Y.-W. Huang
(MediaTek)] [late]

JCTVC-G955 Joint contribution on the integration of several chroma coding tools [Gisquet
Christophe (Canon), Chiu Yi-Jen (Intel), Minezawa Akira (Mitsubishi),
Ichigaya Atsuro (NHK)] [late]

JCTVC-G995 Cross-verification of joint contribution on the integration of several chroma
coding tools (JCTVC-G955) [A. Minezawa, K. Sugimoto, S. Sekiguchi
(Mitsubishi)] [late]

5.14.1.3 SDIP-related

JCTVC-G354 Non-CE6: Improvements for SDIP [J. Xu, E. Maani, A. Tabatabai (Sony)]
Discusses the new part of G558.
In this proposal, two modifications are proposed relative to SDIP for coding efficiency improvement.
First, MPMs in intra mode coding for non-square CUs are modified, achieving a BD-rate saving of −0.04%
for AI_HE and −0.1% for AI_LC. Second, contexts for the significance map of non-square TUs are
redefined, which provides an additional −0.1% for AI_HE. Combining both algorithms, there is a −0.14%
BD-rate saving for AI_HE.
Within the context of SDIP, this was agreed to be an improvement of the reference design.

JCTVC-G804 Crosscheck for Sony's JCTVC-G354 on Improvements for SDIP [C. Lai, L.
Liu, J. Zheng (HiSilicon)] [late]

JCTVC-G135 Non-CE6: Rectangular (2NxN and Nx2N) Intra Prediction [S. Liu,
X. Zhang, Z. Zhou, S. Lei (MediaTek)]
This contribution proposes to add two new intra prediction modes, i.e. 2NxN and Nx2N Intra prediction
to the existing 2Nx2N and NxN (only in SCU) intra prediction modes in current HM. Experimental
results report an average 1.6% BD rate reduction for AI HE, with encoder run-time increase 31%; or
average 1.52% BD rate reduction for AI HE, with an encoder run-time increase 27% (with HM 4 non-
SDIP branch as the anchor). Decoding run-time is increased by 3% on average. All results are generated
by current software implementation; further coding efficiency improvement and/or implementation
complexity reduction are reported to be expected with further investigation.
In intra, currently, "PU" is the level at which the prediction type is indicated, and "TU" is the level at
which the prediction type is operated.
Although the title of the proposal refers to 2NxN and Nx2N, at the TU level the sizes are 2Nx(N/2) and
2Nx(N/4) and the rotated equivalents. At PU 32x16, the TU is 32x8 and 32x2; at PU 16x8, the TU is
16x4; at PU 8x4, the TU is 8x2.

It was commented that harmonizing the set of segmentations between intra and inter is desirable. The
degree of alignment between this and SDIP and the inter NSQT and RQT was discussed.
The proposal did not include support at the 64x64 level. It was suggested that the reason that this was not
used was encoder search complexity, but that perhaps it would be best not to do something architecturally
different from the decoder perspective.
In the proposal, a subset of the intra prediction modes was searched, while the syntax is proposed to
support a full set of 35 modes.
It was remarked that the encoder search is a significant consideration in these comparisons.
The cross-checker remarked that there is substantial commonality between this proposal and the prior
SDIP scheme; for 8x2 this scheme searches more modes.
Interest was expressed in obtaining test results that included cases other than AI, and with NSQT in
particular.
The proponent acknowledged that such study was desirable.
Further study in a CE (as in prior CE6) was encouraged.
The AHG on block sizes and partitions was encouraged to consider issues of harmonization in both
directions between intra and inter designs.

JCTVC-G809 Crosscheck of JCTVC-G135 on rectangular (2NxN and Nx2N) intra
prediction from MediaTek [C. Auyeung (Sony)] [late]

JCTVC-G965 Cross-check report for MediaTek rectangular Intra prediction by Motorola
Mobility [J. Lou, L. Wang (Motorola Mobility)] [late]

JCTVC-G598 Non-CE6: Intra prediction based on weighted template matching predictors
(WTM) [T. Guionnet, L. Guillo (INRIA)]
This contribution presents an intra prediction method based on weighted template matching predictors
(WTM) and the results of its integration in HM 4.0. The resulting codec reportedly always improves the
intra prediction. Without class F, the average BD-rate gain is 0.5% for High Efficiency (HE) and 0.6%
for Low Complexity (LC) profiles. With class F, the average BD-rate gains are 1% for HE and 1.1% for
LC. BD-rate gains within class F alone are 3.1% and 3.4% for HE and LC, respectively. Apart from
class F, it was reported that class C provides the largest benefit.
Preliminary tests were run when SDIP was jointly activated with WTM in HM4.0. BD-rate gains were
reportedly largely cumulative.
Decoding time increase is reported as 4%, with 15–20% encoding time increase.
In this proposal, the shape of the template is signalled by syntax (which involves encoder search).
There were some prior template matching proposals (not at the last few meetings) – although it was
remarked that those were more complex.
There were some remarks that we seem to be entering a stage of work where more of a focus on
stabilization is needed, so it seems unlikely that such a significant modification would be included in the
design (in a first phase of standardization).

JCTVC-G604 Non-CE6: Cross-checking of JCTVC-G598: Intra prediction based on
weighted template matching predictors (WTM) [P. Bordes, P. Salmon
(Technicolor)] [late]

JCTVC-G754 Non-CE6: Line buffer reduction for CABAC context of SDIP syntax [L.
Guo, X. Wang, M. Karczewicz (Qualcomm)]
SDIP (short-distance-intra-prediction) introduces two extra syntax elements sdip_flag and sdip_direction.
The CABAC context modeling of these two syntax elements involves the corresponding syntax values
from above blocks, and thus introduces line buffer storage. This contribution describes modified CABAC
contexts for sdip_flag and sdip_direction that avoid upper block data and thus eliminate the line buffer
for these two syntax elements. The average BD-rate change is 0.0% and no encoding/decoding time
change is observed.
It was commented that coding the segmentation as mode-level information without adding other context
models is preferable to using the additional context models for sdip_flag and sdip_direction. If that
suggestion is pursued, this modification would not be necessary. This seems desirable to study.
However, without that suggestion incorporated – within the context of SDIP, this was agreed to be an
improvement of the reference design.

General

It was agreed that our primary SDIP reference design is G558 + G754.
Consider at plenary level whether SDIP should be in WD 5.

5.14.1.4 Intra prediction complexity reduction


BoG on Intra prediction complexity reduction and filtering (R. Joshi)

JCTVC-G145 Non-CE6: Reducing Line Buffers for intra mode [J. Lim, Y. Jeon, S. Park,
B. Jeon (LG)]

JCTVC-G474 Non-CE6b: Crosscheck for LG's intra mode line buffer reduction in
JCTVC-G145 [T.-D. Chuang, Y.-W. Huang (MediaTek)]

JCTVC-G447 Reduced number of intra 64x64 prediction mode [K. Sugimoto, A.
Minezawa, S. Sekiguchi (Mitsubishi)]

JCTVC-G193 Cross-verification result of JCTVC-G447 on reduced number of intra 64x64
prediction mode [K. Kazui (Fujitsu)]

JCTVC-G449 Reduced number of intra chroma prediction mode [K. Sugimoto, A.
Minezawa, S. Sekiguchi (Mitsubishi)]

JCTVC-G840 Cross-verification of Mitsubishi’s reduced number of intra chroma
prediction mode (JCTVC-G449) [T. Lee, J. Chen, J. H. Park (Samsung)]

JCTVC-G567 Simplified DC prediction [J. Lainema, K. Ugur (Nokia)]

JCTVC-G966 Non-CE6: Cross-verification of Nokia's proposal on Simplified DC
prediction (JCTVC-G567) [K. Sugimoto, A. Minezawa, S. Sekiguchi
(Mitsubishi)] [late]

5.14.1.5 Intra prediction filtering

JCTVC-G139 Non-CE6.d: Intra Prediction With Selective Secondary Boundary [G. Van
der Auwera, M. Karczewicz (Qualcomm)]

JCTVC-G960 Cross-check for JCTVC-G139 (Non-CE6.d: Intra Prediction With Selective
Secondary Boundary) [J. Zhao, A. Segall (Sharp)] [late]

JCTVC-G373 Non-CE6: Performance of secondary boundary DC intra prediction [C.
Auyeung (Sony)]

JCTVC-G861 Crosscheck results of JCTVC-G373 on the performance of secondary
boundary DC intra prediction from Sony [X. Zhang, S. Liu (MediaTek)]
[late]

JCTVC-G443 Improved directional intra prediction smoothing [A. Minezawa, K.
Sugimoto, S. Sekiguchi (Mitsubishi)]

JCTVC-G896 Cross verification of improved directional intra prediction smoothing
(JCTVC-G443) [J. Lainema, K. Ugur (Nokia)] [late]

5.14.1.6 Angular intra prediction


Also see notes for G119.

JCTVC-G350 Modification of angular intra prediction [S. Matsuo, S. Takamura, Hirohisa
Jozawa (NTT)]
This document introduces an angular intra prediction method exploiting an interpolation technique whose
tap length is greater than 2. In the proposed method, reference samples for angular intra prediction are
generated by either 4-tap DCT-IF or conventional 2-tap filter. The proposed intra prediction is applied to
chrominance components as well as luminance component. Compared to the HM4.0 anchor, in the HE
configuration, the overall average coding gains of Y, Cb and Cr were about 0.4%, 0.5% and 0.5%,
respectively. Regarding LC configuration, those of Y, Cb and Cr were about 0.5%, 0.6% and 0.6%,
respectively. The maximum coding gains of Y, Cb and Cr were about 1.9%, 1.8% and 2.2%, respectively,
for the sequence “BasketballDrill” in LC case. The encoding and decoding run-times of the proposal were
about the same as the anchor on average, respectively. Additional experimental results reportedly showed
that the proposal provided about 0.1 to 0.4% coding gain on average for inter-frame coding.
Proposed for small blocks because not as beneficial for large blocks.
Different filters for each position – for example, 16 filters of 4 taps each – with symmetry applied to
produce 31 filters.
Due to concerns of complexity and design stability, interest was not expressed by non-proponents.

JCTVC-G348 Cross-check report of NTT's intra prediction approach (JCTVC-G350) [T.
Yoshino, S. Naito (KDDI)]

JCTVC-G374 Improving the Intra Prediction Based on a Uniform Probability Model [L.
Liu (HiSilicon and Huawei)]
The prediction values of intra prediction are calculated through the predefined angTable table in the HM.
A performance decrease is observed when there are varied diagonal textures. This contribution presents a
proposed additional table based on a uniform probability model. It is proposed to select either the original
or the newly-proposed angTable value depending on the left and above intra prediction modes.
No benefit was shown for most sequences, but 0.9% benefit was shown for AI HE and 0.8% for AI LC for
BasketballDrill.
The proponent indicated that this method is just one potential solution for this issue, and other approaches
might be possible. The purpose of the proposal was essentially to provide information to point out that the
current table does not seem to fit the characteristics of all video sequences – perhaps because of the
particular effective angles of the current table.

JCTVC-G762 Cross-check of JCTVC-G374 on improving Intra prediction [H. L. Tan, C.
Yeo (I2R)]

JCTVC-G481 Cross verification of Huawei’s intra prediction improvement based on a
uniform probability model (JCTVC-G374) by Intel [Y. Chiu, L. Xu (Intel)]
[late]

JCTVC-G738 On angular intra prediction main array extension [M. Coban (Qualcomm)]
This contribution presents a proposed simplification to the subsampling process of the main array
extension scheme by using reduced-precision slope computation. The current design uses 12-bit precision
slope tables to compute the subsampling positions given a block size and the prediction angle. Using 8-bit
precision slope tables has negligible impact on coding efficiency. Simulation results reportedly show
0.00% and 0.00% BD rate change for the AI HE and AI LC cases, respectively.
It was remarked that the current tables are essentially an editorial convenience to provide an equivalent of
a true division operation. The specification could just describe the process as a division instead of
providing the tables. It was also remarked that, in some implementations, the operation would not be done
as written and so it seems questionable whether the proposal provides any actual benefit to
implementations. The proposed modification would make the operation become no longer equivalent to a
true division. No action was taken. The editors are requested to study the subject and determine the best
approach to specifying this part of the design.
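
The remark that the tables are an equivalent of a true division can be illustrated with a minimal sketch. The scale factors, the example slope values and all function names below are illustrative assumptions and are not taken from the HM or from JCTVC-G738; the sketch only shows how a rounded reciprocal table with a rounding right shift tracks a rounded division at a given table precision.

```python
# Hypothetical illustration: a reciprocal ("slope") table standing in for a division.
ANGLES = [2, 5, 9, 13, 17, 21, 26, 32]  # assumed example slopes in 1/32-sample units


def inv_table(precision_bits):
    """Rounded reciprocals: round(32 * 2**precision_bits / angle) for each slope."""
    scale = 32 << precision_bits
    return {a: (scale + a // 2) // a for a in ANGLES}


def position_by_table(k, angle, table, precision_bits):
    """Approximate round(k * 32 / angle) with a multiply and a rounding right shift."""
    return (k * table[angle] + (1 << (precision_bits - 1))) >> precision_bits


def position_by_division(k, angle):
    """Reference value: round(k * 32 / angle) computed with an integer division."""
    return (2 * k * 32 + angle) // (2 * angle)


def max_error(precision_bits, max_k=32):
    """Largest absolute difference between the table result and the division."""
    table = inv_table(precision_bits)
    return max(
        abs(position_by_table(k, a, table, precision_bits) - position_by_division(k, a))
        for a in ANGLES
        for k in range(1, max_k + 1)
    )
```

Calling `max_error(12)` and `max_error(8)` compares the two table precisions discussed above under these assumed values; the result says nothing about coding efficiency, only about how closely the table follows the division.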

JCTVC-G882 Cross-verification report for 8-bit precision slope computation on angular
prediction array extension (JCTVC-G738) [H. Aoki, K. Chono (NEC)] [late]

5.14.1.7 Padding

JCTVC-G791 Non-CE6: Simplified reference samples padding for intra prediction [T. Lee,
J. Chen, J. H. Park (Samsung)]
This contribution removes the two-sided checking of unavailable pixel ranges in the reference sample padding of
HM4.0 to reduce complexity. An unavailable pixel range is padded with the nearest available pixel
in one fixed direction, instead of adapting the direction or averaging the nearest available pixels. It is
reported that no BD-rate loss is observed for all configurations (AI, RA, LB in HE, LC) with the 1500-byte
slice mode setting and/or the constrained intra prediction setting.
Two methods were tested. An additional method was added in a later revision.
The quality of the affected part of the software was discussed.
It was asked whether any testing was done for increased intra refresh; this had not been tested.
Testing was done with and without constrained intra prediction.
It was commented that Method 2 is appealing (simpler than Method 1, and both have no loss).
See notes below regarding G812.

JCTVC-G120 Cross-verification of Samsung's proposal on padding process (JCTVC-G791)
[Y. Lin, C. Lai (HiSilicon)] [late]

JCTVC-G812 AHG16: Padding process simplification [X. Wang, W. J. Chien, M.
Karczewicz (Qualcomm)]
This contribution provides a single-pass padding scheme for generating reference samples used for intra
prediction. With the proposed scheme, memory access is more regular and consistent than the current one
in HM4.0. Simulation results reportedly show that with constrained intra prediction enabled and 1500
bytes/slice, its impact on coding performance with HE configurations is on average 0.00% for all intra,
0.04% for random access and 0.01% for low delay. For LC configurations, it is on average 0.00% for all
intra, 0.06% for random access and 0.01% for low delay.
Average 0.07 benefit and chroma PSNR improvement reported in AI.
It was asked whether there might be any visual artifacts with this method or G791.
Suggestion: Choose based on max per-sequence degradation among all configs and 1500 byte slices
between G812 Scheme A and G791 Method 2.
0.4% loss on CIP for one sequence in class E with 1500 bytes for G791 Method 2.
Decision: Adopt G812 Scheme A.
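
The general idea of padding unavailable reference samples from the nearest available sample in one fixed scan direction can be sketched as follows. This is a minimal illustration under stated assumptions, not the exact procedure of G812 Scheme A or G791 Method 2: the function name, the use of `None` to mark unavailable samples and the mid-grey fallback value are all hypothetical.

```python
def pad_reference_samples(samples, default=128):
    """Fixed-direction padding of unavailable reference samples.

    `samples` is the reference line in scan order; None marks an unavailable
    sample. `default` is an assumed fallback (mid-grey for 8-bit video) used
    only when no sample at all is available.
    """
    out = list(samples)
    first = next((v for v in out if v is not None), None)
    if first is None:
        return [default] * len(out)
    prev = first  # seeds any unavailable samples at the start of the scan
    for i, v in enumerate(out):
        if v is None:
            out[i] = prev  # copy the most recent available sample in scan order
        else:
            prev = v
    return out
```

For example, `pad_reference_samples([None, None, 10, None, 20, None])` fills every gap from the preceding available sample and the leading gap from the first available one, with no averaging and no direction switching.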

JCTVC-G916 AHG16: Crosscheck of JCTVC-G812 on Padding process simplification [X.
Li, X. Guo (MediaTek)] [late]

JCTVC-G572 AHG16: Reference sample padding harmonization for intra DC mode [V.
Wahadaniah, C. S. Lim (Panasonic)]
This contribution reports the HM-4.0 simulation results of a revised reference sample padding scheme for
intra DC mode initially presented in JCTVC-F414. The revised padding scheme is reported to give
average BD-rate gains of 0.1% for intra-only setting with 1500-byte slices and constrained intra
prediction enabled. It is reported that encoding and decoding runtimes are not affected by the revised
padding scheme. It was proposed that JCT-VC consider the trade-off between design simplicity and
conceptual accuracy of intra DC prediction and subsequently decide whether a revised padding scheme
for intra DC prediction is desirable.
It was remarked that although this proposal may provide some (small) benefit, the current scheme is a
"cleaner" design that avoids undesirable interactions. No action.

JCTVC-G897 AHG16: Cross-verification and additional results of reference sample
padding harmonization for intra DC mode (JCTVC-G572) [K. Chono, H.
Aoki (NEC)] [late]

5.14.2 Intra mode coding [done – see G1017]


Also see JCTVC-G1017 BoG report (V. Sze).
[include summary table from BoG report]

JCTVC-G109 On Intra Mode Mapping [X. Zhang, S. Liu, S. Lei (MediaTek)]

JCTVC-G148 Non-CE6: Cross-check of MediaTek’s MPM mapping by LG
(JCTVC-G109) [J. Lim, S. Park, B. Jeon (LG)]

JCTVC-G119 Modifications to Intra-frame coding [Y. Lin, H. Yang, L. Liu, J. Zheng, H.
Yu (Huawei)]
The first part of this is related to predicting chroma using luma, to be reviewed in the BoG on chroma intra
prediction.
A second part is about how to code intra modes. This aspect was reviewed in the intra mode coding BoG.
A third aspect is to remove one mode and modify the prediction formed in another mode.

JCTVC-G117 Non-CE6: Cross-verification of Huawei's Modifications to intra frame
coding (JCTVC-G119) [K. Chono, H. Aoki (NEC)]

JCTVC-G144 Non-CE6: on MPM mapping [J. Lim, S. Park, B. Jeon (LG)]

JCTVC-G473 CE6b: Crosscheck for LG's MPM mapping modification in JCTVC-G144
[T.-D. Chuang, Y.-W. Huang (MediaTek)]

JCTVC-G153 Non-CE6: On intra prediction mode coding [C. Yeo, H. L. Tan, Y. H. Tan,
Z. Li (I2R)]

JCTVC-G106 Cross-Check for G-153: On Intra Mode Coding [A. Saxena, F. Fernandes
(Samsung)]

JCTVC-G254 Non-CE6b: Cross-check of JCTVC-G153 on MPM index bypass coding
from I2R [E. François (Canon)]

JCTVC-G184 Non-CE6: Unified neighboring positions for intra mode coding [S.
Fukushima, H. Nakamura (JVC Kenwood)]

JCTVC-G475 Non-CE6b: Crosscheck for JVC KENWOOD's unified neighboring
positions for intra mode coding in JCTVC-G184 [T.-D. Chuang, Y.-W.
Huang (MediaTek)]

JCTVC-G185 Non-CE6: On intra chroma mode coding [T. Kumakura, S. Fukushima
(JVC Kenwood)]

JCTVC-G784 Crosscheck on JCTVC-G185 Non-CE6: On intra chroma mode coding
[W.-J. Chien]

JCTVC-G488 Non-CE6a : Cross check on use of chroma phase in LM mode
(JCTVC-G245) [T. Guionnet, L. Guillo (INRIA)]

JCTVC-G756 Non-CE6: Crosscheck of Canon's improvement on LM mode
(JCTVC-G245) [X. Zheng] [late]

JCTVC-G359 Non-CE6: Coding of luma intra prediction modes that are not in the MPM
set [R. Cohen, X. Xu, A. Vetro, H. Sun (MERL)]

JCTVC-G456 Non-CE6: Cross check report of MERL’s intra prediction mode coding
(G359) [Atsuro Ichigaya (NHK)]

JCTVC-G377 Non-CE6: Cross-check report of proposal on coding of luma intra prediction
modes that are not in the MPM set (JCTVC-G359) [Y. Shibahara, T. Nishi
(Panasonic)]

JCTVC-G418 Simplification of intra prediction mode mapping method [J. Lee, S.-C. Lim,
H. Y. Kim, J. S. Choi (ETRI)]

JCTVC-G870 Cross-check report of JCTVC-G418 [K Sato (Sony)] [late]

JCTVC-G423 Non-CE6: Remove potential duplicate modes from the candidate mode list
for chroma intra prediction [H. Yang, J. Zhou, H. Yu (Huawei)]

JCTVC-G408 Non-CE6: Cross-check of Remove potential duplicate modes from the
candidate mode list for chroma intra prediction (JCTVC-G423) [B. Li
(USTC), J. Xu (Microsoft)]

JCTVC-G707 Using CABAC bypass mode for coding intra prediction mode [K. Misra, A.
Segall (Sharp)]

JCTVC-G767 Non-CE1: Bypass coding of Intra prediction modes in CABAC [T. Lee, J.
Chen, J. H. Park (Samsung)]

JCTVC-G871 Non-CE6: Crosscheck of Qualcomm's buffer reduction for CABAC context
of SDIP syntax (JCTVC-G754) by HKUST [Xing Wen, Oscar Au, Xingyu
Zhang] [late]

5.15 Transforms

5.15.1 Core transform implementation (ref CE10)

JCTVC-G272 Non-CE10: Core Transform Design for HEVC [J. Dong, Y. Ye
(InterDigital)]
This contribution presents a core transform design for HEVC, including 4×4, 8×8, 16×16 and 32×32
forward/inverse transforms. The proposed design has the following properties:
• 16 bit data representation before and after each transform stage (which the proponent contrasted
with G737)
• Transforms can be implemented by full factorization, partial butterfly, or matrix multiplication (a
property shared by G737 but not G495)
• 4×4 and 8×8 transforms are orthogonal (unlike in some other proposals such as G737 and G495),
and 16×16 and 32×32 transforms are asserted to be nearly orthogonal
• N×N transform matrix is reused as the even part of the 2N×2N transform matrix (a property
shared by the other major candidates)
• The norms of basis vectors are not equal, but are asserted to be sufficiently similar to avoid
frequency-specific scaling.
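
The "even part" reuse property in the list above can be checked numerically with a minimal sketch. It uses a plain integer-rounded DCT-II with an assumed gain of 64 rather than the coefficient values of G272, G495 or G737, and the function names are hypothetical; it only shows that the even-indexed rows of a 2N×2N DCT-type matrix, restricted to the first N columns, reproduce the N×N matrix.

```python
import math


def dct_matrix(n, scale=64):
    """Integer-rounded DCT-II basis (rows are basis vectors); `scale` is an assumed gain."""
    rows = []
    for k in range(n):
        gain = scale if k == 0 else scale * math.sqrt(2.0)
        rows.append([
            round(gain * math.cos((2 * j + 1) * k * math.pi / (2 * n)))
            for j in range(n)
        ])
    return rows


def even_part(matrix):
    """Even-indexed rows of a 2N x 2N matrix, restricted to the first N columns."""
    n = len(matrix) // 2
    return [row[:n] for row in matrix[0::2]]
```

With these definitions, `even_part(dct_matrix(8)) == dct_matrix(4)` holds, and likewise for the 16/8 and 32/16 pairs, which is what allows a larger transform to reuse the arithmetic of the next smaller one.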
This design was integrated into HM4.0, using both full factorization and partial butterfly. It is reported that
full factorization for the 32×32 transform achieves a 6X reduction in the number of multiplications, compared
with partial butterfly. The R-D performance was tested under the common test conditions specified in
JCTVC-F900. The average BD-rate increase is 0.1% to 0.2% for all intra coding, 0.1% for random access
coding, and 0.0% to 0.1% for low delay coding.
The number of shifts was not presented, and this would be important to know.
It was asked whether there are test results for such scenarios as low-QP and high-QP – such information
was not yet available.
A small loss was observed in coding efficiency.

JCTVC-G386 Non-CE10: Cross Check Report for JCTVC-G272 Core Transform Design
for HEVC. [X. Zhang, O. C. Au, X. Wen (HKUST)] [late]
Revisit after survey of inputs.

JCTVC-G333 Accuracy improvement of HM's transform bases [Y. Sugito, A. Ichigaya, S.
Sakaida (NHK)]
This contribution proposes 16x16 and 32x32 transform bases based on the transform proposed by Cisco
and TI (G495). The bases are asserted to have near orthonormality. To derive transform bases, two
options are used. Option 1 is the same as JCTVC-F193. In Option 2, the value range of the
elements is wider than in Option 1, and the waveform of each basis is taken into consideration. The proposed bases are
reported to have been evaluated for coding efficiency in the normal QP range and the low QP
range of CE10. The results for the common conditions QP range (QP=22, 27, 32, 37) reportedly show around 0%
average bit rate impact for luma. Almost all of the results of sequences for the low QP range (QP=1, 5, 9,
13) reportedly show gains for luma; and the peak gain is about 1.3% without any reported process time
increase.
Does not have full-factorization implementation approach (roughly like G495 in concept).
(Higher precision; probably increased operations or wordlength or less flexible implementation
opportunities)
Revisit after survey of inputs.

JCTVC-G625 Cross-check of JCTVC-G333, "Accuracy improvement of HM's transform
bases" [A. Fuldseth (Cisco)] [late]

JCTVC-G496 Core transform design for HEVC with 7 bit coefficients [A. Fuldseth, G.
Bjøntegaard (Cisco), M. Budagavi, V. Sze (TI)]
This contribution proposes a set of 7 bit transform matrices for HEVC, covering all transform sizes from
4x4 to 32x32. The proposed transform matrices were asserted to have the same properties as the 8 bit
transform matrices currently used in the HM transforms. The transform matrices and the associated
transform operations described in this contribution are proposed for the core transform design in HEVC.
The proposed transform design was reported to have the following properties: 16 bit data representation
before and after each transform stage (independent of the internal bit depth), 16 bit multipliers for all
internal multiplications, no need for correction of different norms of basis vectors during
quantization/dequantization, all transform sizes above 4x4 can reuse arithmetic operations for smaller
transform sizes, and implementations using either pure matrix multiplication or a combination of matrix
multiplication and butterfly structures are reportedly possible. BD-rate results vary between −0.1% and
0.2% for the average across all sequences and all classes for the low (1, 5, 9, 13), “normal”
(22, 27, 32, 37) and high (36, 42, 47, 51) QP ranges. The 7 bit transform matrices reportedly offer
between 9% and 23% reduction in hardware costs when compared to the 8 bit versions currently used in
HM.
The contribution indicated that this should be considered an information contribution, pending conclusion
on CE10 work.
A participant commented that in addition to the average results, the worst case is interesting.
Another participant indicated that the RDO does not actually function properly for low QP. It would be
highly desirable to fix these issues. However, others commented that although the RDO is flawed in this
range, the results seem to tend to be consistent with expectations based on knowledge of transform
tradeoff characteristics.

JCTVC-G335 Cross-check on 7 bit transform matrices (JCTVC-G496) [Y. Sugito, A.
Ichigaya (NHK)]

JCTVC-G628 Comparison of core transform proposals [A. Fuldseth, G. Bjøntegaard
(Cisco), M. Budagavi (TI)]
Notes elsewhere.

JCTVC-G782 AhG7: Overflow Prevention in HEVC inverse transform [E. Alshina, A.
Alshin, J.H. Park (Samsung)]
This contribution proposes a modification of the HEVC inverse transform framework which reportedly
prevents overflow of 16 bits in 3 temporary buffers by using no more than 2 clipping operations.
The proposed modification guarantees that all temporary buffers stay within 16 bits even for random input to the inverse
transform (non-conforming bitstreams). Performance tests show no change under common test conditions
due to the proposed modification.
Additionally, 32 bit overflow in intermediate calculations for the JCTVC-F251 core transform is studied.
The same solution with no more than 2 clipping operations per pixel guarantees no 32 bit overflow in
registers for JCTVC-F251. The performance of JCTVC-F251 does not change due to the proposed modification.
Additional test results for 12 bit internal bit depth show a noise-level change of BD rate (0.00% on
average) due to the proposed change for both the HM4.0 and JCTVC-F251 inverse transforms.
It was proposed to specify to clip after the dequant and after the first stage of the inverse transform. The
described clipping range might not be a power of two.
It was commented that, especially in software, a non-power-of-two clipping would be more complex than
a power-of-two clipping range.
It was commented that with some transform designs, there would be cases where the input data cannot be
represented through the transform with some clipping ranges. For example, DC values above 140 would
become distorted when using the scheme proposed for G737 (and other cases exist).
The proponent indicated that the G737 transform coefficients for the FF variation could be down-shifted
by 1 bit to prevent this. It was asserted that this would not substantially degrade the performance of the
modified transform and would reduce gate count.
Decision: It was agreed to put signed 16 b range clipping as a “normative” result requirement at the
output of the first inverse transform.
It was remarked that column-inverse-first is likely to work better than row-inverse-first.

JCTVC-G719 Structured Level Limits [L. Kerofsky, A. Segall, K. Misra (Sharp), E.
Alshina, A. Alshin (Samsung)]
This contribution proposes limits on the quantization levels. These limits vary with QP and
transform size. It is asserted that these limits ensure that the dequantized coefficients do not exceed a signed 16 bit
dynamic range. A formula defines the limits from 6 basic values using the QP and transform size
parameters used in dequantization. The limits are proposed for use in clipping decoded levels to the range
specified by the limits and removing the clipping following dequantization. It is additionally proposed to
place a limit on the maximum absolute value of level independent of QP and block size.
It appeared that the proposed pre-dequant clipping would not function properly when quantization
weighting matrices are used.
Decision: It was agreed that we should normatively forbid level values outside of some range. The precise
range should be 16 b (for now).
Note: Does the QP range in HEVC change with bit depth as in AVC? Should it? This should be studied.

Question: Why impose as a decoder operation rather than requiring the encoder to obey a constraint?
It was suggested that clipping in the decoder is more robust; probably decoders would do this anyway, in
order to cope with badly-designed encoders that don’t obey the specified limits.
Decision: We should specify clipping of the output of the dequant (i.e., the input to the inverse transform)
to a signed 16 b range.
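
The two clipping points agreed for G782 and G719 (after dequantization and after the first inverse transform stage) can be sketched as follows. This is a minimal illustration under stated assumptions: the dequantizer form, the shift values and the caller-supplied 1-D inverse transform are hypothetical placeholders, not the HM implementation or the draft text.

```python
INT16_MIN, INT16_MAX = -(1 << 15), (1 << 15) - 1


def clip16(x):
    """Clip an integer to the signed 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, x))


def dequantize(levels, scale, shift):
    """Assumed uniform dequantizer with a rounding shift, followed by the 16-bit clip."""
    offset = 1 << (shift - 1)
    return [[clip16((lv * scale + offset) >> shift) for lv in row] for row in levels]


def first_stage(block, inverse_1d, shift):
    """Apply a caller-supplied 1-D inverse transform to each column, then clip to 16 bits."""
    offset = 1 << (shift - 1)
    cols = [inverse_1d(list(col)) for col in zip(*block)]  # column-first processing
    return [[clip16((v + offset) >> shift) for v in row] for row in zip(*cols)]
```

The column-first order in `first_stage` follows the remark recorded for G782; the second stage would then operate on rows whose inputs are already guaranteed to fit in 16 bits.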

JCTVC-G856 AHG7: IDCT output range after T+Q+IQ+IT with valid residual inputs [M.
Zhou (TI)]
Describes IDCT output range with “reasonable” quantization – 5 bits expansion for valid input.

JCTVC-G497 SIMD optimization of proposed HEVC core transforms [A. Fuldseth, L. P.
Endresen, S. Selnes (Cisco), V. Arbatov, F. Franchetti (SpiralGen), M.
Puschel (ETH Zurich)]
Effectively covered in CE 10 report.

JCTVC-G811 Crosscheck of JCTVC-G497 on SIMD optimization of core transforms [C.
Auyeung (Sony)] [late]

JCTVC-G601 Cross verification of SIMD-optimized partial butterfly HM (G495)
transforms [D. Flynn (BBC)] [late] [upload 2011-11-24 10:40:36]
Reports some SIMD analysis on Intel SSE2 (reportedly 2nd most common architecture), saying that PB
approach was significantly faster for SIMD than other implementations and is identical for G495 and
G737. The FF approach was reported as having characteristics that were unfriendly in several ways (e.g.
ordering/shuffling and dependency chain effects preventing effective use of pipelining).

JCTVC-G132 Hardware analysis of transform and quantization [M. Budagavi (TI), V. Sze
(TI), M. Sadafale]
Effectively covered in CE 10 report. Describes concept behind assertion of sharing of all multipliers and
some adders between encoder and decoder, involving transform matrix symmetry.
G495 has this property, and in G628 it was asserted that G737 does not.
It was asserted by the proponent of G737 that a (somewhat different) form of sharing of multipliers was
possible with the FF form of that transform.
Describes two hardware implementations based on hard-wired and SIMD architectures (e.g. useful for
multi-standard processors) and asserts that the G495 design is better for this.
The specifics of these implementations were not released for the hand-optimized designs of either G495
or G737.

JCTVC-G263 Hardware complexity analysis of spatial transform [X. Tian (Altera)]
Effectively covered in CE 10 report.

JCTVC-G265 Core Transform Property for Practical Throughput Hardware Design [M.
Tikekar, C.-T. Huang, C. Juvekar, A. Chandrakasan (MIT)]
Summarized as follows:
• Implemented all transform sizes (and DST and non-square and control logic) for G495
• Emphasizes number of unique coefficients, for which a substantial decrease (25%) of complexity
for G495 implementation was achievable by taking advantage of this property. (The proponent of
G737 asserted that this could apply also to that proposal.)
• Throughput of 2 samples per cycle at 200 MHz (which would be enough for 30 fps 4kx2k). It was
remarked that the CE10 plan requested input for a range of frequencies 150-500 MHz with up to
32 samples.
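
The throughput remark in the list above can be checked with a short worked calculation. The picture size (4096×2160 luma samples) and the 4:2:0 chroma format (1.5 samples per pixel) are assumptions made for illustration; the notes themselves only say "4kx2k".

```python
def required_samples_per_second(width, height, fps, samples_per_pixel=1.5):
    """Samples to be transformed per second; 1.5 samples/pixel assumes 4:2:0 chroma."""
    return width * height * fps * samples_per_pixel


def available_samples_per_second(samples_per_cycle, clock_hz):
    """Peak transform throughput of the hardware."""
    return samples_per_cycle * clock_hz


need = required_samples_per_second(4096, 2160, 30)   # 398,131,200 samples/s
have = available_samples_per_second(2, 200_000_000)  # 400,000,000 samples/s
```

Under these assumptions the requirement sits just below the stated capacity, which is consistent with the claim; a 3840×2160 picture would leave slightly more margin.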

JCTVC-G857 SIMD Analysis of Some Core Spatial Transforms [S. Riabtsev (CSR)] [late]
[uploaded 2011-11-24 15:41:16]
Presenter not available.
Some of the results in this contribution had been made available in the CE 10 report before the contribution itself became available.
Reports that both G737 and G495 can be implemented in real time and comments that the FF form of
G737 is the preferred form for hardware.
Discussion of this was requested.

JCTVC-G990 Cross check of inverse transform complexity reported in JCTVC-G757 [A.
Fuldseth (Cisco)] [late]
Effectively covered in CE 10 report.

JCTVC-G865 New Results for Guaranteeing 16-bit Transform Dynamic Range [K. Misra,
L. Kerofsky, A. Segall (Sharp)] [late]
Requests action already taken as recorded above for G782 and G719 in regard to decoder clipping after
dequant and after the first stage inverse transform (16 b clip in each place).

JCTVC-G977 Non-CE10: new 8-bits core spatial transform with fast algorithm [Hongbo
Zhu] [late]
Not yet presented (and late) – presenter unavailable when presentation requested. May be studied as
information.

Conclusions

Since G495 and G737 are the only candidate technologies that have been well-studied, other new
proposals would not be appropriate if we wish to make a decision at this meeting.
Both G495 and G737 seem like essentially good candidate designs.
Some submitted non-proponent analysis may indicate some preference for G737 in hardware (although
another non-proponent indicated that their internal analysis had led to a different conclusion). No
significant difference was observed between the two for software SIMD architecture.
G737 seems to need some modification as a bug fix, which would involve some loss of precision.
Given that scenario and the desire to try to close the topic and move on, the consensus was to select
G495.
Decision: Adopted G495.

5.15.2 Alternative transforms


[include table from CE report]

JCTVC-G281 Non-CE7: Boundary Dependent Transform for Inter Predicted Residue [J.
An, X. Zhao, X. Guo, S. Lei (MediaTek)]
Alternative transform for inter, with 0.2% benefit for RA and 0.5% for LD. For further study (along with
G632).

JCTVC-G631 Cross-Check of G281: Boundary Dependent Transform for Inter Predicted
Residue [A. Saxena, F. Fernandes (Samsung)] [late]

JCTVC-G632 Non CE 7: On secondary transforms for inter prediction residual [A.
Saxena, F. Fernandes (Samsung)]
Supplements G281 concept, only tested with length 4 and 8, with gain of 0.4% relative to HM for LD LC.
For further study along with G281.

JCTVC-G282 Non-CE7: Mode Dependent DCT/DST for Chroma [X. Zhao, X. Guo, M.
Guo, S. Lei (MediaTek), S. Ma, W. Gao (PKU)]
Application of DST to chroma, with 0.5% benefit for U and V components. G107 tested several cases,
and this is one of them.

JCTVC-G879 Cross-Check of G282 [A. Saxena, F. Fernandes] [late]

JCTVC-G345 Non-CE7: Restricted mode-dependent 8x8 DST for Intra prediction [A.
Ichigaya, Y. Sugito, S. Sakaida, (NHK)]
Introduces MDST 8x8 with restriction of combination of PU and TU size. Overall benefit (relative to HM
anchor) is small (0.2%), but Class A showed more gain (0.6%).

JCTVC-G630 Cross Check of G345: Restricted Mode-Dependent 8x8 DST for Intra
prediction [A. Saxena, F. Fernandes (Samsung)] [late]

JCTVC-G817 Non-CE7: Cross verification of JCTVC-G345, Restricted mode-dependent
8x8 DST for Intra prediction [R. Cohen (MERL)]

JCTVC-G591 Non-CE 7: Supplementary Results for the Rotational Transform [Zhan Ma,
Felix Fernandes, Elena Alshina, Alexander Alshin (Samsung)]
Describes a fast search for use of the syntax-signalled ROT, providing a 0.5% rate benefit at a 10% encoder time
increase. (Without the fast search, the rate benefit is 0.9% at a 29% encoder time increase.)

JCTVC-G735 Non-CE 7: Crosscheck of G597 Supplemental ROT Results [P. Topiwala
(FastVDO)] [late]

JCTVC-G629 Non CE 7: Harmonization of SDIP and Mode-Dependent Secondary
Transforms [A. Saxena (Samsung), Y. Shibahara (Panasonic), F. Fernandes
(Samsung), T Nishi (Panasonic)]
Applies 8x8 MDST (non-signalled type selection, 0.7%) to SDIP non-square blocks. Shows that gain of
MDST is additive to SDIP (0.8% additive gain on top of 1.3% SDIP gain). This supplements the
information collected in CE 7. This indicates that these two features work properly in combination.

JCTVC-G937 Cross-check of JCTVC-G629: Harmonization of SDIP and Mode-Dependent
Secondary Transforms [H. Yang (Huawei), C. Lai (Hisilicon)] [late]

5.16 IBDI and memory compression

JCTVC-G328 AHG8: Bit depth of output pictures [Y. Chen, Y. -K. Wang, X. Wang, I. S.
Chong, M. Karczewicz (Qualcomm)]
(Track B.) Skepticism was expressed regarding this as a normative output definition behaviour. For
further study.

JCTVC-G430 An experimental comparison of memory bandwidth between 8-bit and 10-bit
coding [T. Chujoh (Toshiba), K. Chono (NEC)]
(Track A.) An experimental comparison between 8-bit and 10-bit coding is reported. Although the
coding efficiency of 10-bit coding has been discussed at several previous meetings, the quantitative
discussion of memory bandwidth has not been sufficient. Numerical memory bandwidth results are shown, obtained
using the memory bandwidth measurement module that was developed by AHG8 on reference memory compression
and distributed to CE3 on motion compensation. In the experimental
results, the memory bandwidth increase of 10-bit coding compared to 8-bit coding is 37% on average
and 47% at maximum. Since reductions of memory bandwidth have been discussed in AHG8 and
CE3, those discussions should be considered.
Valuable to have quantification of necessary memory bandwidth for N-bit case.
Refers to plenary discussion on HM settings.

5.17 Complexity assessment

JCTVC-G095 Virtual motion compensation memory bandwidth verifier (VMBV) [M.
Zhou (TI)]
This contribution proposes a virtual motion compensation memory bandwidth verifier (VMBV) aimed
at lowering the worst-case motion compensation memory bandwidth requirements without harming coding
efficiency or imposing a significant burden on the encoder side. The proposed VMBV advocates: 1) to
define a virtual maximum memory access bandwidth rate for profiles and levels, say R_max, measured in units
of bytes per second; 2) to specify a virtual memory access bandwidth measurement tool which quantifies
the memory access bandwidth consumed by a decoded frame, say d_n, measured in bytes; and 3) to impose
VMBV conformance on bitstreams: for any frame, the consumed memory access bandwidth d_n shall be
less than or equal to R·Δt_n, where R (R ≤ R_max) is the virtual memory access bandwidth rate signaled in the
sequence parameter set, and Δt_n is the time used for decoding the frame. It is recommended to conduct
further study on this topic to specify the proper VMBV memory bandwidth measurement tool, and to
implement the VMBV constraint in the encoder motion estimation to investigate the practicality of the
proposed VMBV model.
Presentation not uploaded.
Idea behind this: “forbidding too-evil bitstreams”, e.g. random changes of motion in neighboring CUs that
would impose an increase in memory bandwidth.
It is difficult to specify a maximum memory bandwidth in the way that a maximum bit rate is specified. Problem: how to measure
the memory bandwidth, as it is architecture dependent.
Other possibilities: restrict PU block size, restrict deviations of MVs.
Further study (in the AHG on memory bandwidth issues).
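
The conformance condition described in the G095 abstract (consumed bandwidth per frame no greater than the signaled rate times the frame decoding time, with the rate bounded by the level maximum) can be sketched as a simple check. The function and parameter names are hypothetical, and how the per-frame byte counts would be measured is exactly the open question noted above.

```python
def vmbv_conforms(frame_bytes, decode_times, rate, rate_max):
    """Check d_n <= R * dt_n for every frame, with R <= R_max.

    frame_bytes  -- d_n, memory access bytes consumed by each decoded frame
    decode_times -- dt_n, time in seconds used for decoding each frame
    rate         -- R, the signaled virtual bandwidth rate in bytes per second
    rate_max     -- R_max, the maximum rate allowed by the profile and level
    """
    if rate > rate_max:
        return False
    return all(d <= rate * dt for d, dt in zip(frame_bytes, decode_times))
```

An encoder-side use of such a check would reject or re-encode frames whose motion choices push the measured byte count above the per-frame budget.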

JCTVC-G262 HM decoder complexity assessment on ARM [T. Anselmo, D. Alfonso
(STM)]
This contribution presents the results of an assessment of the HM 4.0 decoder complexity on ARM
architecture, in particular a Dual Cortex A9 CPU integrated in the ST-Ericsson Nova™ A9500
application processor.
(based on current reference software)
E.g. in RAHE interpolation 26%, ALF 15%, SA 4%, transform 1%. (MD5 8% which should be disabled
at the decoder)
The results show that the current complexity ratio between HE and LC configuration on ARM is on
average 1.218, with minimum and maximum values 1.075 and 1.409 respectively.

JCTVC-G757 On software complexity [F. Bossen (Docomo Innovations)]
Complexity is a complex topic. While the commonly used run time measurements obtained with the HM
software may be indicative of complexity, they often remain a poor proxy of actual complexity. A
software decoder was thus written from scratch with the goal of achieving real-time decoding on x86 and
ARM devices, and thereby provide better estimations of run times in real-world environments. This
contribution provides some insight on which tools significantly contribute to decoding time and also
suggests some modifications to the draft HEVC specification.
Presentation was announced to be made available.
The decoder takes advantage of SIMD technology to improve decoding times (MMX/SSE on x86, and NEON
on ARM). In its current state the following decoding rates are achieved:
• 1920x1080@50fps on a single core of a Core i5 processor (Arrandale) clocked at 2.53GHz
• 832x480@30fps on a single core of a Cortex A9 processor clocked at 1GHz
(no ALF, no NSQT)
MC and intra prediction take roughly half of the processing time on both platforms.

JCTVC-G988 HEVC software player demonstration on mobile devices [K. McCann, J.
Choi, J. H. Park (Samsung)] [late]
Real-time decoding is the key feature for most applications. This document describes an HEVC software
decoder running on commercial devices such as mobile phones and x86. The software decoder is
implemented based on Samsung’s commercial media player framework, which is already loaded on
various PC and mobile devices. This HEVC software decoder on mobile devices, together with a file
container and Android media player, is reported to be capable of running on current commercial devices
(the Galaxy Note and Galaxy S made by Samsung). The results are broadly consistent with the
complexity and profiling measurements using a low complexity decoder that are reported in JCTVC-G757,
providing confidence that the current HEVC design using CABAC could be commercially implemented
at an early date.

5.18 Encoder optimization

JCTVC-G156 CU Depth Pruning for Fast Coding Tree Block Decision [H. L. Tan, C. Yeo,
Y. H. Tan (I2R)]
This contribution presents a CU Depth Pruning algorithm for fast coding tree block (CTB) decision. The
proposed method attempts to terminate the CTB decision process by performing a one-level look-ahead
for the last sub-CU where possible. It is reported that the proposed method reduces encoding time by
about 8% with 0.1% Luma BD-Rate coding loss for the Random Access and Low Delay configurations.
It was remarked that we have a parameter "ECU" in the software that is also an encoding optimization
trick.
It was remarked that this may introduce some sort of asymmetry in the search. The patch is in the
contribution, which is available for further study.

JCTVC-G401 RDO with weighted distortion in HEVC [B. Li (USTC), G. J. Sullivan, J. Xu
(Microsoft)]
TBA
Weighted averaging (6+1+1) of luma and chroma PSNR. It is shown that this metric changes less dramatically
than single PSNR values when bits are shifted between luma and chroma. (Typically, spending more bits on
chroma gives a benefit, as chroma is easier to code.)
Two types of tests:
• Changing the weighting in the R-D optimization
• Changing the QP for the chroma
It was noted that G005 supports the inclusion of chroma step size control in HEVC (which we may
assume is already intended to be inherited from AVC into HEVC).
It was noted that the ratio of chroma gain to luma loss is not always a constant.
Alternative ways of measuring overall fidelity were discussed. One was to combine the luma and chroma
MSEs before converting to the PSNR domain.
Since averaging PSNR values corresponds to taking a geometric mean of the MSEs, would it be better to use
a weighted MSE instead (as RDO over different channels does)? That is, in practice, measuring PSNR based on
Euclidean distance in YCbCr.
It was suggested to perform further study in an AHG on how to obtain objective metric evaluation.
[K. Minoo, G. Sullivan].
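The two ways of combining luma and chroma fidelity discussed for JCTVC-G401 can be compared with a small sketch. It assumes 8-bit samples (peak value 255) and applies the 6:1:1 weights to the MSEs as well as to the PSNR values; the weights for the combined-MSE variant and the sample MSE values are assumptions for illustration, not taken from the contribution.

```python
import math

PEAK = 255.0  # assumption: 8-bit samples

def psnr(mse):
    """PSNR in dB for a given mean squared error."""
    return 10.0 * math.log10(PEAK * PEAK / mse)

def weighted_psnr(mse_y, mse_u, mse_v, w=(6, 1, 1)):
    """Average in the dB domain: (6*PSNR_Y + PSNR_U + PSNR_V) / 8."""
    return (w[0] * psnr(mse_y) + w[1] * psnr(mse_u) + w[2] * psnr(mse_v)) / sum(w)

def psnr_of_weighted_mse(mse_y, mse_u, mse_v, w=(6, 1, 1)):
    """Combine the MSEs arithmetically first, then convert to PSNR."""
    mse = (w[0] * mse_y + w[1] * mse_u + w[2] * mse_v) / sum(w)
    return psnr(mse)

def psnr_of_geometric_mean_mse(mse_y, mse_u, mse_v, w=(6, 1, 1)):
    """PSNR of the weighted geometric mean of the MSEs."""
    log_mean = (w[0] * math.log(mse_y) + w[1] * math.log(mse_u)
                + w[2] * math.log(mse_v)) / sum(w)
    return psnr(math.exp(log_mean))

# Hypothetical per-channel MSE values, for illustration only.
y, u, v = 20.0, 5.0, 4.0
print(weighted_psnr(y, u, v))               # average of PSNR values
print(psnr_of_geometric_mean_mse(y, u, v))  # same value up to rounding
print(psnr_of_weighted_mse(y, u, v))        # never higher than the two above
```

Averaging PSNR values is equivalent to using the geometric mean of the MSEs, so it is less sensitive to the worst channel than a PSNR computed from an arithmetically combined MSE.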

JCTVC-G543 Early skip detection for HEVC [Jungyoup Yang, Jaehwan Kim, Kwanghyun
Won, Hoyoung Lee, Byeungwoo Jeon (SKKU)]
In this contribution, an early detection of "skip" mode is proposed to reduce the encoding complexity of
HEVC. The proposed method is in the same spirit as the early skip detection scheme implemented in the
MPEG-4 Part 10 AVC/H.264 reference SW, but slightly modified to address the different encoding
scheme of HEVC. It is reported that the proposed method reduces the encoding time by about 33% with a
BD-bit rate loss of 0.45% compared to the HM4.0 encoder.

A fast mode decision method, JCTVC-F092, was previously adopted and can be combined with this
proposal. When combined, the two methods sped up the encoder by about a factor of two.
Asserted to be less than 10 lines of code. Can be implemented in a switchable fashion. We agreed to
adopt (not high priority).

JCTVC-G573 Cross-check of Early skip detection for HEVC with Early CU Termination
(JCTVC-G543, JCTVC-F092) [S. Lee, S. Cho, S. Park, N. Eum (ETRI)]
[late]

JCTVC-G794 Cross-check of Early skip detection for HEVC (JCTVC-G543) [Kiho Choi,
Sang-hyo Park, Euee S. Jang] [late]

JCTVC-G589 Encoder optimization: Reduced number of reference pictures for RA [P.
Wennersten, R. Sjöberg, J. Samuelsson (Ericsson)]
This contribution consists of an alternative GOP structure for random access coding. Compared to the
GOP structure currently in use for the random access JCT-VC common conditions test case, it requires
fewer pictures to be kept in the decoded picture buffer. Only the encoder configuration is changed, and
the average luma BD-rate result is reportedly −0.2% for random access high efficiency and −0.3% for
random access low complexity. Calculating YUV BD-rate as (6*Y+U+V)/8, the change is −0.2% for RA
HE and −0.4% for RA LC.
The implementation is based on the AHG21 software (in the HM software, the structure is otherwise
hard coded).
In spirit, we would like to use this. However, the current implementation is tangled up with AHG21 work.
Comments: The suggested structure is similar to JM and KTA. It may also improve the latency.
Conclusion: Agreed that this is desirable, but it depends on the result of AHG21 related BoG. Not high
priority in implementation.
Revisit: It was suggested to discuss this again after AHG21 was resolved. Also see G157.
This topic was discussed again on Nov. 29. Decision (SW): Adopt and include in HM5 and common test
conditions.
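The combined YUV BD-rate quoted in the abstract of JCTVC-G589 uses the weighting (6*Y+U+V)/8. A minimal sketch of that arithmetic follows; the chroma values in the example are hypothetical, since the notes report only the luma and combined figures.

```python
def combined_yuv_bd_rate(bd_y, bd_u, bd_v):
    """Combine per-component BD-rates (in percent) as (6*Y + U + V) / 8."""
    return (6.0 * bd_y + bd_u + bd_v) / 8.0

# Hypothetical chroma BD-rates; only the luma and combined values appear in the notes.
example = combined_yuv_bd_rate(-0.3, -0.7, -0.7)
print(f"combined YUV BD-rate: {example:.2f}%")
```

Because luma carries six of the eight weight units, the combined figure stays close to the luma BD-rate unless the chroma changes are comparatively large.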

JCTVC-G687 Fast fractional motion search with adaptive searching point reduction [Wan-
Chi Siu (HKPU), Yan-Ho Kam (HKPU), Wai-Lam Hui (HKPU), Yui-Lam
Chan (HKPU), Yu Liu (ASTRI), Jenny Yan Huo (ASTRI)]
This contribution proposes an additional option for performing fractional motion search,
aiming at reducing the computational complexity of the encoder of the current HEVC reference model.
According to the simulation results, the proposed technique achieves 6% reduction in computational
complexity with a 0.44% loss in luma BD-rate.
Half pixel search with horizontal/vertical neighbors in first step.
The change would require (probably a lot) “more than 10 lines of code”.
Not to be included.
It was commented that the speed-up is not so large, and we have other work to focus on. It may be nice to
use in practice, but we do not consider it a sufficiently high priority to put this in our software at this time.

5.19 Withdrawn

JCTVC-G114 [withdrawn]

JCTVC-G160 [withdrawn]

JCTVC-G177 [withdrawn]

JCTVC-G356 [withdrawn]

JCTVC-G368 [withdrawn]

JCTVC-G538 [withdrawn]

JCTVC-G562 [withdrawn]

JCTVC-G594 [withdrawn]

JCTVC-G595 [withdrawn]

JCTVC-G602 [withdrawn]

JCTVC-G670 [withdrawn]

JCTVC-G701 [withdrawn]

JCTVC-G866 [withdrawn]

JCTVC-G903 [withdrawn]

JCTVC-G921 [withdrawn]

JCTVC-G964 [withdrawn]

JCTVC-G969 [withdrawn]

5.20 Category not clear

JCTVC-G1000 Crosscheck [F. Bossen] [late] [miss]

5.21 Not presented due to late arrival or lack of availability

JCTVC-G1043 20 years after MPEG-2 [J. Song] [late]

6 Plenary Discussions and BoG Reports


6.1 Project development

JCTVC-G061 HEVC issues (comments from USNB of WG11) [A. G. Tescher for USNB of
WG11]
TBA
Not reviewed. Some aspects are relevant to the current phase of work:
• entropy coding, affirmative
• plan to do a test, affirmative
• 8 b per sample in the test, affirmative
• RA HE and LB HE, affirmative
• HM 4 rather than HM 5 if necessary, affirmative (whichever works out)
Results of the test are planned to be available by the San Jose meeting.

JCTVC-G096 Items to be clarified in HEVC design [M. Zhou (TI), W. Wan (Broadcom),
T. Suzuki (Sony), A. Tabatabai (Sony)]
Items to be clarified:

Max and min LCU size: recommend max LCU = 64x64, min LCU = 16x16. To be further discussed in
AHG8 on profiles and levels.
Comment: LCU sizes beyond that range may currently not be properly supported in SW.
Comment: there may be no benefit in supporting LCU sizes larger than 64x64.
Comment: should also consider limit on SCU
Min slice granularity: recommend min slice granularity of 16x16 (luma). Should already be in WD; to be
checked. Additional limit may be imposed by level. Further discuss in AHG.
Max number of reference frames: recommend to stick with AVC limits for frame pictures (16). Agree to
keep this limit in WD. May be further discussed.
4x4 inter prediction support: recommend to not include it in initial profiles. Q: coding efficiency impact,
in particular at high bit rates? Q: profile or level issue?
Agree to keep it in specification. Inclusion in given profile to be discussed in later work.
Chroma deblocking: mismatch between text (edges of size 4) and SW (edges of size 8). Recommend to
align text with SW (edges of size 8). Agree that intent is to filter chroma edges of size 8. Deblocking
experts were requested to review deblocking-related text in general with WD editors.

JCTVC-G507 Syntax Issues [K. Sato (Sony)]


3 items:
Relationship between log2_min_coding_block_size_minus3 and inter_4x4_enabled_flag in SPS
inter_4x4_enabled_flag may be redundant in SPS under certain conditions. Suggestion to make coding of
this flag conditional. Decision: Agreed.
Relationship between slice granularity and max_cu_qp_delta_depth in PPS
Comment: max_cu_qp_delta_depth, pic_init_qp_minus26, and slice_qp_delta not [properly]
implemented in software. This should be fixed. Chono-san volunteered.
Comment: Software is reported to not work if slice granularity is larger than delta qp granularity. For
example when slices have 16x16 granularity and delta qp has 32x32 granularity.
Decision: Adopt solution A: the value of log2MinCUDQPSize shall be smaller than or equal to the CU
size specified by the value of slice granularity.
Tile boundary processing (already discussed as part of G194 review)
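The constraint adopted as solution A for JCTVC-G507 can be sketched as a simple check. The function and parameter names below are hypothetical and the sizes are expressed as luma block widths; only log2MinCUDQPSize and the 16x16 versus 32x32 example come from the notes.

```python
import math

def dqp_granularity_is_valid(min_cu_dqp_size, slice_granularity_size):
    """Return True if the delta-QP granularity does not exceed the slice granularity.

    Both arguments are block widths in luma samples (powers of two).
    Hypothetical helper: mirrors "log2MinCUDQPSize shall be smaller than or
    equal to the CU size specified by the value of slice granularity".
    """
    log2_min_cu_dqp_size = int(math.log2(min_cu_dqp_size))
    log2_slice_granularity_size = int(math.log2(slice_granularity_size))
    return log2_min_cu_dqp_size <= log2_slice_granularity_size

# Case reported as not working in software: 16x16 slices with 32x32 delta QP.
print(dqp_granularity_is_valid(32, 16))
# Equal granularities satisfy the constraint.
print(dqp_granularity_is_valid(16, 16))
```

An encoder or bitstream checker could apply such a test when the PPS is parsed, rejecting combinations like the 16x16 and 32x32 example from the notes.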

JCTVC-G661 Parallel partition Profile & Level limits [Chad Fogg, Aaron Wells]
Describes the need to define limits related to slices, entropy slices, tiles and wavefront.
To be further discussed in AHG8.

JCTVC-G729 Proposal to start the discussion on HEVC profile/level [T. Suzuki (Sony)]
Requested AHG to discuss profiles and levels. Done.
Should “clean up” level definitions from AVC instead of blindly copying them.
Request to thoroughly test profiles in their definition process (e.g., 4:2:2 10 bit).

JCTVC-G914 UKNB Comment on HEVC Extensions [UKNB of WG11] [late]


Not presented.

JCTVC-G1019 Samsung’s comments on HEVC Extension [K. McCann, C. Kim, B. Choi, J.
H. Park] [late]
Not presented.

JCTVC-G1004 Proposed text for some features not yet integrated into JCTVC-F803 [M.
Horowitz (eBrisk)] [late]
Updated version of JCTVC-F803. This text will be incorporated into next WD version released by
editors.

JCTVC-G1000 Suggested revised common test conditions [F. Bossen] [late]


Discussion in preparation of common test conditions (2nd Wed)
Aspects discussed:
• Separate out class A 8 bit and 10 bit – agreed.
• Within class A, use 8 bit rounded version of 10 bit sequences – perhaps not.
• Include class F in overall averages? – Not include in overall average, but class F required to be
  reported separately.
• The G382 non-normative quantization feature is disabled in these configs.
It was strongly suggested that having a good (but not overly complex) rate control algorithm for the HM
would be highly beneficial. Agreed.
It was suggested to provide flags in the SPS to disable the features that are currently switchable only
using a #define, and for this to be a "software-only" feature. Decision (SW): Agreed.

6.2 BoGs

JCTVC-G1002 BoG report on reference picture buffering and list construction [J. Boyce,
R. Sjoberg, Y.-K. Wang]
The BoG recommended adoption of the AHG21 working draft text modification, in JCTVC-G021. It was
also recommended to have an AHG until the next meeting, and that contributions to next meeting be
based upon the editors’ WD version which includes the JCTVC-G021 modifications.
The BoG requested to meet again during this meeting to define test conditions with various picture coding
structure patterns for evaluation of long-term reference picture contributions and of contributions
based on bit rate savings, based on a draft test conditions document prepared by Miska Hannuksela and
circulated on the reflector by Saturday.
The BoG also recommended to further discuss contribution JCTVC-G198 with a larger group (this was
later done in Track A).
Several contributions initially assigned to this BoG were considered to not be within its scope, so it is
recommended that they be re-assigned: JCTVC-G717, JCTVC-G635, JCTVC-G157, and JCTVC-G549.
Additionally, the BoG proposed postponing consideration of JCTVC-G715 until after the AHG18
discussion.
Questions and comments:
• Are long-term reference pictures (LTRP) supported?
  o Long-term pictures, in AVC, affect weighted prediction, temporal MV scaling, default
    list construction, list modification, sliding window, and explicit reference picture
    MMCO.
  o It was remarked that for high-level syntax and functionalities such as the above, although
    some things aren’t reflected yet in the working draft, such aspects are understood to be
    generally inherited from AVC.
  o It was reported that the intention of the BoG was to have LTRPs in the design,
    but that there was not yet a good understanding of how specifically to best support them.
    It was agreed that LTRPs should be supported in the (final) design.
  o It was suggested to adopt an interim draft shared within the AHG that
    included some form of LTRP.
• Some doubt was expressed regarding whether the proposed reference picture set (RPS) approach
  has been sufficiently well-studied, has sufficient merit, and is sufficiently mature.
  o It was noted that the RPS approach had been the subject of active AHG discussion.
  o Several participants expressed support for the proposal, in terms of loss resilience, design
    simplification, and temporal scalability handling.
The BoG was requested to find a way to address the LTRP support topic as part of its activities.
The BoG reported back after the Nov. 24 BoG meeting. (The following discussion was chaired by J. Boyce.)
The BoG recommended adding the G788 proposal #7a long term reference picture solution to the G021
WD text. The revised G1002 document upload included revised draft text.
There was some discussion about POC wrap handling for G788. The G1002 draft was conservative about
incorporating changes.
Several participants expressed support for the G1002 draft. After some more time was given to allow
some participants to review the draft text, this was agreed to be adopted.
Decision: Adopt G021 as modified by G1002-v3, plus a bug fix of a sign error and some clarifications to
be provided as -v4.
It was noted that this design does not support marking the current picture as an LTRP. This provides less
flexibility than the AVC design and should be studied and potentially changed in the future.
Updated version of BoG report Nov. 29:
Suggested reference conditions G1036 for alternate coding structures. To be further refined in AHG21.
Recommend to use it for proposals related to reference picture list construction and marking.
Recommendation to adopt additional restriction related to CRA pictures. Working draft text attached to
BoG report. Restriction is already present in software developed by prior AHG21.
Decision: Adopt.

JCTVC-G1005 BoG report on review of non-CE deblocking filter proposals [A. Norkin]


This contribution summarizes activities of the break out group (BoG) on review of non-CE deblocking
filter related proposals.
The proposals are sorted into categories (although some proposals can belong to more than 1 category).
1. Parallel deblocking
2. Modifications to deblocking filter description
3. Modifications to Bs calculation process
4. Line buffer reduction
5. Deblocking filter simplifications
6. Signaling deblocking filter parameters in slice header
7. Varying QP deblocking and deblocking for IPCM blocks
See under 5.5.1.

JCTVC-G1006 BoG report on Non-CE MV Coding Proposals [B. Bross, J. Jung, S. Oudin]
See under 5.8.1.

JCTVC-G1017 BoG report on intra mode [V. Sze] [miss]


The mandate of this break-out group was to look into the details of G243 and to “evaluate text,
implementation complexity, destabilization, and amount of gain”, taking note of any non-CE
proposals which may be related to this contribution. Only contributions relating exclusively to
entropy coding of intra mode were discussed; contributions which affected pixel values (e.g.
G358 and part of G119) were not discussed. A BoG meeting session was held from 5pm-7pm on
November 25, 2011.

Tools in G243, with changes/complexity, coding gains*, and alternative tools
(G153: -0.2Y; G359: -0.3Y; G423: 0.2U/0.3V)

Tool: +2 MPRM
  Changes/complexity:
    - Compute four additional MPRM via LUT or operations & comparison for angle conversion
    - Additional 3(?) comparisons to select 2 MPRM
    - Additional bin (mprm_idx)
  Coding gain*: −0.3 (Y)
  Alternative tools: X**, X***

Tool: Remainder coding with mode ranking
  Changes/complexity:
    - variable registers that need to be updated for rank
    - rank mapped to exp-golomb rather than FLC
    - bypass coding of mprm_idx and remainder
  Coding gain*: −0.2 (Y)
  Alternative tools: X, X

Tool: Chroma mode coding
  Changes/complexity:
    - replace ver/hor/DC with luma+/-1 and perp (computed with LUT or operations &
      comparison for angle conversion)
  Coding gain*: −0.3 (U), −0.4 (V)
  Alternative tools: X

Tool: additional 19th mode Intra_4x4
  Changes/complexity:
    - Additional mode.
  Coding gain*: N/A
  Alternative tools: X, X

* = estimated from Appendix B
** = treats first remaining mode as MPM; additional bin
*** = no additional MPM, but requires angle conversion
X = denotes that proposal provides an alternative to tool in G243.

There was discussion of the [qq]


No action.
Further CE work should focus on cleanup/simplification rather than on one or two tenths
of a percent in coding gain.

Annex A contains a review of intra mode coding proposals (blue = exactly the same, yellow =
basically the same as each other). Section 4.1 identifies overlapping ideas.
G145 proposes LCU boundary line buffer compression (0.0) or elimination (0.1 loss)
Decision: Eliminate the line buffer as proposed in G145
G184 proposes to change neighbors for consistency with how inter works
No action.
G423 on chroma coding
For further study
G707=G767 (part of G153), increased use of bypass coding
Decision: Adopt G707=G767.
Remark: Right now the bins for the remainder are coded starting from the LSB; let’s start
from the MSB. (Considering that it is coded in bypass mode, no effect on perf.)
Decision: Agreed.
G418&G109&G144 (part of G119), simplifying the case when neighbour PU size differs
Decision: Adopted.
Additional further aspect of G119 to default to planar if neighbours not intra or unavailable
Question: Does it affect non-intra coding efficiency? Probably not – and verbally reported
(by a non-proponent) not to (or to actually provide a little gain).
Decision: Adopted (part 2 of G119 as described above).
Suggestion: Note that currently at 64x64 level, only 4 modes are available, making the
parsing different than at lower sizes, and the prediction actually operates at the 32x32 level,
so there is no real benefit for the restriction (aside from trivial overhead reduction).
Decision: Agreed to support all 32x32 modes to be signalled at the 64x64 level.

This closes all of intra mode coding (in regard to contribs that only affect the coding of the
modes, not the values of the samples).
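The remark above about coding the remainder bins from the MSB rather than from the LSB can be illustrated as follows. This is a sketch that assumes a fixed-length binarization of the remainder; the function name and the example values are hypothetical.

```python
def fixed_length_bins(value, num_bits, msb_first=True):
    """Return the bins of a fixed-length code for value, in coding order."""
    lsb_first = [(value >> i) & 1 for i in range(num_bits)]
    return lsb_first[::-1] if msb_first else lsb_first

# Hypothetical remainder value 6 coded with 5 bins.
print(fixed_length_bins(6, 5, msb_first=False))  # LSB first (previous order)
print(fixed_length_bins(6, 5, msb_first=True))   # MSB first (agreed order)
```

Both orders produce the same number of bins, and each bypass-coded bin costs the same regardless of its position, which is consistent with the remark that the change has no effect on performance.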

JCTVC-G1016 BoG report on APS matters [S. Wenger]


A break-out group meeting was held Saturday, Nov. 26 2011 from 8:00 through 10:10.
Approximately 20 delegates were present. The BoG reviewed and summarized input documents
G122, G220, G295, G330, G332, G566, G658, and discussed related issues.
Based on contributions G122 and G220, the BoG recommended removing CABAC usage from
the APS. The removal of any byte alignment syntax elements in the APS was also recommended,
as they did not appear to be needed anymore. WD text is included in G122.
Decision: Adopted (adjusted as necessary to remove remaining byte alignment syntax and to
include an extension flag and a while loop of extra data bits and RBSP trailing bits – as
necessary in all three parameter sets).
The BoG further recommended to henceforth not include any CABAC coded syntax elements
“above” the slice header (inclusive).
In further discussion in JCT-VC (Track B) this was agreed to be the plan (unless some
exceptional justification arrives for something in the future) because header-level parsing may
often be performed by software on low-capability processors.
The BoG reported that we have an issue that may be referred to as “partial update”, that was
asserted to need to be addressed at some point in time. “Partial update” refers to the idea that
from one picture to another picture, parts of the information carried in APS would be able to
change without transmission of a full APS carrying both the changed and unchanged
information. There are two very similar proposals addressing this issue at this meeting (G295
and G332), both of which rely on multiple APS references in the slice header. The proponents of
G295 and G332 agreed to prepare a combined proposal for consideration by JCT-VC. It was also
remarked that document E309 addresses the same problem in a different way, avoiding multiple
references to parameter sets in the slice header. It was further remarked that a solution according
to G295/G332 could co-exist with a solution based on E309, as they have different trade-offs in
terms of where the bits are spent (slice header vs. additional NAL units). The BoG did not agree
on a solution for this problem, but recommended the JCT-VC to consider a unified proposal at
this meeting.
JCTVC-G1026 was uploaded in response to this. See notes above for JCTVC-G1026.
The BoG recommended to JCT-VC to include a tool that allows MTU size matching (akin to
slices) for parameter sets and especially for the APS, if there is an expectation that a parameter
set may become larger than commonly used MTU sizes. It was not clear to the BoG participants
whether we have this problem at this point in time, and the BoG indicated that it would be good
if this can be clarified by the larger group.
In subsequent discussion in JCT-VC (Track B), it was agreed that a large APS seems like a
possibility, and further study was encouraged regarding how to address this. G658 was identified
as a contribution on this subject that is available for such study and further work.
The group discussed the potentially different error resilience properties of APS and other
parameter sets, and to what extent that should affect the selection of data to be carried in the
APS. It was reported that, due to their strict timing requirements, APSs are likely to be conveyed
in-band and synchronous to the rest of the bitstream (at least at picture and perhaps even at slice
level), and, therefore, are less easily protectable than other parameter sets. Conceivably, this
could be a reason to limit the data included into an APS to such data whose loss is not
catastrophic to the decoding process. However, the BoG came to the conclusion that this
argument is not very compelling. First, from a compliance viewpoint, the APS includes data that
is required for the decoding process. Decoding data in the absence of the correct APS is a non-
conforming operation, with all its consequences. Second, it is very hard to define “catastrophic”.
ALF parameters were claimed to be one example where a loss is not catastrophic—if you lose
those, you would still get a reasonably displayable picture from the decoder. However, it was
reportedly probably not difficult to design an “evil” bitstream which includes ALF parameters
that render that bitstream useless if ALF is not applied properly. Accordingly, the BoG
recommended to work under the assumption that an APS is available at the decoding time of a
slice that references that APS, and to expect system layer support to ensure sufficiently reliable
transport of the APS for the application in question.
It was agreed in the BoG and confirmed by the JCT-VC (Track B) that the APS is the right place
for quantization matrices, as proposed in G295, G330, and G658, while keeping in mind the need
to consider the “partial update” issue.
It was suggested by the BoG that, for the RPS related syntax elements, what has been planned to
be in the PPS should stay in the PPS, and what is currently in the slice header should be moved
to the APS. Y.-K. Wang agreed to write drop-in text for the larger group and the editors, which
was included in the zip file of the BoG report (v2).

After discussion of the BoG report in the JCT-VC (Track B), it was agreed to not act
immediately on this suggestion, but rather to study the subject further before taking this step.
So a summary of the content of the APS as planned is (at the time of this discussion at the
meeting) to contain the following:
• ALF and SAO parameters (modified to remove CABAC encoding)
• Quantization weighting matrix data – Decision: Agreed.
• Reserved extension data and RBSP trailing bits
Text reflecting this will be uploaded in a -v3 of the BoG report.
After discussion of G566, the BoG recommended reinserting the ALF and SAO enabling/disabling
flags in the slice header, in order to remove a parsing dependency between the APS and slice
header. Decision: Agreed.
The BoG noted the possible consequence of allowing different values in the APS and slice
header, with a suggested restriction that if the flag is "0" in APS, it cannot be "1" in the slice
header.
The JCT-VC (Track B) discussed also the possibility of allowing different flag values in
different slices of the same picture. This topic was initially left open for further discussion.
Pending a decision otherwise, the slice-level flags must be static within a picture and must equal
the APS flag and all slices of a picture must refer to the same APS.
In further discussion, it was noted that ALF, when enabled, has flags in the slice header that can
disable ALF at a fine level of granularity. Allowing the general enabling flag to be zero when
the APS-level flag is 1 would be a shortcut with the same effect and better coding efficiency.
Therefore, we should allow this.
For SAO, the map control does not in general allow alignment with the boundaries of a slice, and
the map control is not sent within the slice header. Therefore, we should not (currently) try to
move the control to the slice level.
Decision: Agreed to allow this flexibility of flag value for ALF but not SAO.
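The agreed relationship between the APS-level and slice-level enabling flags can be summarized in a small consistency check. The function and argument names are hypothetical; the rules encoded are only those recorded above (a slice may switch ALF off when the APS enables it, but not on when the APS disables it, and the SAO flag must equal the APS flag).

```python
def slice_flags_are_valid(aps_alf, aps_sao, slice_alf, slice_sao):
    """Check slice-header ALF/SAO flags against the referenced APS flags.

    Hypothetical helper reflecting the decisions recorded in these notes:
    - ALF: the slice flag may be 0 when the APS flag is 1, but not 1 when it is 0.
    - SAO: the slice flag must equal the APS flag.
    """
    alf_ok = bool(aps_alf) or not bool(slice_alf)
    sao_ok = bool(slice_sao) == bool(aps_sao)
    return alf_ok and sao_ok

print(slice_flags_are_valid(1, 1, 0, 1))  # ALF switched off for this slice
print(slice_flags_are_valid(0, 1, 1, 1))  # ALF enabled in slice but not in APS
print(slice_flags_are_valid(1, 1, 1, 0))  # SAO flag differs from APS
```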

Regarding the definition of the nal_unit_type value for the APS: Decision: Use 14 (previously reserved).

JCTVC-G1020 BoG report on loop-filtering (AHG6) [T. Yamakage] [late]


See under 5.5.2–5.5.4.

JCTVC-G1025 BoG report on tiles and wavefront parallel processing [M. Horowitz]
This report contains the summary of proceedings of the tiles and wavefront parallel processing BoG
meeting. The BoG was created to review contributions related to tiles and wavefront parallel processing.
The meeting was held on Friday evening, 18:00 to 22:15 and Saturday morning, 10:00 to 12:15,
November 25 and 26, respectively. Approximately 30 delegates attended.
The BoG reviewed and summarized input documents: G183, G968, G194, G802, G197, G961, G315,
G317, G318, G453, G454, G618, G199, G627, G722, and G815 and prepared associated
recommendations.
The BoG recommended adoption of G194, AHG4: Non-cross-tiles loop filtering for independent tiles.
G194 proposes a flag to indicate enabling or disabling in-loop filtering across independent tile
boundaries. The current HM design always filters across independent tile boundaries.

In Track B, this recommendation was accepted as noted above for G194.
The BoG recommended adoption of G197, AHG4: Low latency CABAC initialization for dependent tiles.
To facilitate low-delay (in the wavefront sense from the decoder perspective), G197 proposes that the
CABAC probabilities of the first LCU in each tile are inherited from those of the adjacent left LCU if it is
within the same slice. In the current HM design, CABAC probabilities are always inherited from the last
LCU, in tile scan order, in the same slice.
In Track B, this recommendation was accepted as noted above for G197.
In addition, the group recommended that G197 be further tested to confirm that it has an advantage
over the use of entropy slices that explicitly initialize CABAC (from the improved fixed initialization
tables that resulted from CE 1).
The BoG recommended adoption of two elements of G315, AHG4: Unification of picture partitioning
schemes.
The first of these recommendations is to move the slice address to the beginning of the slice header.
Decision: Adopted.
The second recommendation is to make the presence of cabac_init_idc syntax element independent of
slice type. When the slice type is I, then the syntax element value was recommended to be required to be
zero.
In discussion of this, it was noted that cabac_init_idc is not actually in the slice header of the WD 4 draft.
This was apparently to initialize CABAC for decoding the ALF parameters. Now that there is no CABAC
encoded content in the APS anymore, we can move cabac_init_idc back to the slice header.
Decision: Put slice_type and cabac_init_idc into the slice header and entropy slice header.
In addition to the above recommendations for adoption, the BoG recommends that the ideas in G618,
“Line buffers problem in HEVC”, be considered in CE work similar to the prior CE 12.3 and 8.c (as an
additional method to test). G618 proposes a tiles-based solution to reduce a line-buffer problem identified
in in-loop filtering processes in the current HEVC design. In addition, level constraints on the maximum
tile width are proposed.
The BoG recommended that, if time permits, G722, "Harmonization of entry points for tiles and wavefront
processing", be discussed after data becomes available (likely Monday), noting that there is a related
element in G315, "Unification of picture partitioning schemes".
This work was then planned for AHG activity.
Finally, the BoG reported that contribution G454, Parallel processing of ALF and SAO for tiles,
identified a potential issue in the current HEVC design which allows one slice to reference only one APS.
It was reported that in some cases, it may be advantageous to allow more than one APS to be referenced
in a slice (e.g., a picture partitioned into 1 slice and 2 tiles and each tile would ideally reference a different
APS). It was commented that the APS BoG is aware of this issue. See notes on this topic elsewhere.

JCTVC-G1032 BOG report on intra prediction complexity reduction and filtering [R.
Joshi] [miss]
Break-out group meetings for intra prediction complexity and filtering were held on Saturday, Sunday
and Monday, Nov. 24–26, 2011. Approximately 25 delegates were present.
Regarding prediction modes supported at the 64x64 level, this was resolved as recorded in notes for
G1017.
The BoG recommended that some discussed modifications of DC prediction relating to G567 be studied
in a CE. In Track B, this was discussed, but it did not seem worth pursuing as a CE.

The BoG recommended that part 3 of G119 be tested in a CE, and this was generally supported. It was
commented that other proposals may address the same issue.

JCTVC-G1034 BoG report on chroma intra prediction [Ali Tabatabai]


qq

JCTVC-G1035 BoG report on resolving deblocking filter description [A. Norkin et al.]
qq

JCTVC-G1038 JCT-VC break-out report: Harmonization of NSQT with residual coding
[J. Sole (Qualcomm)]
qq

JCTVC-G1039 BoG report on chroma formats [David Flynn (BBC)]


BoG recommendation: Adopt SPS syntax for chroma_format_idc (same as AVC). Further discuss
separate_colour_plane_flag.
Decision: Agreed
Issue with subsample position of chroma (in particular w.r.t. LM chroma mode). To be further discussed
in relevant AHG (20?).
Recommending further study on native 4:4:4 coding vs 4:2:0 with normative upsampling filters.
Issue of 4:2:2 and 4:4:4 content availability: to be further discussed in AHG11. Offers made verbally at
meeting.qq

JCTVC-G1041 BoG report on subjective viewing test for deblocking filter proposals [A.
Norkin, M. Narroschke, K. Andersson, D. Flynn, X. Guo, G. v. d. Auwera]
qq

JCTVC-G1034 BoG report on chroma intra prediction [A. Tabatabai]


The relevant contributions were reviewed.
The BoG recommended the following actions:
 A first element of JCTVC-G419, which is WD text modifications for harmonization of WD and
HM5.0 SW for LM (text to be provided as part of the BoG report). Decision: Agreed.
 To further study in a CE, the coding efficiency improvement proposals JCTVC-G173, JCTVC-
G244, JCTVC-G346, and JCTVC-G358.
 To also further study JCTVC-G245, and the issue that it raises regarding phase alignment for
chroma. The study of this may involve investigation of workflows and collection of example
source material. The contributor generated some example source material by phase-adjusted
subsampling of some 4:4:4 video sequences. This should be studied in the AHG on chroma
formats.

JCTVC-G1038 BoG report on NSQT harmonization [J. Sole]


A break-out group meeting was held Monday Nov. 28 2011 from 11:00 through 13:00. Approximately 20
delegates were present.
The BoG studied the interaction between the contexts for residual coding in JCTVC-G1015 and the sub-
block scans in JCTVC-G323 with NSQT.
The BoG also analyzed possible issues (including transpose blocks and remapping).
Four proposals on the topic were discussed: G123, G517, G724, G750.
NSQT is currently doing a remapping.
With respect to the current remapping, the BoG indicated that:
 Discussion of direct integration of G1015 with the current design HM4.0 indicated that it appears
to have no additional issues with respect to NSQT (interaction with 8x8 and 16x16/32x32).
 Discussion of direct integration of G323 with the current design HM4.0 indicated that it appears
to have no additional issues with respect to NSQT (no interaction with 16x4 and 4x16, interaction
only in the 32x8 and 8x32 that map to the 16x16)
 There was agreement that the remapping from non-square to square blocks should be removed.
 It was reported that proposals (G123, G517, G724, G750) removing remapping have interaction
with G1015 (interaction with 8x8 and 16x16/32x32) and G323.
 It was recommended that a CE be established on NSQT harmonization with entropy coding.
 Confirmation was requested of whether "bugfix 214" will be in HM 5.0.
It was remarked that "bugfix 214" is a technical change that may not really be a bug. Contribution G517
had commented on this aspect, and had not yet been reviewed. This aspect was discussed in the Track B
review of the BoG output. It was described as a mismatch between the WD and software.
The software currently performs the inverse transform along the longer direction first, regardless of
whether that is within columns or within rows.
The proposal is for the inverse transform to always apply to columns first and then rows.
Decision: Agreed (columns first and then rows).
It was suggested that G269 and G354 have a relationship to this work.
It was suggested that we should put a "straw man" scheme in the draft now that is consistent with G323.
A scheme was described verbally and was requested to be documented and uploaded in a revision of the
BoG report. This was provided. Decision: Adopted.

JCTVC-G1044 BoG report on Quantization [M. Budagavi, L. Kerofsky]


7 Project planning
7.1 WD drafting and software

JCTVC-G777 The Art of Writing Standards: Some Shalls and Shoulds for Better Quality
Interop Specs [G. J. Sullivan (Microsoft)]

7.2 Plans for improved efficiency and contribution consideration


The group considered it important to have the full design of proposals documented to enable proper study.
Adoptions need to be based on properly drafted working draft text (on normative elements) and HM
encoder algorithm descriptions – relative to the existing drafts. Proposal contributions should also provide
a software implementation (or at least such software should be made available for study and testing by
other participants at the meeting, and software must be made available to cross-checkers in CEs).
Suggestions for future meetings included the following generally-supported principles:
 No review of normative contributions without WD text
 HM text strongly encouraged for non-normative contributions
 Early upload deadline to enable substantial study prior to the meeting
 Using a clock timer to ensure efficient proposal presentations (5 min) and discussions

The document upload deadline for the next meeting was planned to be 20 Jan. 2012.
As general guidance, it was suggested to avoid usage of company names in document titles, software
modules etc., and not to describe a technology by using a company name. Also, core experiment
responsibility descriptions should name individuals, not companies. AHG reports and CE
descriptions/summaries are considered to be the contributions of individuals, not companies.

7.3 General issues for CEs


Because a draft design and HEVC test model (referred to as the HM) have now been established, group
coordinated experiments are now referred to as "core experiments" rather than "tool experiments".
A preliminary CE description is to be approved at the meeting at which the CE plan is established.
It is possible to define sub-experiments within particular CEs, for example designated as CEX.a, CEX.b,
etc., for a CEX, where X is the basic CE number.
As a general rule, it was agreed that each CE should be run under the same testing conditions using one
software codebase, which should be based on the HM software codebase. An experiment is not to be
established as a CE unless there is access given to the participants in (any part of) the CE to the software
used to perform the experiments.
The general agreed common conditions for experiments were described in the output document JCTVC-
F900.
A deadline of four three weeks after the meeting was established for organizations to express their interest
in participating in a CE to the CE coordinators and for finalization of the CE descriptions by the CE
coordinator with the assistance and consensus of the CE participants.
Any change in the scope of what technology will be tested in a CE, beyond what is recorded in the
meeting notes, requires discussion on the general JCT-VC reflector.


As a general rule, all CEs are expected to include software available to all participants of the CE, with
software to be provided within two (calendar) weeks after the release of the HM 5.0 software basis.
Exceptions must be justified, discussed on the general JCT-VC reflector, and recorded in the abstract of
the summary report.
Final CEs shall clearly describe specific tests to be performed, not describe vague activities. Activities of
a less specific nature are delegated to Ad Hoc Groups rather than designated as CEs.
Experiment descriptions should be written in such a way that they are understood as JCT-VC output
documents (written from an objective "third party perspective", not a company proponent perspective –
e.g. avoiding references to methods as "improved", "optimized" etc.). The experiment descriptions should generally
not express opinions or suggest conclusions – rather, they should just describe what technology will be
tested, how it will be tested, who will participate, etc. Responsibilities for contributions to CE work
should identify individuals in addition to company names.
CE descriptions should not contain verbose descriptions of a technology (at least not unless the
technology is not adequately documented elsewhere). Instead, the CE descriptions should refer to the
relevant proposal contributions for any necessary further detail. However, the complete detail of what
technology will be tested must be available – either in the CE description itself or in referenced
documents that are also available in the JCT-VC document archive.
Those who proposed technology in the respective context (by this or the previous meeting) can propose a
CE or CE sub-experiment. Harmonizations of multiple such proposals and minor refinements of proposed
technology may also be considered. Other subjects would not be designated as CEs.
Any technology must have at least one cross-check partner to establish a CE – a single proponent is not
enough. It is highly desirable to have more than just one proponent and one cross-checker.
It is strongly recommended to plan resources carefully and not waste time on technology that may have
little or no apparent benefit – it is also within the responsibility of the CE coordinator to take care of this.
A summary report written by the coordinator (with the assistance of the participants) is expected to be
provided to the subsequent meeting. The review of the status of the work on the CE at the meeting is
expected to rely heavily on the summary report, so it is important for that report to be well-prepared,
thorough, and objective.
Non-final CE plan documents were reviewed and given tentative approval during the meeting (in some
cases with guidance expressed to suggest modifications to be made in a subsequent revision). [clean up
notes]
The CE description for each planned CE is described in an associated output document JCTVC-F9xx for
CExx, where "xx" is the CE number (xx = 01, 02, etc.). Final CE plans are recorded as revisions of these
documents.
It must be understood that the JCT-VC is not obliged to consider the test methodology or outcome of a
CE as being adequate. Good results from a CE do not impose an obligation on the group to accept the
result (e.g., if the expert judgment of the group is that further data is needed or that the test methodology
was flawed).
Some agreements relating to CE activities were established as follows:
- Only qualified JCT-VC members can participate in a CE
- Participation in a CE is possible without a commitment of submitting an input document to the
next meeting.
- All software, results, documents produced in the CE should be announced and made available to
all CE participants in a timely manner.


7.4 Alternative procedure for handling complicated feature adoptions
The following alternative procedure had been approved at the preceding meeting as a method to be
applied for more complicated feature adoptions:
1. Run CE + provide software + text, then, if successful,
2. Adopt into HM, including refinements of software and text (both normative & non-normative);
then, if successful,
3. Adopt into WD and common conditions.
Of course, we have the freedom (e.g. for simple things) to skip step 2.

7.5 Common Conditions for HEVC Coding Experiments


Preferred Common Conditions for experiment testing that are intended to be appropriate for both CEs and
other experiments were selected by the group and described in output document JCTVC-F900.

7.6 Software development


The software coordinator had already started integrating bug fixes on top of HM 3.3 software, and had
strongly recommended for proponents of adopted proposals to re-implement them in HM 3.3 and test in
this environment before integrating them into HM 4.x. All tools were planned to again be thoroughly
tested after integration in HM 4.x.
HM 4.0 should be available within 4 weeks after the meeting, and will be used for CEs. HM 4.1 is
planned to be available 3 weeks later.

7.7 Subjective test plan for design


Plan a subjective test enabling results at the Feb meeting.
Plans for subjective testing: Preparation of visual comparison for the February meeting will need to be
discussed in the next meeting. Initial idea: Use a set of rates e.g. as from CfP, and run HM and JM on
same conditions. It may not be necessary to run all different constraint cases, RA may be sufficient.
Need to prepare JM & HM encodings, roughly as was done for the CfP.

8 Establishment of ad hoc groups (tbd)


The ad hoc groups established to progress work on particular subject areas until the next meeting are
described in the table below. The discussion list for all of these ad hoc groups will be the main JCT-VC
reflector (jct-vc@lists.rwth-aachen.de).

Title and Email Reflector Chairs Mtg


JCT-VC project management (AHG1)
(jct-vc@lists.rwth-aachen.de)
Chairs: G. J. Sullivan, J.-R. Ohm (co-chairs)
Mtg: N
 Coordinate overall JCT-VC interim efforts
 Report on project status to JCT-VC reflector
 Provide report to next meeting on project coordination status


HEVC Draft and Test Model editing (AHG2)
(jct-vc@lists.rwth-aachen.de)
Chairs: B. Bross, K. McCann (co-chairs), W.-J. Han, I.-K. Kim, J.-R. Ohm, S. Sekiguchi,
G. J. Sullivan, T. Wiegand (vice chairs)
Mtg: N
 Produce and finalize JCTVC-G1102 HEVC Test Model 5 (HM 5) Encoder Description
 Produce and finalize JCTVC-G1103 HEVC text specification Working Draft 5
 Produce and finalize JCTVC-F802 HEVC Test Model 4 (HM 4) Encoder Description
 Produce and finalize JCTVC-F803 HEVC text specification Working Draft 4
 Gather and address comments for refinement of these documents
 Coordinate with the Software development and HM software technical evaluation AhG to address
issues relating to mismatches between software and text
Software development and HM software technical evaluation (AHG3)
(jct-vc@lists.rwth-aachen.de)
Chairs: F. Bossen (chair), D. Flynn, K. Sühring (vice chairs)
Mtg: N
 Coordinate development of the HM software and its distribution to JCT-VC members
 Produce documentation of software usage for distribution with the software
 Prepare and deliver HM 5.0 software version and the reference configuration encodings according
to JCTVC-G1200 based on common conditions suitable for use in most core experiments (expected
within 2.5 weeks after the meeting).
 Prepare and deliver HM 5.1 software (and additional "dot" version software releases as
appropriate) and appropriate software branches that include additional items not integrated into the
5.0 version (expected within three weeks after the 5.0 software release).
 Perform analysis and reconfirmation checks of the behaviour of technical changes adopted into the
draft design, and report the results of such analysis.
 Suggest configuration files for additional testing of tools.
 Coordinate with HEVC Draft and Test Model editing AhG to identify any mismatches between
software and text


Picture Partitioning and LCU scan order (AHG4) R. Sjöberg (chair), N
Y. Chen, F. Henry, M. Horowitz,
(jct-vc@lists.rwth-aachen.de)
K. Kazui, A. Segall, W. Wan
 Study the interactions and combinations of picture (vice chairs)
partitioning and LCU scan order
 Study technical proposals related to picture partitioning
and alternative LCU scan processing including slices,
tiles and wavefront
 Identify and work to resolve issues relating to the draft
text description of picture partitioning and LCU scan
order, and the associated reference softwareHM
functionality
 Identify and discuss additional issues relating to picture
partitioning and LCU scan orderStudy interactions and
combinations of picture partitioning and LCU scan order
related technical proposals
 Study the coding efficiency and loss resilience impact of
picture partitioning and LCU scan order
 Study the use of picture partitioning and LCU scan order
for high-level parallelism
 Study the use of picture partitioning and LCU scan order
for ultra low delay
 Identify and discuss additional issues relating picture
partitioning and LCU scan order
Spatial transforms (AHG5)
(jct-vc@lists.rwth-aachen.de)
Chairs: P. Topiwala (chair), M. Budagavi, R. Cohen, R. Joshi (vice chairs)
Mtg: N
 Study and discuss the ("core" and additional) transforms in the HM design, including compression
performance, computational complexity, dynamic range, clipping, storage requirements, etc.
 Discuss the transform-related Core Experiment (CE7) along with additional transform-related
proposals, and identify potential synergies or incompatibilities related to the core transform tools
being investigated.
 Discuss transform-related Core Experiments, and identify potential synergies or incompatibilities
related to the tools being tested in the CEs.
 Report the results and conclusions of these studies, discussions and experiments to the JCT-VC.


In-loop and post-processing filtering (AHG6)
(jct-vc@lists.rwth-aachen.de)
Chairs: T. Yamakage (chair), K. Chono, Y. J. Chiu, I. S. Chong, M. Narroschke, A. Norkin
(vice chairs)
Mtg: N
 Study simplification and harmonization of in-loop filtering technologies
 Study the relationship between IPCM and deblocking filtering behavior
 Clean up and stabilize the HM software, the WD text and the HM encoder description on
non-deblocking in-loop filtering
 Study enhancement schemes of in-loop filtering, including de-blocking/de-banding/de-noising
filters, and adaptive Wiener-based filters including variants with various inputs, and combination
of filters
 Study trade-offs and characteristics of filter designs including complexity and subjective and
objective performance
 Discuss relationships and evaluation procedures for the filtering techniques
 Identify possibilities for harmonization of enhanced in-loop filtering technologies
 Study the relationship between in-loop and post-processing filtering
Transform dynamic range (AHG7)Memory bandwidth E. Alshina, A. SegallT. Suzuki N
restrictions in motion compensation (AHG7) (co-chairs)
(jct-vc@lists.rwth-aachen.de)
 Study memory bandwidth in motion compensation.
 Study restrictions to reduce the average and worst case of
memory bandwidth.
 Study level specific restrictions, for example restrict
usage of small PUs, restrict usage of 2D interpolation,
restrict subpel accuracy etc, to reduce memory
bandwidth in motion compensation.
 Evaluate the impact of such restrictions on coding
efficiency.Study methods for managing transform
dynamic range
 Find solutions guaranteeing 16 bit buffers in inverse
transform without overflow and with minimal clipping
operations
 Study hardware and software aspects of inverse transform
implementations


Reference pictures memory compression (AHG8)Profile M. Horowitz, K. McCann (co- N
and /level definitions (AHG8) chairs), T. Suzuki, T. K. Tan,
W. Wan, Y.-K. Wang
(jct-vc@lists.rwth-aachen.de)
(vice chairs)K. Chono (chair),
 Study potential architectures of profiles that could be T. Chujoh, D. Hoang, C. S. Lim,
used to define operation points with trade-off between A. Tabatabai, M. Zhou
complexity and compression efficiency, facilitating (vice chairs)
maximum interoperability.
 Study the potential architecture of profiles that could be
used to facilitate a trade-off between complexity and
compression efficiency, with particular reference to an
“onion-ring” approach.
 Study a potential set of levels to span the target range of
applications with a reasonable degree of granularity
 Investigate the feasibility of meeting the vast majority of
industry requirements for complexity / compression
efficiency trade-off in the target 8-bit 4:2:0 video
applications with a single profile.
 Identify potential clustering of coding tools and
constraints in terms of coding efficiency vs. complexity
trade-off for the majority of applications.
 Evaluate the coding tools identified in the previous
mandate to help assist profiling decisions by JCT-
VC.Study motion compensation memory access
bandwidth of HM design and proposed reference picture
memory compression schemes
 Study reference picture memory compression schemes
and other motion compensation memory access reduction
schemes
 Study data format alignment between reference picture
memory compression and display processing
 Study the visual quality impact of reference picture
memory compression
 Report on conclusions reached


Entropy coding architecture (AHG9)
(jct-vc@lists.rwth-aachen.de)
Chairs: K. McCann (primary), A. Fuldseth, J. Lainema, D. Marpe, A. Segall, K. Sugimoto, V. Sze,
W. Wan, X. Wang (vice chairs)
Mtg: N
 Study the entropy coding complexity and compression characteristics of single entropy coder
architectures, scalable entropy coder architectures and switchable entropy coder architectures
 Consider the potential for introducing scalability in the entropy coding options in the current HM
and other potential means of harmonization of HE and LC entropy coding designs
 Characterize entropy coding throughput, memory, silicon area, power requirements, etc.
 Study interdependencies between entropy coding and other processes and the consequences of
these interdependencies
 Define practical use cases and corresponding HM configurations to enable testing of different
entropy coding architectures in relevant environments
 Study and develop approaches for hardware and software evaluation of entropy coding methods
 Identify and discuss additional issues on entropy coding


Parallel merge/skip (AHG10) M. Zhou (chair), H. Y. Kim, N
P. Onno, X. Wen
(jct-vc@lists.rwth-aachen.de)
(vice chairs)M. Budagavi
 Develop common understanding of underlying motion (chair), M. Karczewicz,
estimation throughput issues due to sequential merge G. Martin-Cocher, K. Sato (vice-
mode chairs)
 Evaluate parallel merge/skip solutions including
configurable and CU-based approaches
 Work on alternative solutions that would narrow the
performance gap between parallel and non-parallel merge
 Report the results of these studies, discussions and
experiments to the JCT-VC
 Recommend a unified solution if available
Quantization (AHG10)
 (jct-vc@lists.rwth-aachen.de)
 Study quantization issues in the HM design, including
step size control, RDOQ, etc.
 Study trade offs and characteristics of quantization
design, including coding efficiency and complexity
 Study the impact of quantization on subjective quality
 Study proposed quantization schemes such as
adaptive quantization level (AQL), quantization
matrix support, and adaptive reconstruction offsets,
and their effects
 Study adequacy of current mapping of QP to
quantizer step-size for rate control at different coding
levels (LCU, slice, frame, etc.)
Video test material selection (AHG11)
(jct-vc@lists.rwth-aachen.de)
Chairs: T. Suzuki (chair)
Mtg: N
 Identify, collect, and make available a variety of video sequence test material
 Study the coding performance and characteristics of test materials
 Identify and recommend appropriate test materials and corresponding test conditions for use in
HEVC development


Objective quality metric and alternative methods for D. AlfonsoG. Sullivan, N
measuring coding efficiency (AHG12)Complexity K. Minoo, G. Sullivan (co-
assessment (AHG12)Alternative methods of measuring chairs),
coding efficiency J. Ridge, X. Wen (vice chairs)
(jct-vc@lists.rwth-aachen.de)
 Study and report suggested methods for objective quality
assessment of reconstructed video sequences (for
purposes of HEVC coding experiment evaluations) as a
single value.
 Identify problems and possible solutions for extending
any of proposed objective quality metrics to color spaces
other than 4:2:0 YCbCr. (e.g. 4:4:4 RGB)
 Study methods to compare and report the overall coding
efficiency of coding tools as a single number which
captures overall rate and overall quality (over all color
components) of a sequence.
 Identify associated changes for HM software and Excel
template (used for reporting simulation results) to support
a recommended new methodthe suggested
methods.Summarize and evaluate the various complexity
assessment methods with regard to:
 computational complexity,
 parallelism,
 memory bandwidth,
 memory capacity,
 dynamic range requirements,
 any other aspects of complexity identified as being of
interest.
 Develop and propose a set of general measurement
metrics.
 Identify criteria to determine the hardware
implementability of key hardware modules.
 Identify bottlenecks in the current design with regard to
implementation complexity.


Screen content coding (AHG13)Interlace indication (K. K. Chono (chair), C. Fogg, N
Chono) (AHG13) K. Sato, S. Sekiguchi, W. Wan
(vice chairs)O. Au (chair), J. Xu,
(jct-vc@lists.rwth-aachen.de)
H. Yu (vice chairs)
 Develop candidate text for an interlaced format indicator,
starting from a VUI flag signaling the presence of an SEI
message.To coordinate the submission, evaluation and
selection of "screen content" video test material
 To study characteristics of screen content video
 Analyze the effects of existing and proposed coding
technology on screen content video
 To study and establish evaluation methods, test
conditions, and metrics for coding of screen content video
 Study technology that may be particularly well suited to
the coding of screen content video
 Study use cases in which screen content video is
prevalent and identify potential associated technical
implications


Loss robustness (AHG14)
(jct-vc@lists.rwth-aachen.de)
Chairs: S. Wenger (chair), M. Coban, Y. W. Huang, P. Onno, Y.-K. Wang, J. Xu (vice chairs)
Mtg: N
 Create and/or maintain tools to test loss robustness including error patterns and a loss simulator
 Identify techniques and conditions for testing the loss robustness of the design.
 Study the degree of loss robustness of the HM design and identify deficiencies
 Identify and study the interdependencies in the HM design in relation to loss robustness, and the
potential consequences of these interdependencies
 Investigate solutions to improve loss robustness
 Investigate the trade-off between coding efficiency and loss robustness
 Discuss related Core Experiments, and identify potential synergies or incompatibilities related to
the tools being tested in the CEs


High-level syntax (AHG15) Y. K. Wang (chair), J. Boyce, N
Y. Chen, M. Hannuksela,
(jct-vc@lists.rwth-aachen.de)
K. Kazui, T. Schierl, R. Sjöberg,
 Study NAL unit header, sequence parameter set, picture T. K. Tan W. Wan, P. Wu
parameter set, adaptation parameter set, and slice header (vice chairs)
syntax designs
 Study the partial update issue associated with adaptation
parameter set
 Study and identify needs for SEI messages and VUI
 Study the hypothetical reference decoder (HRD)
behaviour, including the need and feasibility of sub-
picture based HRD operations
 Assist in software development and text drafting for the
high-level syntax in the HEVC designStudy NAL unit
header, sequence parameter set, picture parameter set,
proposed additional types of parameter sets, and slice
header syntax designs
 Study and identify needs for SEI messages and VUI
 Study possible improvements to the reference picture list
construction processes
 Study possible simplifications and improvements to
reference picture marking process (e.g., the need of the
processes for generating and handling of "non-existing"
pictures)
 Study the hypothetical reference decoder behaviour
 Assist in software development and text drafting for the
high-level syntax in the HEVC design
Unification of NSQT and SDIPPadding process (AHG16) X. Zheng V. Wahadaniah N
(chair), J. Xu, X. Wang,
(jct-vc@lists.rwth-aachen.de)
A. Tabatabai, J. Lim
 Implement and harmonize the non-square partitioning (vice chairs), K. Chono, Y. Lin
techniques in SDIP (G558 and G754) with the NSQT in (vice chairs)
HM5.
 Study and investigate combinations of PU and TU sizes
that lead to the best coding efficiency and complexity
tradeoff
 Provide the WD text that is entirely consistent with the
unified software (HM5+G558+G754).Study schemes of
padding for unavailable neighbouring samples needed by
intra prediction schemes of HEVC HM4.x/WD4,
including the trade-offs between complexity and coding
efficiency performance.
 Identify possibilities for harmonization between the
neighbouring sample padding process with the processing
and usage of neighboring samples for intra prediction
(such as MDIS, intra DC prediction mode, etc.).
 Report the results and conclusions to the JCT-VC.


Hooks for scalable coding (AHG17)
(jct-vc@lists.rwth-aachen.de)
Chairs: J. Boyce (chair), J. Kang, J. Samuelsson, K. Minoo, W. Wan, Y.-K. Wang (vice chairs)
Mtg: N
 Investigate hooks that would be needed for support of bitstream scalability in HEVC syntax
 Study the applicability and effectiveness (e.g., relative to simulcast and single-layer coding) of
scalability tools
 Study potential experimental conditions for evaluation of scalable video coding technologies
Resolution adaption (AHG18)
(jct-vc@lists.rwth-aachen.de)
Chairs: T. Davies (chair), P. Topiwala, P. Wu (vice chairs)
Mtg: N
 Study methods of resolution adaptation and their impact on complexity and compression
performance
 Investigate reasonable methods to assess the benefit in measurable compression performance
 Investigate the impact of different filters (simple/more complex) for normative down- and
up-sampling
 Investigate subjective quality of resolution change e.g. limiting frequent switching
 Determine the necessity of keeping both resolutions in buffer and suitable limitations on DPB for
managing complexity
 Investigate the relationship with alternatives such as pre- and post-filtering
 Assess the complexity and efficiency of resolution adaption (as in JCTVC-F158) against
alternative approaches e.g. increased QP, and pre-filtering.
 Study loss robustness properties and compare with techniques such as adaptive reference frame
selection.
 Investigate suitable resolution switching filters.
 Consider the implications in terms of signalling, HRD, DPB management, and random access.


Lossless cCoding Transform skipping (AHG19) W. Gao (chair), K. Chono, J. Xu, N
M. Zhou,
(jct-vc@lists.rwth-aachen.de)
P. Topiwala(vice chairs)M. Mra
 Study and investigate techniques which enable lossless k (chair), J. Sole, I.-K. Kim,
coding with minimum changes to the current design J. Xu, H. Yu (vice chairs)
 Develop software for support of lossless coding modes
 Produce a single candidate WD text specification with
lossless coding capability for review at the next meeting
 Examine the potential impact to the common test cases
from the lossless coding modesStudy the adequacy of
different entropy coding tools (scans, contexts) with
different transform skip modes
 Study transform skip mode flags signalling in the context
of the RQT design
 Identify and discuss additional issues relating to
transform skip mode
 Study throughput and complexity issues in the context of
transform skipping
Chroma format support (AHG20)
(jct-vc@lists.rwth-aachen.de)
Chairs: D. Flynn (chair), D. Hoang, K. McCann, P. Topiwala (vice chairs)
Mtg: N
 Study aspects of the technical design and software that need modification to support non-4:2:0
chroma formats.
 Assist and advise in the work of removing implicit assumptions of 4:2:0 formatting from the WD
and software (where feasible, without introducing technical design changes).
Reference picture buffering and list construction R. Sjöberg (chair), Y. Chen, N
(AHG21) Hendry, T. K. Tan, W. Wan,
Y.- K. Wang
(jct-vc@lists.rwth-aachen.de)
(vice chairs)D. Flynn, R. Sjöberg
 Finalize the test specification document for reference (co-chairs), Y. Chen, T. K. Tan,
picture buffering and list construction proposals for the W. Wan, Y. K. Wang
8th JCT-VC meeting in a timely manner (JCTVC-G1036) (vice chairs)
 Provide source code that enables HM encoding of all test
cases described in JCTVC-G1036 and produce anchor
data
 Identify and work to resolve issues relating to the draft
text description of reference picture handling and list
construction, and the associated HM software
functionality
 Study possible improvements related to reference picture
buffering and list construction
 Study the loss resilience properties of reference picture
handling and its support in the HM software (maybe
move to, or conduct in coordination with, a loss resilience AHG)
 Study possible improvements to the reference
picture list construction processes
 Examine the behaviour of unified lists with the
JCTVC-F493 proposed default list construction
processes and identify any robustness issues.
 Examine the merits of the list 0/1, combined and
unified lists
 Discuss and identify solutions for support of
weighted prediction methods
 Produce a single candidate WD text specification
for picture buffer management based on JCTVC-F493
for review at the next meeting
 Produce software supporting picture marking
including the following features:
 Flexible support for cyclic picture structures and
temporal layering
 Decoder support to decode a subset of temporal
layers
 Rudimentary error concealment for lost reference picture
slices (frame copy of closest picture based on POC, set all
MVs of lost reference picture to zero)
HM subjective quality investigation (AHG22) G. Sullivan, J.-R. Ohm (co-chairs), N
F. Bossen, T. Wiegand
(jct-vc@lists.rwth-aachen.de)
(vice chairs)
 Establish a testing scenario for comparison of HM versus
JM together with the test coordinator, Vittorio Baroncini
 Perform encoding of HM and JM test cases.
 Prepare the evaluation of test results as input to the
February 2012 meeting.
Lossless coding (AHG22) W. Gao (chair), K. Chono, J. Xu, N
M. Zhou (vice chairs)
(jct-vc@lists.rwth-aachen.de)
Study and investigate lossless coding techniques.
Consider complexity and efficiency tradeoffs.
Identify appropriate intra and inter predictive techniques for
lossless coding.
Consider bit depth effects.

9 Output documents (tbd)


The following documents were agreed to be produced or endorsed as outputs of the meeting. Names
recorded below indicate those responsible for document production.

JCTVC-F634 HEVC Reference Software Manual [F. Bossen, D. Flynn, K. Sühring (AHG
chairs)]
The intention is to provide this as part of the software package in the future.
JCTVC-F688 Revised HEVC Software Guidelines [K. Sühring, D. Flynn, F. Bossen
(software coordinators)]
The version approved at the 6th meeting was confirmed to remain valid.

JCTVC-G1100 Meeting Report of 7th JCT-VC Meeting [G. J. Sullivan, J.-R. Ohm]

JCTVC-G1102 High Efficiency Video Coding (HEVC) Test Model 5 (HM 5) Encoder
Description [K. McCann (primary), B. Bross, W.-J. Han, S. Sekiguchi,
G. J. Sullivan] (WG 11 N 1234185)

JCTVC-G1103 High Efficiency Video Coding (HEVC) text specification Working Draft
5 [B. Bross (primary), W.-J. Han, G. J. Sullivan, J.-R. Ohm, T. Wiegand]
(WG 11 N 1234186)

JCTVC-G1200 Common HM test conditions and software reference configurations
[F. Bossen]
Based on update of G1000 as discussed in Wednesday closing plenary.
Software HM5.0 availability Dec. 19
Any adopted proposal for which software is not delivered by the scheduled date will be
rejected.
If combinations of proposals are intended to be tested in a CE, the precise description shall be available
with the final CE description; otherwise it cannot be claimed to be part of the CE.
The document deadline for the February 2012 meeting is likely to be Jan. 20.
[Fix links below]

JCTVC-G1201 Core Experiment 1: Entropy Coding Investigation [R. Joshi (primary),
E. Alshina, H. Kirchhoffer, J. Lainema, H. Sasai]
continue
JCTVC-G1202 Core Experiment 2: Chroma RQT depth [K. Sugimoto]
New CE2 from G283, G442 and G980. The former CE2, Motion Partitioning and OBMC
[X. Zheng (primary), I. S. Chong, I.-K. Kim], is discontinued.
JCTVC-G1203 Core Experiment 3: Chroma Interpolation Filter Motion Compensation [T.
Chujoh, E. Alshina]
continue
Only G698, including subjective investigation.
Remaining issues to be investigated are related to memory bandwidth AHG

JCTVC-G1204 Core Experiment 4: Quantization [K. Sato (primary), H. Aoki, M.
Budagavi, M. Coban, X. Li]



JCTVC-G1205 Core Experiment 5: Transform skipping [M. Mrak, J. Sole, J. Xu, A.
Saxena]
New CE5. The former CE5, CAVLC Entropy Coding Improvement [X. Wang (primary), P. Wu,
C. Y. Kim], is discontinued.
JCTVC-G1206 Core Experiment 6: Intra coding improvements [A. Tabatabai (primary),
E. François, K. Chono, R. Joshi, J. Lainema, H. Yu]
Includes all intra coding aspects (also chroma prediction, intra mode coding simplification, SDIP
improvements over "reference" of AHG).

JCTVC-G1207 Core Experiment 7: Additional transforms [R. Cohen (primary),
F. Fernandes, R. Joshi, C. Yeo]

JCTVC-G1208 Core Experiment 8: Non-deblocking loop filtering [T. Yamakage
(primary), I. S. Chong, M. Narroschke, Y.-W. Huang]
continue

JCTVC-G1209 Core Experiment 9: MV coding and skip/merge operation [I. K. Kim
(primary), W. J. Chien, J. Jung, M. Zhou]
Continue, combined with former CE13

JCTVC-G1210 Core Experiment 10: Deblocking filter [A. Norkin (primary), X. Guo,
B. Jeon, M. Narroschke]

JCTVC-G1211 Core Experiment 11: Coefficient scanning and coding [V. Sze (primary),
J. Chen, T. Nguyen, K. Panusopone, J. Sole]

JCTVC-F910 Core Experiment 10: Core Transforms [P. Topiwala (primary), M. Budagavi, A.
Fuldseth, R. Joshi, E. Alshina]
JCTVC-F911 Core Experiment 11: Coefficient Scanning and Coding [V. Sze (primary), J. Chen,
T. Nguyen, K. Panusopone, J. Sole]
JCTVC-F912 Core Experiment 12: Deblocking Filter [A. Norkin (primary), X. Guo, B. Jeon,
M. Narroschke]
continue
JCTVC-F913 Core Experiment 13: Motion data parsing robustness and throughput [J. Jung
(primary), B. Bross, J. Chen, P. Onno, M. Zhou]
combine with CE 9
New CE: Transform skipping (Marta Mrak)
Continuing CE with same chair(s) as before
CE descriptions to be reviewed Tues afternoon and/or Wed morning



10 Future meeting plans, expressions of thanks, and closing of the
meeting
Future meeting plans were established according to the following guidelines:
 Meeting under ITU-T SG 16 auspices when it meets (starting meetings on the Monday or
Tuesday of the first week and closing it on the Tuesday or Wednesday of the second week of the
SG 16 meeting), and
 Otherwise meeting under ISO/IEC JTC 1/SC 29/WG 11 auspices when it meets (starting
meetings on the Wednesday or Thursday prior to such meetings and closing it on the last day of
the WG 11 meeting).
Some specific future meeting plans were established as follows:
 1–10 February 2012 under WG 11 auspices in San José, USA.
 30 April – 8 1–9 May 2012 under ITU-T auspices in Geneva, CH.
 11–20 July 2012 under WG 11 auspices in Stockholm, SE.
 10–19 October 2012 under WG 11 auspices in Suzhouhanghai, CNH.
 167–23 January 2013 under ITU-T auspices in Geneva, CH.
 …
ITU was thanked for its excellent hosting of the 7th meeting of the JCT-VC. XXX were thanked for
providing the viewing equipment used at the meeting. David Flynn and Kenneth Andersson were thanked
for assisting with the setup and operation of the equipment.
The JCT-VC meeting was closed at approximately 1335 hours on Wednesday 30 Nov 2011.



Annex A to JCT-VC report:
List of documents



Annex B to JCT-VC report:
List of meeting participants
The participants of the seventh meeting of the JCT-VC, according to a sign-in sheet passed around during
the meeting (approximately 284 in total), were as follows:

1. Daniele Alfonso (STMicroelectronics)
2. Rik Allen (Altera Corporation)
3. Elena Alshina (Samsung Electronics)
4. Peter Amon (Siemens AG)
5. Kenneth Andersson (LM Ericsson)
6. Pierre Andrivon (Technicolor)
7. Hirofumi Aoki (NEC Corporation)
8. Oscar Au (Hong Kong University of Science and Technology)
9. Cheung Auyeung (Sony Corp.)
10. Oguz Bici (Nokia)
11. Lazar Bivolarsky (Skype)
12. Gisle Bjøntegaard (Cisco Systems Norway)
13. Philippe Bordes (Technicolor)
14. Frank Bossen (DOCOMO Innovations, Inc.)
15. Stephen Botzko (Polycom)
16. Jill Boyce (Vidyo, Inc.)
17. Benjamin Bross (Fraunhofer HHI)
18. Madhukar Budagavi (Texas Instruments Inc)
19. Junwon Byun (Yonsei University)
20. Peisong Chen (Broadcom Corporation)
21. Weizhong Chen (Huawei Technologies Co., Ltd.)
22. Ying Chen (Qualcomm Inc.)
23. Wei-Jung Chien (Qualcomm)
24. Yi-Jen Chiu (Intel Corp.)
25. Seunghyun Cho (Electronics and Telecommunications Research Institute)
26. Dugyoung Choi (Chips&Media)
27. Hyomin Choi (Kwangwoon University (KWU))
28. Kiho Choi (Hanyang University)
29. In Suk Chong (Qualcomm)
30. Keiichi Chono (NEC)
31. Tzu-Der Chuang (MediaTek)
32. Takeshi Chujoh (Toshiba Corporation)
33. Muhammed Coban (Qualcomm Inc)
34. Robert Cohen (Mitsubishi Electric)
35. Thomas Davies (Cisco Norway (Tandberg))
36. Jan De Cock (Ghent University - IBBT)
37. Jie Dong (InterDigital Communications, LLC)
38. Alberto Duenas (Cavium, Inc.)
39. Alex Eleftheriadis (Vidyo, Inc.)
40. Semih Esenlik (Panasonic Corporation)
41. Pascal Eymery (Allegro DVT)
42. Xue Fang (Motorola Mobility)
43. Eyal Farkash (NDS)
44. Felix Fernandes (Samsung Telecommunications America)
45. David Flynn (BBC Research & Development)
46. Chad Fogg (Harmonic Inc.)
47. Rémy Foray (Allegro DVT)
48. Edouard Francois (Canon)
49. Akira Fujibayashi (NTT DOCOMO, Inc.)
50. Shigeru Fukushima (JVC KENWOOD Corporation)
51. Arild Fuldseth (Cisco Systems Norway)
52. Wen Gao (Huawei Technologies (USA))
53. Patrick Gendron (Thomson Video Networks)
54. Christophe Gisquet (Canon Research Centre France S.A.S.)
55. Hsan Guermazi (eBrisk video Inc.)
56. Laurent Guillo (IRISA/CNRS)
57. Thomas Guionnet (IRISA / INRIA Rennes)
58. Xun Guo (MediaTek (Beijing) Inc.)
59. Antti Hallapuro (Nokia)
60. Woo-Jin Han (Kyungwon University)



61. Jong-Ki Han (Sejong University)
62. Miska Hannuksela (Nokia Corporation)
63. Munsi Haque (Sony Corporation)
64. Shinobu Hattori (Sony Corp.)
65. Yong He (InterDigital Communications Corp)
66. Dake He (Research In Motion)
67. Yun He (The MIIT of P.R.China)
68. Tim Hellman (Broadcom Corporation)
69. Hendry Hendry (LG Electronics)
70. Anastasia Henkel (Fraunhofer HHI)
71. Felix Henry (France Telecom R&D)
72. Yingjie Hong (ZTE)
73. Sung-Wook Hong (Sejong University)
74. Michael Horowitz (eBrisk Video, Inc.)
75. Shih-Ta Hsiang (MediaTek)
76. Chih-Wei Hsu (MediaTek Inc.)
77. Yu-Wen Huang (MediaTek)
78. Wai Lam Hui (The Hong Kong Polytechnic University)
79. Atsuro Ichigaya (NHK (Japan Broadcasting Corporation))
80. Kwon Jae Cheol (KT Corporation)
81. Wonkap Jang (Vidyo Inc.)
82. Byeong Moon Jeon (LG Electronics)
83. Byeungwoo Jeon (SKKU)
84. Yongjoon Jeon (LG Electronics)
85. Rajan Joshi (Qualcomm)
86. Joel Jung (Orange Labs)
87. Jiwook Jung (LG Electronics)
88. Sungwook Jung (Korean Standards Association)
89. Jewon Kang (Nokia)
90. Jung Won Kang (ETRI (Electronics and Telecommunications Research Institute))
91. Marta Karczewicz (Qualcomm)
92. Damian Karwowski (Poznan University of Technology)
93. Kei Kawamura (KDDI Corp.)
94. Kimihiko Kazui (FUJITSU LABORATORIES LTD.)
95. Louis Kerofsky (Sharp Electronics)
96. Abdellatif Khindouf (ALLEGRO DVT)
97. Munchurl Kim (Korea Advanced Institute of Science and Technology)
98. Yonghoun Kim (Hanyang University)
99. Il-Koo Kim (Samsung Electronics Co., Ltd.)
100. Joohyeok Kim (Hanyang University)
101. Hui Yong Kim (ETRI)
102. Jongho Kim (Yonsei University)
103. Chanyul Kim (Samsung electronics. Ltd)
104. Hae Kwang Kim (Sejong University)
105. Jaeil Kim (Korea Advanced Institute of Science and Technology)
106. Kyung Yong Kim (Kyunghee University Media Lab.)
107. Jungsun Kim (LG electronics)
108. Heiner Kirchhoffer (Fraunhofer HHI)
109. Kenji Kondo (Sony Corporation)
110. Faouzi Kossentini (eBrisk Video Inc.)
111. Anand Kotra (Panasonic R&D Center Germany GmbH)
112. Jumpei Koyama (FUJITSU LABORATORIES LTD.)
113. Thomas Kunlin (STMicroelectronics)
114. Changcai Lai (Huawei)
115. Polin Lai (Samsung)
116. Jani Lainema (Nokia)
117. Guillaume Laroche (Canon Research Centre France S.A.S)
118. Fabrice Le Léannec (Canon Research Centre France S.A.S)
119. Ju Ock Lee (Sejong University)
120. Jae Yung Lee (Sejong University)
121. Snagyoun Lee (Yonsei University)
122. Wonjae Lee (Samsung electronics)
123. Jinho Lee (ETRI)
124. Bae-Keun Lee (KT)
125. Chulhee Lee (KCC/Yonsei)
126. Hoyoung Lee (Sungkyunkwan Univ.)
127. Sukho Lee (ETRI)
128. Jaeyong Lee (Kwangwoon University)



129. Tammy Lee (Samsung Electronics Corporation)
130. Sunyoung Lee (Pantech)
131. Shawmin Lei (MediaTek)
132. Xiang Li (MediaTek (Beijing) Inc.)
133. Guichun Li (Huawei Technologies Co. Ltd.)
134. Zhengguo Li (Institute for Infocomm Research)
135. Ming Li (ZTE Corporation)
136. Chongsoon Lim (Panasonic Corporation)
137. Jaehyun Lim (LG Electronics)
138. Sung-Chang Lim (ETRI)
139. Chun-Lung Lin (ITRI International)
140. Peter List (Deutsche Telekom)
141. Lingzhi Liu (Huawei Technologies USA)
142. Shan Liu (MediaTek)
143. Zhong Luo (Huawei Technologies)
144. Ajay Luthra (Motorola Mobility Incs)
145. Siwei Ma (Pek)
146. Detlev Marpe (Fraunhofer HHI)
147. Gaelle Martin-Cocher (Research in Motion)
148. Ikeda Masaru (Sony Corporation)
149. Masaaki Matsumura (NTT Corporation)
150. Shohei Matsuo (NTT Corporation)
151. Ken McCann (ZetaCast/Samsung)
152. Holger Meuel (Leibniz University Hannover)
153. Akira Minezawa (Mitsubishi Electric Corporation)
154. Koohyar Minoo (Motorola Mobility Inc.)
155. Kiran Misra (Sharp Corporation)
156. Kazuyuki Miyazawa (Mitsubishi Electric Corporation)
157. Seiji Mochizuki (Renesas Electronics Corporation)
158. Dhawal Moghe (Cable Television Labs)
159. Fulvio Moschetti (European Patent Office)
160. Marta Mrak (BBC)
161. Tokumichi Murakami (Mitsubishi Electric Corporation)
162. Matteo Naccari (British Broadcasting Corporation - Research and Development)
163. Sue Mon Thet Naing (Panasonic Corporation)
164. Sei Naito (KDDI Corporation)
165. Takayuki Nakachi (NTT)
166. Hiroya Nakamura (JVC KENWOOD Corporation)
167. Junghak Nam (Kwangwoon University (KWU))
168. Matthias Narroschke (Panasonic R&D Center Germany)
169. Tung Nguyen (Fraunhofer Gesellschaft)
170. Takahiro Nishi (Panasonic)
171. Andrey Norkin (Ericsson)
172. Jens-Rainer Ohm (RWTH Aachen University)
173. Patrice Onno (Canon Research Centre France S.A.S)
174. Simon Oudin (Fraunhofer HHI)
175. Krit Panusopone (Motorola Mobility)
176. Joonyoung Park (LG electronics)
177. Seungwook Park (LG Electronics)
178. Youngo Park (SAMSUNG ELECTRONICS Co., Ltd.)
179. Dongjin Park (Chips&Media)
180. Sang-Hyo Park (Hanyang University)
181. Jiho Park (KETI)
182. Jeonghoon Park (Samsung Electronics Co., Ltd.)
183. Nicolas Pellerin (ST microelectronics)
184. Wen-Hsiao Peng (ITRI International/NCTU)
185. Isabelle Perroux (ALLEGRO DVT)
186. Yinji Piao (Samsung Electronics)
187. Matthias Preiß (Fraunhofer Heinrich Hertz Institut (HHI))
188. Mohamad Raad (RaadTech Consulting)
189. Shevach Riabtsev (CSR)
190. Justin Ridge (Nokia Oyj)
191. Arturo Rodriguez (Cisco Systems)
192. Francisco Javier Roncero (Cavium, Inc.)



193. Christopher Rosewarne (CANON Inc)
194. Dmytro Rusanovskyy (Nokia)
195. Thomas Rusert (Ericsson AB)
196. James Russell (Altera Corporation)
197. Shinichi Sakaida (NHK)
198. Jesus Sampedro (Polycom)
199. Jonatan Samuelsson (Telefonaktiebolaget LM Ericsson)
200. Hisao Sasai (Panasonic)
201. Kazushi Sato (Sony Corp.)
202. Nicholas Saunders (Broadcast & Professional Research Labs, Sony Europe Ltd)
203. Ankur Saxena (Samsung Telecommunications America)
204. Thomas Schierl (Fraunhofer Heinrich Hertz Institute)
205. Heiko Schwarz (Fraunhofer Gesellschaft zur Förderung der angewandten Forschung (FGFF))
206. Andrew Segall (Sharp Corporation)
207. Shun-Ichi Sekiguchi (Mitsubishi Electric Corporation)
208. Stian Selnes (Cisco Systems)
209. Vadim Seregin (Qualcomm Incorporated)
210. Osman Gokhan Sezer (Texas Instruments Inc.)
211. Karl Sharman (Broadcast & Professional Research Labs, Sony Europe Ltd.)
212. Bz (bazhong) Shen (Broadcom)
213. Masato Shima (Canon Inc.)
214. Satoshi Shimada (FUJITSU LABORATORIES LTD.)
215. Donggyu Sim (Kwangwoon University)
216. Didier Siron (STMicroelectronics)
217. Rickard Sjöberg (Telefonaktiebolaget LM Ericsson)
218. Joel Sole (Qualcomm)
219. Eunyong Son (LG Electronics)
220. Jakub Stankowski (Poznan University of Technology)
221. Karsten Suehring (Fraunhofer HHI)
222. Kazuo Sugimoto (Mitsubishi Electric Corporation)
223. Toshiyasu Sugio (Panasonic Corporation)
224. Gary Sullivan (Microsoft Corporation)
225. Huifang Sun (Mitsubishi Electric Research Labs)
226. Jaewon Sung (LG Electronics)
227. Yoshinori Suzuki (NTT DOCOMO, Inc.)
228. Teruhiko Suzuki (Sony Corp.)
229. Vivienne Sze (Texas Instruments)
230. Ali Tabatabai (Sony Electronics)
231. Yoshitomo Takahashi (Sony Corporation, Sony City Osaki)
232. Seishi Takamura (NTT Cyber Space Laboratories, NTT Corporation)
233. Thiow Keng Tan (NTT DOCOMO, Inc)
234. Akiyuki Tanizawa (TOSHIBA Corporation)
235. Herbert Thoma (Fraunhofer IIS)
236. Ikai Tomohiro (Sharp Corporation)
237. Pankaj Topiwala (FastVDO LLC)
238. Matsunobu Toru (Panasonic)
239. Cong-Thang Truong (University of Aizu)
240. Yi-Shin Tung (MStar Semiconductor, Inc / ITRI International Inc.)
241. Kemal Ugur (Nokia)
242. Geert Van Der Auwera (Qualcomm Inc.)
243. Glenn Van Wallendael (Ghent University - IBBT - Multimedia Lab)
244. Jérôme Vieron (ATEME)
245. Viktor Wahadaniah (Panasonic Corporation)
246. Wade Wan (Broadcom Corporation)
247. Xianglin Wang (Qualcomm Inc.)
248. Yunfei Wang (Tsinghua University)
249. Jing Wang (Research In Motion Limited)
250. Dong Wang (ZTE Corporation)
251. Ye-Kui Wang (Qualcomm Incorporated)
252. Thomas Wedi (Panasonic)
253. Xing Wen (Hong Kong University of Science and Technology)
254. Stephan Wenger (Vidyo)
255. Thomas Wiegand (FGFF)
256. Mathias Wien (RWTH Aachen University)



257. Kwanghyun Won (Sungkyunkwan University (SKKU))
258. Stewart Worrall (Aspex Semiconductor)
259. Ping Wu (ZTE (UK) Ltd)
260. Jizheng Xu (Microsoft Corp.)
261. Tomoo Yamakage (Toshiba Corporation)
262. Tomoyuki Yamamoto (SHARP Corporation)
263. Sevlgi Yang (Yonsei University)
264. Haitao Yang (Huawei Technologies Co., Ltd.)
265. Yan Ye (InterDigital Communications)
266. Sehoon Yea (LG Electronics)
267. Chuohao Yeo (Institute for Infocomm Research)
268. Sang Yong Yi (Korea Aerospace University)
269. Tan Yih Han (Institute for Infocomm Research)
270. Peng Yin (Dolby Laboratories, Inc.)
271. Shibahara Youji (Panasonic Corporation)
272. Yue Yu (Motorola Mobility)
273. Yong Yu (Broadcom Corp)
274. Haoping Yu (Huawei Technologies (USA))
275. Lu Yu (Zhejiang University)
276. Yichen Zhang (Zhejiang University)
277. Wenhao Zhang (Intel Corporation)
278. Xingyu Zhang (Hong Kong University of Science and Technology)
279. Wen Zhang (ZTE)
280. Louis (Lei) Zhang (AMD)
281. Xiaozhen Zheng (Huawei Technologies)
282. Minhua Zhou (Texas Instruments Inc)
283. Xiaosong Zhou (Apple Inc.)
284. Hongbo Zhu (None)…
