Basics
is clicked, follow a URL hyperlink, etc.).
Security. Multimedia. Actions.

PDF 2.0   ISO 32000   Latest core PDF specification that applies to all PDF files. Fully vendor-neutral specification of every non-obsoleted PDF feature since PDF 1.0.
PDF/A     ISO 19005   Archival. PDF for the long-term preservation of static page appearance.
PDF/X     ISO 15930   eXchange. PDF for graphic arts and professional printing workflows including packaging and labelling.
PDF/UA    ISO 14289   Universal Accessibility. PDF supporting accessibility and assistive technology such as screen readers for those with vision impairments.
PDF/R     ISO 23504   Raster. Simplified PDF that uses banded images and that is easy to create by low-resource embedded devices such as consumer flatbed scanners. ISO version of PDF/raster.
PDF/VT    ISO 16612   Variable/Transactional for high-speed variable data (VDP) and transactional printing in the graphic arts and professional printing industries. Builds on PDF/X.
PDF/VCR   ISO 16613   Variable Content Replacement. Templatized PDF/X supporting late-stage variable content merging, such as adding batch numbers on pharmaceutical packaging.
PDF/E     ISO 24517   Engineering. PDF 1.6 based subset to support engineering-centric 3D PDF workflows such as aerospace and automotive engineering. Superseded by PDF/A-4.

PDF 1.2   1996   Prepress features: CMYK, spot color, halftoning, overprinting, OPI (Open Prepress Interface). Flate compression. New types of annotations. Interactive forms.
PDF 1.3   2000   DeviceN color. 2-byte CID fonts. Smooth shadings. More types of annotations. Large media. Page labels. Digital signatures. JavaScript. Alternate images. Masked images.
PDF 1.4   2001   Transparency and blend modes. Improved security. More prepress features. JBIG2 images.
PDF 1.5   2003   JPEG 2000. Layers (optional content). Tagged PDF. Object streams and cross-reference streams for better compression.
PDF 1.6   2004   OpenType. Ultra-large media. Watermarking. Visibility expressions. AES encryption. Interactive 3D with U3D. Measurement properties.
PDF 1.7   2006   Portable collections (packages). 3D enhancements. Redaction annotations. Standardized as ISO 32000-1:2008.
PDF 2.0   2017, 2020   ISO 32000-2. UTF-8 strings. 256-bit AES-CBC encryption. Unicode passwords. Black point compensation. Rich media annotations. PAdES. PRC for 3D. Geospatial features. Document parts. Associated Files. Metadata streams. Deprecation of older encryption and other legacy features, including XFA.
PDF 2.0 Extensions   256-bit AES-GCM encryption. Extensions to hash algorithms, elliptic curves, integrity protection via MACs. STEP and glTF for 3D. Clarification on PDF 1.7 and PDF 2.0 namespace inclusion.

AT                       Assistive Technology. Associated with PDF/UA and Tagged PDF.
BBox                     Bounding Box. A common key name.
Conformance level        Represented by letter designators after a PDF subset acronym (e.g., PDF/A-1b, PDF/X-5pg, PDF/VT-2s). Each Conformance Level has its own specialized set of rules and requirements.
COS                      Carousel Object Syntax. The syntax used by PDF and FDF files. "Carousel" was the codename for Acrobat 1.0.
Cross-reference stream   (PDF 1.5 and later only) Cross-reference information stored in a stream instead of a standard cross-reference (xref) table. Trailer dictionary entries are in the cross-reference stream dictionary.
Destination              An object defining a view of a document, comprising a page, the location of the document window on that page, and zoom factor.
Direct object            PDF object that occurs inline where it is defined and does not have its own object identifier (object number and generation number pair).
FDF                      Forms Data Format. File format to store interactive form data.
Hybrid-reference PDF     (PDF 1.5 and later only) PDF containing objects referenced by conventional cross-reference tables in addition to objects in object streams referenced by cross-reference streams.
Linearized PDF           Commonly referred to as Fast Web View.
obj                      Object abbreviation. A reserved PDF keyword.
Object stream            (PDF 1.5 and later only) A stream in which indirect objects may be stored, as an alternative to being stored in PDF body sections.
OCG                      Optional Content Group. A selectable “layer” of page content.
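The subset designators quoted in this sheet (PDF/A-1b, PDF/X-5pg, PDF/VT-2s) can be split into their parts mechanically. The following is a minimal Python sketch, not an official grammar: the regular expression, the `SubsetName` type and the `parse_subset_name` helper are assumptions made for illustration, based only on the pattern "subset letters, optional version number, optional lowercase conformance letters".

```python
import re
from typing import NamedTuple, Optional


class SubsetName(NamedTuple):
    """Parts of a subset designator (hypothetical helper type)."""
    subset: str             # e.g. "A", "X", "VT", "raster"
    version: Optional[int]  # e.g. 1, 5, 2; None when absent
    conformance: str        # lowercase letters, "" when absent


# Assumed pattern: "PDF/" + subset letters + optional "-<digits><lowercase letters>".
_PATTERN = re.compile(r"^PDF/([A-Za-z]+)(?:-(\d+)([a-z]*))?$")


def parse_subset_name(name: str) -> SubsetName:
    """Split a designator such as 'PDF/X-5pg' into subset, version and levels."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a recognised PDF subset designator: {name!r}")
    subset, version, conformance = match.groups()
    return SubsetName(subset, int(version) if version else None, conformance or "")


if __name__ == "__main__":
    for example in ("PDF/A-1b", "PDF/X-5pg", "PDF/VT-2s", "PDF/UA"):
        print(example, parse_subset_name(example))
```

The sketch deliberately ignores anything outside that pattern, such as a year suffix appended to a designator.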
Owner Password    Password with full (owner) access, including ability to change passwords and access permissions of the PDF document.
PAdES             PDF Advanced Electronic Signatures. ETSI standard EN 319 142.
Page Label        Optional descriptive text for referring to pages that can be shown on-screen (e.g., i, ii, iii, …, Chapter 1, Chapter 2, etc.). This contrasts with the zero-based integer page index used internally in PDF files.
startxref         Reserved PDF keyword that occurs just before the %%EOF end-of-file comment marker along with the byte offset to the cross-reference data for the PDF file (expressed as an integer in ASCII).
trailer           The trailer dictionary is required in every PDF and defines special objects (e.g., largest object number, the Document Catalog root). Also a reserved keyword.
User Password     Password with restricted access permissions (as set by an author).
Widget            A subtype of PDF annotation used with interactive forms that represents the GUI “widgets” through which data entry is done.
XFA               XML Forms Architecture. Proprietary XML-based specifications supporting dynamic forms. Deprecated in PDF 2.0.
XFDF              XML-based version of FDF defined by ISO 19444-1.
XMP               eXtensible Metadata Platform. XML-based metadata standard (ISO 16684) used by many file formats. Required by PDF subsets and PDF 2.0.
xref              Reserved PDF keyword that indicates the start of a standard cross-reference table. Often shorthand for “cross reference table”.

[Figure: anatomy of a subset name, using PDF/raster, PDF/X-5pg and PDF/A-4f as examples.]
PDF subset             Uppercase letters = ISO. Lowercase = Industry.
Version of subset      (optional)
Conformance Level(s)   (optional, lowercase letters)
A single PDF can conform to multiple subsets and conformance levels.

Common terms for PDF features
Bookmarks         Outlines which use Actions or Destinations.
Comments          Markup annotations.
Compression       Filters on streams. Object streams. Cross-reference streams.
Fast Web View     Linearization.
Files             Embedded Files (incl. media) and File Attachment annotations.
Forms             Widget annotations and Fields. Also referred to as AcroForm.
Hyperlinks        Link annotations, URI actions. Actions and Destinations.
JavaScript (JS)   ECMAScript for PDF (ISO 21757). ECMAScript Actions.
Layers            Optional Content (OC), Visibility Expressions. Marked Content.
Multimedia        3D, Movie, Screen, and RichMedia annotations with Actions.
Page size         The page MediaBox.
Portfolios        Also called Collections or Packages. A collection of embedded files.
Properties        Document Information dictionary and XMP Metadata streams.
Scanned PDF       Images of content such as from a scanner or camera. Often has OCR-ed invisible text on top of images, allowing text selection.
Security          Encryption, Crypt filters, and Digital Signatures.
Tags              Tagged PDF, including Marked Content and Logical Structure.

IANA Media Types
application/pdf    Registered Media Type for all PDF files. See RFC 8118.
application/fdf    Registered Media Type for FDF (Forms Data Format) files. See ISO 32000 for FDF file specification.
application/xfdf   Registered Media Type for XFDF (XML Forms Data Format) files. See ISO 19444-1 for XFDF file format specification.
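The startxref entry in this sheet describes a keyword followed by an ASCII integer byte offset, sitting just before the %%EOF marker. A minimal Python sketch of locating that offset is shown here. The function name `find_startxref` and the 1024-byte tail window are assumptions chosen for illustration, and the whitespace handling is simplified relative to a real parser.

```python
import re


def find_startxref(data: bytes, tail_size: int = 1024) -> int:
    """Return the byte offset that follows the last 'startxref' keyword.

    Only the final `tail_size` bytes are searched (an assumed window size).
    """
    tail = data[-tail_size:]
    # keyword, whitespace/EOL, ASCII integer, whitespace/EOL, %%EOF marker
    matches = re.findall(rb"startxref\s+(\d+)\s+%%EOF", tail)
    if not matches:
        raise ValueError("no startxref / %%EOF pair found near end of file")
    # With incremental updates there can be several pairs; the last one wins.
    return int(matches[-1])


if __name__ == "__main__":
    # Illustrative bytes only: this is not a complete, valid PDF file.
    sample = b"%PDF-1.7\n...\nstartxref\n416\n%%EOF\n"
    print(find_startxref(sample))
```

Taking the last match reflects that the most recent cross-reference data is the one referenced at the very end of the file.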
Resolved errata at https://pdf-issues.pdfa.org
Report errata at https://github.com/pdf-association/pdf-issues/
EOL     Any End-of-Line sequence (see above)
% …     PDF comments (starting from % to EOL) are treated as single white space

Token Delimiter symbols
(       Literal string start token
)       Literal string end token
<, <<   Hex string start token / dictionary start (<<) token
>, >>   Hex string end token / dictionary end (>>) token
[       Array start token
]       Array end token
/       PDF name
%       Comment to end-of-line (outside of a string or inside a content stream)
{       Only in Type 4 PostScript calculator functions
}       Only in Type 4 PostScript calculator functions

Real Number
    1.23
    -45.6
    +7.8
    -.9
    0.
    • Signed decimal floating-point numbers.
    • No exponential or scientific formats.
    • Integers can be used for real numbers.

String (literal string)
    (balanced () ok)
    (unbalanced \()
    (line \
    break)
    (line \nbreak)
    (octal \234 code)
    • Encrypted PDFs encrypt string objects.
    • Unicode strings with byte order markers.
    • Backslash escape sequences for literal strings:
        Sequence   Meaning
        \n         LF (0x0A)
        \r         CR (0x0D)
        \t         Horizontal Tab (0x09)
        \b         Backspace (0x08)
        \f         Formfeed (0x0C)
        \(         Left parenthesis
        \)         Right parenthesis
        \\         Backslash
        \ddd       Octal code. 1-3 digits

<hex-string>
    <48656c6c6F>
    <41424> % 0 added

[Figure: structure of a PDF document. Recoverable labels: Document catalog, Outline hierarchy, Outline entry, Structure tree, Structure element, Metadata, Names dictionary, Embedded files, JavaScripts, Interactive form, Collections, Document parts, Information.]
See Figure 5 in ISO 32000-2:2020

File Structure
(when not using cross-reference streams (PDF 1.5)) string types and controls
>>
encrypted using Filters.
stream
…stream data… • Always an indirect object. EI BI Do (immediate)
to the PDF file, allowing edits and changes without rewriting the full PDF. Link to previous PDF state is via /Prev entry in the trailer dictionary to previous xref.

Indirect Reference
    10 0 R
    • Object number then generation number.
    • Method to refer to another object.
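As a rough illustration of the literal-string escape sequences and the hex-string padding rule listed in the String rows of this sheet, the following Python sketch decodes both forms. The function names are hypothetical, the input is assumed to be the bytes between the outer delimiters, and dropping the backslash of an unrecognised escape is an assumption of this sketch rather than something this sheet states.

```python
# Escape letter -> byte value, following the Sequence/Meaning table.
_SIMPLE_ESCAPES = {
    ord("n"): 0x0A, ord("r"): 0x0D, ord("t"): 0x09,
    ord("b"): 0x08, ord("f"): 0x0C,
    ord("("): 0x28, ord(")"): 0x29, ord("\\"): 0x5C,
}


def decode_literal_string(body: bytes) -> bytes:
    """Decode the bytes found between the outer ( and ) of a literal string."""
    out = bytearray()
    i = 0
    while i < len(body):
        byte = body[i]
        if byte != 0x5C:                    # ordinary byte, not a backslash
            out.append(byte)
            i += 1
            continue
        i += 1                              # step over the backslash
        if i >= len(body):
            break                           # lone trailing backslash: ignored
        nxt = body[i]
        if nxt in _SIMPLE_ESCAPES:
            out.append(_SIMPLE_ESCAPES[nxt])
            i += 1
        elif 0x30 <= nxt <= 0x37:           # \ddd octal code, 1-3 digits
            j = i
            while j < len(body) and j < i + 3 and 0x30 <= body[j] <= 0x37:
                j += 1
            out.append(int(body[i:j], 8) & 0xFF)
            i = j
        elif nxt == 0x0D:                   # backslash + CR or CRLF: line continuation
            i += 2 if body[i + 1:i + 2] == b"\n" else 1
        elif nxt == 0x0A:                   # backslash + LF: line continuation
            i += 1
        else:                               # assumed: unknown escape keeps the character
            out.append(nxt)
            i += 1
    return bytes(out)


def decode_hex_string(body: bytes) -> bytes:
    """Decode the bytes found between < and > of a hex string."""
    digits = b"".join(body.split())         # drop embedded ASCII whitespace
    if len(digits) % 2:
        digits += b"0"                      # odd digit count: a 0 is added
    return bytes.fromhex(digits.decode("ascii"))


if __name__ == "__main__":
    print(decode_literal_string(rb"octal \234 code"))
    print(decode_hex_string(b"48656c6c6F"))
    print(decode_hex_string(b"41424"))
```

Both functions return raw bytes; interpreting them as text (for example via a byte order marker) is left out of the sketch.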