PDF/A in a Nutshell

Long Term Archiving with PDF


Olaf Drümmer, Alexandra Oettler, Dietrich von Seggern

■ Accessibility
■ Contracts and Forms
■ High-volume PDF/A creation
■ PDF/A with Acrobat 8 Professional
■ Scanning documents to create PDF/A
■ PDF/A from Microsoft Office 2003 and 2007

PDF/A Competence Center

Olaf Drümmer
o.druemmer@callassoftware.com
Alexandra Oettler
pdfakompakt@alexandra-oettler.de
Dietrich von Seggern
d.seggern@callassoftware.com

ISBN: 978-3-9811648-1-7

Bibliographic information published by Die Deutsche Bibliothek


Die Deutsche Bibliothek lists this publication in the Deutsche Nationalbibliografie.
Detailed bibliographic data is available at <http://dnb.ddb.de>.

This work and all its parts are protected by copyright. All rights, including translation, reproduction, presentation, use of illustrations and tables, radio
broadcasting, microfilming, any other means of replication, and storage in data processing systems, are reserved. This also applies to extracts. Any
replication of this work or of parts thereof, even in isolated cases, is only permissible in accordance with the currently valid version of the German
copyright legislation of September 9th 1965. A copyright fee must always be paid. Violations fall under the prosecution act of German Copyright Law.

© 2007 callas software GmbH, Berlin


Published by Association for Digital Document Standards ADDS – PDF/A Competence Center, Berlin – www.pdfa.org
Translation: © 2007 Association for Digital Document Standards ADDS – PDF/A Competence Center, Berlin

Printed in Germany

The use of general descriptive names, trade names, trademarks, and so on, in this publication, even if not specifically identified, does not imply that
these names are not protected by the relevant laws and regulations or that they can be used by anyone.
Layout, design, and composition:  Alexandra Oettler; Cover design:  Anja Godolt; Cover picture:  Sepp Huberbauer – photocase.com/de
Printing:  Galrev Druck- und Verlagsgesellschaft Hesse & Partner OHG
Preface
Our world is getting more digital by the day. A lot of information and documents only exist in digital form today, but will they still be legible “tomorrow”? That was the theme of an interesting TV show appropriately called “The Digital Disaster”. It began with cave drawings from the Stone Age and papyrus rolls from ancient Egypt, both of which have survived as documents for thousands of years. What documents from the 21st century will future generations be able to find and still read? But it’s happening much quicker than you may realize. I always carry a 3½ inch floppy disk in my pocket, and it demonstrates a lot of the problems of long-term archiving. It begins with the hardware: where can you buy a 3½ inch floppy disk today? And even if you find one, there’s a good chance that the disk is physically damaged. If these two hardware hurdles are successfully cleared, then what kind of software or document will we find on the floppy disk? Are the appropriate viewing and processing programs still available? And this example is a mere 15 years old!

My short anecdote leads us to the demands placed on the long-term archiving of documents. Electronic archiving is critical for businesses and organizations, because documents today often exist only in digital format. The length of time that business documents have to be archived varies from sector to sector and country to country, but some examples can help us to get an idea. Federal laws often require an archiving period of around 10 years. Banks and insurance companies demand that customer dossiers be retained for more than 50 years. In the engineering sector, archival periods of 100 years are common for aircraft, and bridges hopefully hold a whole lot longer.

Saving documents in proprietary formats for this length of time is really not a good idea. This leads to the second problem with the digital document world: many users already have a real “format zoo”, which can quickly become unmanageable (if it isn’t already so). Proprietary document formats have to be migrated on a regular basis so that newer versions of the processing software can still read them.

Employees working on customer dossiers aren’t really impressed when 10 different viewing programs are opened up at the same time. In some of the programs they might not even know how to navigate around in a document. In order to solve this problem, a document and archiving format is needed that guarantees the required long-term archiving period and offers the option of a single format type.

This is where PDF/A as an ISO standard for long-term archiving enters the stage. The “A” stands for “Archive”, and the PDF/A standard was specifically created for long-term archiving. It envisions a single PDF/A archive for all documents in an organization, from input through to output, and includes all of the areas in between.

You will find many more advantages to PDF/A on the following pages, written with the aim of converting the very formal ISO standard into a form that is easily understood and enhanced with practical examples. Since PDF/A resolves a lot of the critical problems that users have, the PDF/A Competence Center was formed as an association with the aim of providing information about PDF/A, promoting the distribution of the standard, and acting as a central point of contact for your questions dealing with PDF/A. We hope that this booklet gives you a good overview and introduction to PDF/A, and also helps as a motivator for implementing the standard.

Berlin, September 2007
Thomas Zellmann,
Chairman, PDF/A Competence Center

PS: A special thanks goes out to our member callas software GmbH, who initiated the German version of this booklet and provided it to the PDF/A Competence Center for translation into English and for further distribution.

Preface
Throughout history, it has always been important to preserve our past for future generations. Until the last 20 years, in our paper-centric world, this was a fairly easy task. One would simply take the folders of papers or other objects that were to be preserved and send them off to an archive for safekeeping or place them in a fire-retardant container. With electronic documents this task is not as easily approached, which is how PDF/Archive, or PDF/A, came into being.

PDF/Archive addresses the growing need to electronically archive documents in a way that ensures preservation of their contents over an extended period of time. Additionally, it ensures that the documents can be retrieved and rendered with a consistent and predictable result each time they are viewed.

AIIM, the Enterprise Content Management Association, and NPES, The Association for Suppliers of Printing, Publishing and Converting Technologies, were approached by numerous organizations that were faced with the need to preserve large quantities of electronic documents over long periods of time. After reviewing the options of maintaining this electronic history in TIFF, XML, native format, or PDF, it was decided that PDF would be the best format, as it would enable the accurate rendering of the document as it had been intended to be displayed. However, in order to ensure the long-term preservation of the electronic documents, PDF would need to be enhanced slightly.

The joint effort of AIIM and NPES brought together the document and content management experts with the graphics experts who had already developed the PDF/X family of standards. When we announced the proposed work to develop a subset of PDF tags for long-term preservation of electronic documents, we were overwhelmed by the interest in participating from virtually every area of the world.

With its expertise as an accredited standards developer and as the secretariat of ISO TC 171, Document Management Applications, and ISO TC 171 SC2, Document Applications, AIIM brought to the project the means for gaining ISO approval and wider adoption of the standard. ISO 19005-1, Document management – Electronic document file format for long-term preservation – Part 1: Use of PDF 1.4 (PDF/A-1), became an approved ISO standard within 22 months of its introduction as a new project through the dedicated efforts of many records managers, archivists, software developers, and end users.

While adoption of the standard has been a little slower than we had anticipated, we are encouraged by the continuing interest in and growing adoption of the standard. This book, along with the continuing efforts of AIIM and the PDF/A Competence Center, will continue to increase the adoption rate of PDF/A in the industry.

Silver Spring, September 2007
Betsy Fanning
AIIM, Director, Standards

Table of Contents

Durable documents with the PDF/A standard
  Open files are not always complete
  TIFF as an archive format
  PDF data containers
  Why PDF/A and not PDF?
  The introduction of the PDF/A standard
  How to create archive PDFs
  Who stands to benefit from PDF/A?
  Table: Comparison between PDF/A-1a and PDF/A-1b
  Overview: Which file formats are suitable for archiving?
  Is XPS an alternative to PDF/A?

PDF/A creation: Analog, digital, and mass processing
  PDF/A from scanned documents
  Scanning options in Acrobat 8 Professional
  Converting pages that have already been scanned to PDF/A
  The Distiller engine
  PDF/A document generation using the Distiller
  Office and administration
  PDF/A in Office 2007
  Office 2003 and the PDFMaker
  PDF/A using the 3-Heights PDF Producer
  PDF/A ‘en masse’
  PDF/A ‘from nothing’
  Creating PDF/A from print data streams

From PDF to PDF/A: Converting PDFs to archive PDFs
  PDF/A generation with Preflight
  Converting PDF to PDF/A with pdfaPilot

Is this really a PDF/A file? PDF/A validation
  Validation with Preflight
  pdfaPilot PDF/A

Archive PDFs in everyday life: What issues might arise?
  Images
  Resolution is not part of the PDF/A standard
  Permitted and prohibited compression types
  Transparency
  Colors
  Fonts
  Metadata
  PDF/A and metadata
  Accessibility
  Creating an accessible PDF file from Word
  Interactive PDF files
  Comments and annotations
  Forms
  Embedding fonts for PDF/A forms
  PDF/A for design drawings
  Electronic signatures
  Security levels
  Digital signatures in PDF with Acrobat
  Challenges in practice

The outlook: PDF/A in the future
  Enhancements in PDF/A-2
  Looking towards PDF/A-3
  PDF/A-1 developments
  PDF/A in one hundred years time

What the error messages mean
  Preflight results and troubleshooting for PDF/A

Glossary
  Explanation of terms relating to PDF/A

About:
  The PDF/A Competence Center
  AIIM

Illustrations: PixelQuelle.de; photocase.com/de; Sepp Huberbauer – photocase.com/de

1. Durable documents with the PDF/A standard

There are certain documents that people want to keep because of their sentimental value: love letters, photographs of their first day at school, or holiday snaps, for example. Other documents have to be kept for legal reasons. These documents include birth certificates, academic certificates and reports, invoices that are needed for tax purposes, insurance documents, and contracts.

In the days when everything existed on paper, in the pre-digital era, the main problem was remembering which index file, folder, or shoe box you’d used to store your letters or contracts. In today’s world of digital documents, the task of archiving is fundamentally different. Thanks to search functions or database solutions, even the most forgetful of us can easily find a particular document or photo on our computers. In addition, any possible space problems can be solved simply by purchasing additional storage. However, there are certain risks and uncertainties that might influence the shelf life of digital documents. These risks do not arise only from the physical durability of the data carriers used, although it is clear that magnetic tape, CD-ROMs, and DVDs will not necessarily last any longer than paper and ink. Photographic prints dating from 1900 still exist today, but it’s debatable whether we will similarly be able to view, in 2107 for example, the millions of digital snapshots being taken and stored on mobile phone memory cards all over the world.

In addition to the restrictions imposed by the limited lifetime of data carriers, the document format and software used also present a considerable challenge for the durability of electronic documents.

Illustration: Markus Imorde – photocase.com/de

Yesterday’s, today’s, and tomorrow’s software
It’s a common problem: opening old documents in brand-new programs doesn’t always work. The rate of success for the opposite direction (new documents in old programs) is even less encouraging. Software developers do try to achieve backward compatibility that enables files that are, say, five years old to be opened using a current program release. However, this can change the layout and page rendering, meaning that not everything is displayed exactly as it ought to be. More recent software tends to generate documents with additional features that older versions may not be able to display. In some cases, it is not even possible to open current files in previous versions of a program. For example, whereas a Microsoft Word 95 file can normally be opened in Word 2003, it is not possible to open a Word 2003 document in Word 95.

Because software production cycles are becoming ever shorter (one major release per year is not unusual), the challenge that arises from new program developments is greater than that caused by the aging of storage media. The successful long-term archiving of digital files is at least as threatened by the constant rollout of new program versions as by damaged data or data carriers.

Open files are not always complete
File formats are not all equally suitable for the long-term, secure archiving of content. If a file format cannot store all the elements required for the complete display of content (graphics and fonts as well as text), then stumbling blocks cannot be ruled out when the file is used later on. If, for example, the program used cannot find linked external images, a page cannot be displayed as required. Instead, the frame where the image should appear displays only a rough preview of the image or a question mark. The problem of open files for which not all illustrations and fonts are available has been causing irritating delays for printers and their suppliers for a long time. However, the introduction of PDF, a format that can store all the components required for a printed document, has greatly simplified work in this area. In addition, layout files such as XPress or InDesign files are now becoming increasingly less common in printers’ archives. Instead, printers are storing the actual PDF documents that were used for the printing task.

TIFF as an archive format
For a long time, many public authorities and companies that need to store large quantities of correspondence, records, invoices, contracts, and similar information in digital archives have been using the pixel image format TIFF (Tagged Image File Format). This format digitizes originals containing text and images pixel by pixel. TIFF is an established image file format that has both advantages and disadvantages. Pixel-based formats store the appearance of the originals. Problems with missing graphics and fonts do not occur, since the format stores all of the elements of the original as an image. Since TIFF is widespread and is subject to few file handling complications when upgrading to a new program version, many users believe that the future of the format is guaranteed. However, while TIFF may indeed be a de facto standard, it is not an official standard for safe archiving. Other disadvantages include the relatively large file size and the fact that scanned texts cannot be searched without OCR (text recognition), since this format converts them to image elements.

TIFF-G4, a black and white TIFF variant that works with a compression method developed for fax technology, is commonly used for archiving.
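
The file size disadvantage of pixel-based archiving can be made concrete with a rough calculation. The following sketch estimates the uncompressed pixel data for a single scanned page. The A4 page size, the 300 dpi resolution, and the function name are assumptions chosen for illustration and do not come from the text; real TIFF files, TIFF-G4 in particular, are compressed and therefore smaller.

```python
def uncompressed_scan_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Estimate the raw pixel data of one scanned page (no header, no compression)."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    # Each pixel row is padded to a whole number of bytes
    row_bytes = (width_px * bits_per_pixel + 7) // 8
    return row_bytes * height_px


A4_INCHES = (8.27, 11.69)  # assumed page size, not taken from the text

for label, bits in [("black and white (1 bit)", 1),
                    ("grayscale (8 bit)", 8),
                    ("RGB color (24 bit)", 24)]:
    size = uncompressed_scan_bytes(*A4_INCHES, dpi=300, bits_per_pixel=bits)
    print(f"{label}: {size / 1_000_000:.1f} MB")
```

Under these assumptions a single page comes to roughly 1.1 MB in black and white, 8.7 MB in grayscale, and 26.1 MB in color before compression, which illustrates why archives holding millions of scanned pages pay close attention to the storage format.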


PDF data containers
The development of PDF (Portable Document Format), which Adobe Systems has been promoting since 1993, has significantly simplified data management and exchange for a great number of users from completely different fields. PDF allows obstacles that can arise during the transmission or storage of files to be neatly avoided.

■■ PDF files can be opened on all established operating systems. Free PDF readers are available for all of the important platforms including Windows, Apple, Linux, and mobile devices.

■■ With this format, the document layout is true to the original. Since PDF can incorporate different types of content such as text (and the relevant fonts), images, and graphics, nasty surprises relating to missing illustrations or incorrect fonts, like those that occur when Word documents are opened on another computer, are not usually a problem.

■■ PDF is an open format. This means that companies other than Adobe Systems (who invented PDF) can develop software for creating or displaying PDF. The ‘release’ of PDF by Adobe has brought independence for both users and developers and, as a result, there is a high probability that there will still be programs for generating and displaying PDFs in decades to come.

So, can users who want to keep documents such as contracts or invoices for long periods of time trust in PDF to make sure that their documents will work just as well in ten, fifteen, or one hundred years’ time as they do today? It might well be the case that PDF files created today will still work without any significant problems in 2017. However, only the new PDF/A standard can guarantee that users will be able to view exactly the same content as when their documents were created. This format brings the kind of legal certainty that can be decisive in many business and administrative contexts.

One of the reasons for the global success of PDF must be considered to be the free availability of the Adobe Reader. This PDF viewer is available for download from Adobe Systems’ Web site in many language versions for numerous commonly used platforms: www.adobe.com/products/acrobat/readstep2.html

Why PDF/A and not PDF?
Why has a special PDF standard now been defined for archiving documents? Are traditional PDF documents not ‘good enough’ for long-term archiving? PDF has some excellent characteristics that lend themselves to the creation of archive documents. Like a container, a PDF can incorporate completely different elements such as text, images, and fonts. In addition, it reproduces layouts that are true to the original and is cross-platform capable. However, certain requirements must be met in order to enable the exact reproduction of content.

■■ Required: One ‘must’ is that users require full access to all elements belonging to a document. For example, fonts must be embedded; a link to the font in question is not sufficient. If a font were only linked and, in 10 years’ time, a user who tries to open the document did not have the required font on his or her computer, special characters or symbols would not be displayed correctly.

■■ Prohibited: In addition, some PDF features must be avoided. Such elements are prohibited because they would undermine the required document durability, and include interactive elements and PDF layers. These features inhibit the unambiguity that is required from an effective PDF/A file. For example, in the case of a PDF document with layers, users printing it out in 50 years’ time might well ask themselves which layers are valid and which are not. This kind of decision needs to be made now, when the PDF is created.

A PDF/A document is basically a traditional PDF document that fulfills precisely defined specifications. In order to prevent users from repeatedly having to test and discuss the best appearance of a well-functioning archive PDF, industry experts decided in 2002 to work together to develop the PDF/A standard.

The introduction of the PDF/A standard
The PDF/A standard for long-term archiving was adopted by ISO (International Organization for Standardization) in autumn 2005. The PDF/A standard was published with the number ‘ISO 19005-1:2005’ and is based on PDF specification 1.4. An additional part, PDF/A-2, is currently being prepared. This part will refer to PDF version 1.7.

The PDF/A standard aims to enable the creation of PDF documents whose visual appearance will remain the same over the course of time. These files should be software-independent and unrestricted by the systems used to create, store, and reproduce them. As far as PDF/A is concerned, practice soon caught up with theory. While Acrobat Professional 7 contained only ‘draft’ PDF/A functions, Acrobat 8, which has been available since the end of 2006, now offers creation and verification features that comply with the adopted standard.

Many new PDF/A tools and solutions for creating and verifying files have entered the market since the introduction of the standard, from ‘small’ tools for individual users who want to create PDF/A documents every now and again to extensive server solutions that can create a hundred thousand archive documents from databases in just a few hours.

ISO is an international organization for standardization, active primarily in technical and electronic fields. The PDF/A standard was developed by industry and development experts.

PDF/A Competence Center: International companies and experts from the field of PDF technology have joined forces to form the PDF/A Competence Center. It aims to promote the exchange of information and experiences relating to long-term archiving. Users can visit www.pdfa.org for up-to-date advice and background information as well as a discussion forum on PDF/A.

PDF/A has two levels of compliance:

PDF/A-1a (Level A) applies to semantic correctness and structure. Each character must have a Unicode equivalent. The structure is expressed by tags.

PDF/A-1b (Level B) applies to visual integrity.

Any file that meets the requirements for PDF/A-1a will also comply with PDF/A-1b, which is less strict.

PDF specifications:

Since it was introduced at the start of the 1990s, the PDF file format has been in a state of constant development. The current PDF specification is version 1.7, which was introduced with Acrobat 8. Today, it is extremely rare to come across PDF files with a version number lower than 1.3, and modern PDF generation programs only have backward compatibility to version 1.3 at the most.

With each PDF version, Adobe Systems publishes a reference that describes the features and functions of the version in detail. The specification history contains ‘milestones’, that is, important features that were introduced with the new version. Some of these milestones are listed below.

Acrobat 1 (1993, PDF 1.0): PDF 1.0 incorporates most of the functions offered by the page description language PostScript Level 2. All basic functions for text, vector graphics, and raster graphics are available.

Acrobat 2 (1994, PDF 1.1): This version supports the Lab color space and CalRGB. It also supports TrueType fonts.

Acrobat 3 (1996, PDF 1.2): This version enables color separation and supports Unicode and CID fonts (Chinese, Japanese, and Korean). It also supports ZIP compression.

Acrobat 4 (1999, PDF 1.3): PDF 1.3 contains the complete PostScript Level 3 graphics model. It enables multichannel color spaces (DeviceN) and supports ICC profiles for the reliable reproduction of colors. It introduces smooth shades and page geometry boxes, which are useful for prepress processes (TrimBox, CropBox, and BleedBox).

Acrobat 5 (2001, PDF 1.4): From this version, PDF files can contain transparency. This version also introduces ‘tagged PDF’ (= structured PDF), which enables content accessibility. The security options are enhanced with this version. In addition, the image compression type JBIG2 is supported.

Acrobat 6 (2003, PDF 1.5): With this version, PDF documents can contain layers (also called ‘optional content’). JPEG2000 image compression is supported.

Acrobat 7 (2004, PDF 1.6): This version supports OpenType fonts. With this version, 3D content can be inserted. Users can create virtual page sizes with edges of up to 381 km in length.

Acrobat 8 (2006, PDF 1.7): Unicode path specifications simplify the correct specification of links, even across international language systems. The new Acrobat ‘PDF packages’ function allows several independent PDF documents to be forwarded in a single file. The recipient requires Acrobat or Reader 8.
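
Both the PDF version and the claimed PDF/A conformance level are recorded in the file itself, which makes a quick first look possible. The sketch below reads the `%PDF-x.y` marker at the start of a file and searches for the `pdfaid:part` and `pdfaid:conformance` entries that a PDF/A file carries in its XMP metadata. The function names are illustrative assumptions, and the byte-level search is a deliberate simplification: it reports only what a file claims, it can miss metadata that is stored in a compressed stream, and it ignores a version override in the document catalog. It is not a substitute for validation with a tool such as Preflight or pdfaPilot.

```python
import re

HEADER_RE = re.compile(rb"%PDF-(\d)\.(\d)")
# XMP permits element form (<pdfaid:part>1</pdfaid:part>)
# as well as attribute form (pdfaid:part="1")
PART_RE = re.compile(rb"pdfaid:part(?:>|=[\"'])\s*(\d)")
CONF_RE = re.compile(rb"pdfaid:conformance(?:>|=[\"'])\s*([A-Za-z])")


def header_version(data: bytes):
    """Return the version from the %PDF-x.y marker, e.g. '1.4', or None."""
    match = HEADER_RE.search(data[:1024])  # the marker sits at the start of the file
    if match is None:
        return None
    return f"{match.group(1).decode()}.{match.group(2).decode()}"


def claimed_pdfa_level(data: bytes):
    """Return the claimed level such as 'PDF/A-1b', or None if no claim is found."""
    part = PART_RE.search(data)
    conf = CONF_RE.search(data)
    if part is None or conf is None:
        return None
    return f"PDF/A-{part.group(1).decode()}{conf.group(1).decode().lower()}"


if __name__ == "__main__":
    # Tiny hand-made sample for demonstration, not a complete PDF file
    sample = (
        b"%PDF-1.4\n"
        b"<rdf:Description xmlns:pdfaid='http://www.aiim.org/pdfa/ns/id/'>"
        b"<pdfaid:part>1</pdfaid:part>"
        b"<pdfaid:conformance>B</pdfaid:conformance>"
        b"</rdf:Description>\n"
    )
    print(header_version(sample), claimed_pdfa_level(sample))
```

For a real file, the bytes would be read with `open(path, "rb").read()`. A result of `None` from `claimed_pdfa_level` means only that no claim was found by this simple search; a result such as `PDF/A-1b` means only that the file makes the claim, which still has to be verified.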


How to create archive PDFs
There are many different conditions that might be encountered when creating PDF/A files. The process differs depending on whether existing PDF documents are already available or whether they need to be generated from working files such as Word or PowerPoint files.

■■ PDF/A files from files or data: This field relates to newly created PDF files from applications including word processing, image editing, and layout programs. The process can be realized by means of a PDF export from the source program, Acrobat Professional, Distiller, or other PDF converters. For the mass conversion of content to PDF/A, there are program modules that can convert database content or print data streams to the PDF/A format.

■■ Converting scanned paper documents to PDF/A: Often, documents that exist only on paper, such as contracts, invoices, and books, need to be digitized using a scanner. Over the past years, the results of the scanning process have usually been stored as bitmap TIFFs. However, PDF is increasingly being used for scanned documents, and before long the majority of scanned files will probably be stored directly as PDF/A files. For example, users can scan paper documents using Acrobat 8 Professional and save them as PDF/A files. It is often possible to make the text in a PDF/A file searchable using the text recognition function (OCR). Images and historical documents can also be scanned for conversion to PDF/A. Solutions and services for mass processing are available for users who wish to scan a large number of pages or documents.

■■ Creating PDF/A from PDF: Many users already have PDF documents that are not PDF/A-compliant. It is often not possible to recreate such documents from the source program because, for example, they were not created locally but were sent to the user in question by e-mail. There are several methods for converting PDFs to PDF/As. Acrobat 8 Professional is one of the applications that can be used. However, Adobe is not the only company to market software for this particular task. There are many different products on the market, ranging from single-user solutions to systems for high throughput.

■■ Is this really a PDF/A file? When working with PDF/A on a daily basis, file verification is also important. Is it sensible to believe the sender of a PDF document when he or she says that it’s a PDF/A file? Before received files are saved in an archive, they must be checked to make sure that they are PDF/A-compliant. There are various tools that enable file verification: in addition to Acrobat 8 Professional, there are other applications including Berlin-based callas software’s pdfaPilot, which enables the verification and creation of PDF/A files as well as providing some additional functions.

Figure: Converting PDF to PDF/A: Acrobat 8 Professional provides an export function for PDF/A in addition to other formats.

Figure: Converting paper documents to PDF/A: Scanned documents can be automatically given searchable text following digitization. Text recognition software is used for this.
Illustration: Dirk Herold – photocase.com/de

Figure: PDF/A validation with Preflight: The Preflight validation and correction tool is part of Acrobat 8 Professional. It generates PDF/A files and checks existing PDF/A documents to make sure that they comply with the standard.

Who stands to benefit from PDF/A?
Many sectors and professions have been waiting for a PDF standard for archiving. It is useful not only for archives, administrative departments, industry, and commerce but also for research and teaching. Many different types of content can be saved as PDF/A files. Below are a few randomly selected examples from various fields.

■■ Saving e-mails as PDF/A: Today, more and more correspondence, some of it of a contractual nature, is being sent by e-mail. Anyone who has switched from one e-mail program to another knows the difficulties involved in transferring old mail to the new system. Since PDF/A is a safe format, it makes sense to save e-mail archives on back-up media in the form of PDF/A at regular intervals.

■■ Saving brochures, manuals, and information sheets as PDF/A: Many companies and public authorities already make a large quantity of information available in the form of PDF downloads. Why not create these documents in the future-proof PDF/A format straight away and distribute them as PDF/As?

■■ Plans, maps, and design drawings: Digital maps, architectural drawings, and construction plans can also be used by future generations if their creator uses PDF/A. The law stipulates that certain engineering plans, such as construction plans for bridges, must be kept for several decades.

■■ Signed digital contracts: An increasing amount of business correspondence is sent electronically. PDF/A documents can be digitally signed to enable legally effective contracts to be concluded using only digital means.

■■ Correct colors in image documents: PDF/A also enables the accurate display of colors, an important advantage when working with digital image data. For example, if a museum or gallery already has digital images of exhibits, they can be converted to PDF/A files and benefit from the advantages of the archiving format. Digital imaging procedures are becoming increasingly important in medical diagnostics, meaning that PDF/A could be useful in this field, too.

■■ Accessible PDF files: In America, accessibility in the digital world has been an issue for a long time, especially for the Internet. Enabling the accessibility of information to visually impaired members of society is now also on the agenda in Europe. Since PDF/A specifically supports structured content in PDF documents, it is ideal for processing accessible PDF documents that can be read out by screen readers.

■■ Storing print documents as PDF/A: Printers and prepress companies will be relieved to hear that the PDF/X standard that is widespread in their industry sectors is completely compatible with the new PDF/A standard. A PDF document can be simultaneously PDF/X and PDF/A-compliant.

Although the PDF/A standard was intended for long-term archiving, it also has advantages that can be appreciated immediately. Anyone who passes information and documents on to customers, readers, or partners can benefit from PDF/A. PDF/A guarantees data consistency. This rules out problems such as non-embedded fonts, which can result in gobbledegook. Color management functions correct images that are too pale or too bright. In addition, PDF/A prevents many processing problems that can occur with password-protected PDF documents or when printing files.

Consequently, people who use PDF/A are not only doing themselves a favor: they are also helping file recipients, since PDF/A eliminates many of the problems that can otherwise be encountered right from the start.

Figure: Design drawings: Historical, hand-drawn plans can survive for centuries. A standard such as PDF/A is required to make sure that digital plans are also viable in the long term.

Figure: Font problems caused by non-embedded fonts: PDF/A prevents problems with fonts such as the inconsistent tracking illustrated in the second line above.


Table: Comparison between PDF/A-1a and PDF/A-1b


ISO has split the PDF/A standard into two levels of conformity. PDF/A-1a (Level A) requires documents to be capable of being reproduced without visual ambiguity, demands that text can be depicted in Unicode, and stipulates that the content structure must be correct. PDF/A-1b-compliance (Level B) only requires the reproduction of documents without visual ambiguity.

Conformance
  ISO 19005-1:2005, PDF/A-1a (Level A): Complete PDF/A conformance
  ISO 19005-1:2005, PDF/A-1b (Level B): Restricted PDF/A conformance

Aim
  Level A: To produce archive PDFs with full access to all content
  Level B: To produce archive PDFs that only guarantee visual reproducibility

PDF version
  Both levels: PDF 1.4

PDF/A identification
  Both levels: Users are obliged to state the PDF/A identifier and the level of compliance

Metadata
  Both levels: Specifications such as author, document title, creation date, and source program must be XMP-compliant

Logical structure
  Level A: Structure and accessibility must be realized using tags and alternate image descriptions and by stating the language used
  Level B: There are no explicit logical structure requirements

Encryption
  Both levels: Security settings are prohibited. It must be possible to open/process the PDF file in question without requiring a password.

Colors
  Both levels: All colors must be identified. Device-dependent color spaces must be identified by output intent.

Transparency
  Both levels: Not permitted

PDF layers
  Both levels: Not permitted

Compression
  Both levels: LZW compression not permitted
  Both levels: JPEG2000 compression not permitted

Fonts
  Both levels: All fonts must (at least as subsets) be available (embedded) directly in the PDF document in question
  Both levels: Mapping of character codes to glyphs must be unambiguous
  Level A: Each letter must have a Unicode equivalent
  Level B: No such requirement

Annotations
  Both levels: Comments that take the form of sound or movies are not permitted. Traditional text/label-style annotations are permitted.

Referenced content
  Both levels: Referenced (non-embedded) images or page content are not permitted

Alternate images
  Both levels: Alternate images (for lower-resolution screen display) are not permitted

Programming languages
  Both levels: Embedded JavaScript is not permitted

Actions
  Both levels: Certain actions, such as opening movies or sound files or sending or resetting forms, are not permitted

Forms
  Both levels: Permitted, but with restrictions
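The ‘PDF/A identification’ requirement in the table can be made concrete with a small sketch. In PDF/A-1 files, the identifier and the conformance level are stored in the XMP metadata as the properties pdfaid:part and pdfaid:conformance. The function name read_pdfa_claim and the sample packet are illustrative assumptions, and the sketch assumes the usual ‘pdfaid’ namespace prefix. It only reads what a file claims about itself and does not validate the file.

```python
import re

# Namespace used by the PDF/A identification schema in XMP metadata.
PDFAID_NS = "http://www.aiim.org/pdfa/ns/id/"


def read_pdfa_claim(xmp_packet: str):
    """Return (part, conformance) declared in an XMP packet, or None.

    XMP allows a property to be written either as an XML attribute
    (pdfaid:part="1") or as an element (<pdfaid:part>1</pdfaid:part>),
    so both spellings are accepted. Assumes the prefix is 'pdfaid'.
    """
    if PDFAID_NS not in xmp_packet:
        return None

    def find(name):
        pattern = (
            r'pdfaid:%s\s*=\s*["\']([^"\']+)["\']'      # attribute form
            r'|<pdfaid:%s>\s*([^<\s]+)\s*</pdfaid:%s>'  # element form
        ) % (name, name, name)
        match = re.search(pattern, xmp_packet)
        if match is None:
            return None
        return match.group(1) or match.group(2)

    part, conformance = find("part"), find("conformance")
    if part is None or conformance is None:
        return None
    return part, conformance.upper()


# Hypothetical XMP fragment of a file that claims PDF/A-1b conformance.
sample = """<rdf:Description rdf:about=""
    xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/"
    pdfaid:part="1" pdfaid:conformance="B"/>"""

print(read_pdfa_claim(sample))
```

A file that states part 1 and conformance B is only claiming Level B. Whether it really meets the other requirements in the table has to be checked with a validation tool such as Preflight.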


Overview: Which file formats are suitable for archiving?
How are documents normally archived in private home offices, company offices, public authorities, and organizations?

Many users simply store the original documents (for example, Word, Excel, or PowerPoint files). This can lead to nasty surprises with regard to reliable displayability and the future usability of the files. This method of archiving is therefore not recommended.

TIFF-G4 has been the de facto standard for numerous companies and administrative departments for many years. TIFF-G4 files are monochrome, black and white TIFFs (bitmaps) that can be archived in a way that saves space thanks to compression features based on fax technology. The JPEG image format is commonly used for color documents. This format produces relatively small file sizes in spite of the colors.

The PDF/A standard and XPS (XML Paper Specification), which was developed at the end of 2006 by Microsoft, are relatively new formats. Both of these formats offer facsimile quality as well as supporting structured content, thereby allowing the reliable and complete indexing of text. PDF/A is already standardized, but XPS still has to stand the test of time.

The table below gives an overview of the long-term archiving features offered by these formats.

Figure: File sizes by format. PDF/A: 96 KB, XPS: 112 KB, TIFF-G4: 56 KB, JPEG: 88 KB, DOC (Word): 120 KB.


ISO standard for archiving
  PDF/A: ✔ Yes
  XPS: ✘ No
  TIFF-G4: ● De facto standard, but not an official standard
  JPEG: ✘ No
  DOC (Word): ✘ No

Facsimile quality
  PDF/A: ✔ Yes
  XPS: ✔ Yes
  TIFF-G4: ✔ Yes
  JPEG: ✔ Yes
  DOC (Word): ✘ No

Font security
  PDF/A: ✔ Yes. Maximum level of security thanks to strictly defined specifications in the PDF/A standard
  XPS: ✔ Yes. Fonts are embedded
  TIFF-G4: ● No fonts exist, since the files are pixel images.
  JPEG: ● No fonts exist, since the files are pixel images.
  DOC (Word): ✘ No. The display of fonts can vary on different computers. Users are not told about missing fonts. Users are not told about replaced fonts.

Searchable text
  PDF/A: ✔ Yes. Enabled by means of OCR, even for text on scanned pages
  XPS: ✔ Yes
  TIFF-G4: ● Possible if searchable text is generated using OCR. There is no standard procedure for storing the text in the TIFF-G4 file.
  JPEG: ✘ No
  DOC (Word): ✔ Yes

Consistent colors
  PDF/A: ✔ Yes. Consistent colors are required by the standard
  XPS: ● Possible
  TIFF-G4: ✘ No. Produces black and white bitmaps
  JPEG: ● Possible
  DOC (Word): ✘ No

Images and graphics are fixed document parts
  PDF/A: ✔ Yes
  XPS: ✔ Yes
  TIFF-G4: ✔ Yes. They are incorporated into the pixel image
  JPEG: ✔ Yes. They are incorporated into the pixel image
  DOC (Word): ✘ No. Cannot always be handled safely

Structured data
  PDF/A: ✔ Yes, with PDF/A-1a with tagged PDF
  XPS: ✔ Yes. With XML
  TIFF-G4: ✘ No
  JPEG: ✘ No
  DOC (Word): ● Possible

Cross-platform capable
  PDF/A: ✔ Yes
  XPS: ✘ No
  TIFF-G4: ✔ Yes
  JPEG: ✔ Yes
  DOC (Word): ● Only with restrictions (font problems)

Free viewer
  PDF/A: ✔ Yes. PDF/A is always displayed in exactly the same way
  XPS: ✔ Yes (currently only for Windows)
  TIFF-G4: ✔ Yes (but not widely used)
  JPEG: ✔ Yes. Most Internet browsers can display JPEGs
  DOC (Word): ✔ Yes. There are free alternatives to Office but documents may be displayed differently

What’s to be done with JPEG and TIFF-G4 archives?

There are basically two options for converting large document archives that currently use TIFF-G4 or JPEG to PDF/A along with their existing inventory: Permanently or temporarily.

If the number of documents handled is not too high and regular access to the data is required, converting the image files to PDF/A is worthwhile. This involves using mass conversion solutions that package pixel information into PDF and can enable text searching using text recognition.

However, if users only need to call up data from an archive every now and then, ‘on the fly’ solutions can be used to generate a PDF/A file from a particular original image file.

Is XPS an alternative to PDF/A?

Since the end of 2006 and especially in Microsoft circles, it has been argued that XPS might be a future alternative to both PDF and PDF/A. However, it is still necessary to clarify whether or not XPS can offer at least the same level of technical capability as PDF/A. If it can do so, its special features must also be defined.

For the moment, let’s put aside the fact that PDF has existed since 1993 and that PDF/A was adopted as an ISO standard in autumn 2005, whereas XPS has been available since the end of 2006/start of 2007 as a part of Microsoft’s Vista operating system. Instead, let’s look at the nature of PDF, PDF/A, and XPS. XPS can be described as a format that statically depicts the content that a monitor would or should display or a print device would or should print out. In this respect, it could be called an up-to-date ‘spool’ format, comparable with PCL, AFP, or PostScript but more modern. On Microsoft Vista at least, users can display a document in XPS spool format in Internet Explorer.

Documents can be ‘published as PDF or XPS’ from Office programs.

The current version of Internet Explorer also serves as an XPS viewer (as yet, only on Windows).

PDF can also be used as a ‘spool’ format. The Californian company Apple has been using it as such for years for its Mac OS X operating system. However, for a long time PDF has not been used just as the final stop for static digital documents. Instead, it has become the basis of document-based process flows.

The best-known usage of PDF in this context is for electronic forms; recently, PDF has increasingly been used for collecting information with sophisticated comment functions that can be used by multiple users for PDF documents with ensured traceability. The many established possibilities offered by digital signing round off the use of PDF in various document workflows, even including certified and qualified digital signatures (cf. the section on digital signatures on page 61).

All in all, PDF builds a bridge between the different electronic document formats (even if no PDF export function is available, anything that can be printed out can also be ‘printed’ to a PDF) while also being a document format in its own right with its own, separate fields of application.

As far as forms, comments, and signatures in PDF documents are concerned, files containing such elements can be adequately archived as PDF/A. In this respect, both PDF and the PDF/A ISO 19005-1 go far beyond the conceptual and actual features of XPS.

Comparison of XPS and PDF/A

Even in the subarea where PDF/A and XPS can be directly compared with each other, weaknesses and restrictions for XPS quickly emerge, making it unlikely that XPS will ever completely usurp the position of PDF/A. Some examples are presented below.

One of the first things to point out is that the graphics models of PDF and XPS are very similar. However, since XPS lacks certain constructs it can be difficult or even impossible to convert PDF documents to XPS without losing data (conversion from XPS to PDF is always possible, thanks to the greater scope of functions offered by PDF). Below is a detailed list of XPS properties in comparison to PDF/A:

■■ It does not support an overprinting function.

■■ User-defined, colorable image masks are not permitted.

■■ The JBIG2 compression procedure, which is an extremely powerful procedure for scanned documents, is not supported.

■■ Multi-channel color spaces with more than eight components are not permitted.

■■ JPEG2000 (to be supported from PDF/A-2) is also missing from XPS’s range of features.

■■ XPS only has one transparency blending mode, whereas PDF has 16 (but note that transparency is only permitted in PDF/A documents from PDF/A-2).

■■ Finally, XPS lacks the bookmarks, notes, and comments that have now become important for so many users.

■■ On the other hand, PDF/A does not support Windows Media Photo format or the corresponding image compression procedure, but this is not surprising since it was only released by Microsoft in the third quarter of 2006. In any case, XPS Specification 1.0 of October 18th 2006 still talks about Windows Media Photo even though Microsoft has now renamed the format as HD Photo. This indicates that the format is still not entirely consolidated.

From Microsoft’s point of view, this fact is probably considered to be a plus rather than a disadvantage. At the end of the day, Microsoft applications can make do with what XPS can display.

However, even in this respect Microsoft could be accused of not putting all of its efforts into XPS: Although Publisher 2007 permits the use of CMYK and spot colors, it saves them in XPS format as RGB, which unfortunately casts doubt upon the application’s claim to be a serious publishing tool. Users do not even see a warning, which, considering the ‘high-quality printing’ XPS export style, must be considered as misleading at the very least. Microsoft’s own PDF export process suffers from the same restriction.

The Print function can be used to generate XPS files from any application (including Photoshop or CorelDraw), but the quality is less than perfect.

It’s no secret that the performance of XPS, however impressive, can only be guaranteed if the program from which a file is being saved in XPS format expressly supports XPS (or the Windows Presentation Foundation architecture).

Over the course of time, many applications are certain to implement such support, since otherwise they will eventually be unable to claim that they support Vista. However, it will take years for the required updates to find their way onto Windows computers, and there will probably be a whole range of applications that will either not support XPS at all or will only introduce XPS support later on. If a program does not expressly support XPS, it is still possible to create an XPS document on Vista using the XPS printer driver. This involves a transformation of the original GDI protocol into XPS notation.

Adobe Acrobat 8 can convert XPS to PDF. However, this only works on Windows.

With applications that use PostScript for high-quality printing, including all professional publishing applications such as Adobe PageMaker, Quark XPress, CorelDraw, and Adobe Photoshop, the user is confronted with nothing more than a crutch: He or she ends up with a file export that has the quality of a screenshot.

Even if Microsoft and third-party suppliers manage to iron out some of these issues during coming years, XPS will remain a format that, while being extremely well-suited to depicting typical Office documents, cannot depict other document types or can only store them with unnecessary restrictions. In this respect, XPS does not offer the universal support of all document types that has made PDF such a powerful and popular format.

Acrobat can handle a whole range of document formats. Users can now use the ‘Open...’ command in Acrobat 8 to open XPS documents as PDF files.

The best thing about XPS is therefore that it can be used to create a much higher quality PDF from applications that support it than previous methods such as GDI, PCL, or PostScript printer drivers. From Version 8, Adobe Acrobat offers an import filter for XPS. It enables users to easily open an XPS and simply save it as a PDF.

There is, of course, a suspicion that Microsoft intends to overstretch the capabilities of XPS by using its market power to position the format not just as a spool format or printer language but also as a universal exchange format, but it certainly doesn’t fulfill the requirements for the latter as well as PDF. PDF should remain the more reliable format for many years to come.

2. PDF/A creation: Analog, digital, and mass processing

PDF/A is always the destination, but the point of departure can differ greatly from user to user. This chapter concentrates on three main tasks: Converting paper documents to PDF/A, exporting Microsoft Office files and other documents in a way that allows them to be archived, and mass processing PDF archive files. The special process flow for converting existing PDF files to PDF/A is explained in detail later on.

PDF/A from scanned documents

‘Analog to digital’ conversion is normally required when users have received the documents that need to be archived as printed pages rather than creating them themselves. For example, telephone bills are often sent as printouts by mail. In some cases, documents that need to be retained are only available as printouts because the digital originals have been deleted from users’ computers. In addition, many documents were created by typewriter or by hand in the days before computerization.

In such cases, the only way to digitize document pages is by using a document scanner. As well as the type of scanner (flatbed scanner or a device with bypass feed), the scope of features provided by the software also has a bearing on whether or not the digitalization process can create a faultless PDF/A and dictates which additional features can be used to enhance the usability of the document (for example, OCR for full-text searching).

Roufoto – photocase.com/de

PDF/A creation

Incidentally, all modern scanners support the use of PDF as the initial format (in addition to image formats such as JPEG or TIFF). However, not all scanners are currently able to generate PDF/A. This restriction is sure to change as the standard becomes more prevalent. Due to space restrictions, it is impossible to mention all established scan programs in this publication; we are therefore going to concentrate on Acrobat Professional’s Scan To PDF function, available from Version 8 for the production of standard-compliant PDF/A documents.

Acrobat Scan: Users can use a checkbox to define that a digitalized PDF document is to be standardized in line with PDF/A. If required, OCR (text recognition), accessibility, and metadata options can be activated.

Scanning options in Acrobat 8 Professional

In the majority of cases, users can process documents with Adobe Acrobat Professional as well as with the original software regardless of which scanner they use. This application enables the production of PDF files for completely different requirements and diverse uses. From Version 8, there is a checkbox that allows you to select PDF/A compliance.

Important settings

Once the scanner is connected up and switched on, the user can trigger the creation of a PDF in Acrobat by choosing ‘File’ → ‘Create PDF’ → ‘From Scanner’ in the menu bar. In the dialog box that then appears, the scanner being used can be selected from the list of devices and the user can define whether the application is to scan only the front side of the document or both the front and back sides. In the ‘Output’ area, users can decide whether the current scan process should generate a new PDF document or append the scanned material to an existing PDF. The ‘Make PDF/A Compliant’ checkbox is especially useful here, and should be selected. Quality settings for the PDF document can be made using a slider or in more detail using the ‘Options’ button.

Scanning and creating PDFs with Acrobat 8: PDF files can also be produced from scanned pages with Adobe software.

Text recognition, accessibility, and metadata functions can be used to give the new PDF additional features.

The text recognition function creates searchable text (otherwise, the scanned pages in the PDF simply take the form of pixels). Accessibility (enabling access to content for the visually impaired) gives the PDF structural information that describes the order in which it is to be read. This is required to optimize the use of screen readers. Metadata can be considered as a kind of digital label that contains information such as the title of the document, copyright, keywords, and author. This is useful for administrating digital archives.

Text Recognition and Metadata: These options give the PDF additional features such as searchable text and metadata. These features are not PDF/A-relevant, but they do enhance the functionality of the PDF file.

Detailed description of text recognition

Additional settings for text recognition can be made using the ‘Options’ button. This area includes a language setting and other fine-tuning settings for text recognition. For example, the user can define whether the scan output should be a searchable image or formatted text with graphics. However, note the following: Even the second, superior option is no guarantee of PDF/A-1a-compliance, since errors can still occur when structures are reconstructed. This is why the restricted version, PDF/A-1b, is used here, too.

Recognize Text - Settings: The ‘PDF Output Style’ field contains options for generating a simple PDF image with searchable text or a more complex PDF file with separate areas for text and graphics where possible.

Converting pages that have already been scanned to PDF/A

The procedure used to convert scanned documents that already exist in the form of pixel data to PDF documents in Acrobat Professional is rather different. First, the image file (TIFF or JPEG) is imported by choosing ‘File’ → ‘Create PDF’ → ‘From File’ from the menu. It is then converted to a PDF file. The ‘Document’ menu contains the ‘Optimize Scanned PDF’ function. Once the document has been converted into a PDF, the user can use this function to improve it before subjecting it to text recognition.

Text recognition is also called from the ‘Document’ menu item. It is triggered with the ‘OCR Text Recognition’ → ‘Recognize Text Using OCR’ command.

The user can then check that the process worked correctly: Clicking ‘Find All OCR Suspects’ triggers a search for image elements that could not be converted to text.

Document optimization and text recognition: The ‘Optimize Scanned PDF’ function can be used to enhance the source material for text recognition, e.g. by removing edge shadows. Following this process, the user can use Adobe’s ‘Recognize Text Using OCR’ function to generate searchable text.

Saving or exporting the document as a PDF/A

The PDF document must now be converted to PDF/A. This can be achieved in just a few steps using the Export function or with the ‘Save As’ command. Both methods involve the use of the integrated Acrobat Professional Preflight engine, which carries out the conversion to PDF/A. Regardless of whether the user chooses the Export option or ‘Save As’, only the PDF/A-1b level in the ‘Settings’ will be successful.

Even after text recognition, metadata input, and the integration of structural information for accessibility, scanned documents do not automatically have advanced PDF/A-1a features.

When the user clicks ‘OK’, Acrobat generates a PDF/A file from the PDF document.

A multitude of formats for saving documents: This long list of formats includes the PDF/A standard.

Scanned documents are always converted to PDF/A-1b: To ensure a successful PDF/A conversion, the preset PDF/A-1b-compliance specification must not be changed.

Saving space when generating PDF files from scanned documents:

The generation of PDF files from digitalized paper documents has a disadvantage: the image data for such files normally requires more memory capacity than digital pages of text. A PDF generated from a Word document will be considerably smaller than a PDF file that is generated from a Word printout using a scanner.

This comparatively high file size is particularly cumbersome if a large number of documents with many pages need to be archived. There is a big difference between 10,000 x 40 KB and 10,000 x 400 KB: 400 MB will still fit on a CD-ROM; 4 GB will not.

An important factor for determining the size of PDF files is whether the document is read in black and white (line scan), grayscale, or in color. Color data consists of much more information than bitonal data and the resulting data quantity is therefore also larger.

Various image compression types have been developed over the past years to enable users to save memory space when storing image data. The best known of these methods is JPEG compression. PDF/A permits compression, but not all types. JPEG and JBIG2 are permitted, but JPEG2000 is not. In addition to the type of compression, the compression level is also important for a scanned text. This is for readability reasons. Higher compression levels can render the image/text progressively less clearly.

Berlin-based LuraTech has been working on effective image compression for digitalized company documents for years. During the course of the development of PDF/A, LuraTech has enhanced the product and service scope of scan-to-image and scan-to-PDF solutions by adding a scan-to-PDF/A function. The JBIG2 compression used has been improved by a type of layer technology that enables color documents to be digitalized in a legible manner while using relatively low amounts of memory.

In addition to compression, there are various text recognition functions and options for integrating metadata into PDF/A files.

More information on the Internet at: www.luratech.com
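The comparison of 10,000 x 40 KB with 10,000 x 400 KB can be checked with a few lines of arithmetic. The sketch below uses decimal units (1 MB = 1,000 KB), as the figures in the text do. The CD-ROM capacity of 700 MB is an assumption for illustration and is not stated in the text.

```python
CD_ROM_MB = 700  # assumed capacity of a typical CD-ROM (not from the text)


def archive_size_mb(documents: int, kb_per_document: int) -> float:
    """Total archive size in MB, using decimal units (1 MB = 1,000 KB)."""
    return documents * kb_per_document / 1000


for kb in (40, 400):
    total = archive_size_mb(10_000, kb)
    verdict = "fits on" if total <= CD_ROM_MB else "does not fit on"
    print(f"10,000 x {kb} KB = {total:,.0f} MB, which {verdict} one CD-ROM")
```

The factor of ten in the size per document carries through unchanged to the whole archive, which is why the choice of scan mode and compression matters most for large inventories.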

The Distiller engine

For a long time, the Distiller was the only recommended way of producing faultless PDF files for tasks such as professional printing from certain programs. In the light of constantly improving PDF creation functions such as those used in new versions of widely used programs like Microsoft Office and InDesign or at operating-system level (for instance, Mac OS X), the Distiller is losing some of its importance. However, it is still an important component of the Acrobat package.

Acrobat Distiller 8: The Distiller is a program that is contained in the Acrobat package. Even Acrobat 1 was shipped with a Distiller. Reliable PDF/A creation functions were introduced with Distiller 8.

The Distiller uses a slightly roundabout method to convert various file formats into the PDF format: It uses the PostScript format that is generated temporarily when printing files. Because PostScript and PDF are related to each other both with regard to development and on a structural level, conversion from PostScript to PDF is normally easy to achieve, as long as the appropriate printer drivers and a PostScript to PDF converter, like Adobe Distiller, are available.

Since all popular programs have print functions, the generation of PDFs using a combination of print data and the Distiller is an all-purpose method. In addition, the related format EPS (Encapsulated PostScript) can also be directly ‘distilled’.

PostScript and EPS files can also be converted to PDF by dragging them into the Acrobat Distiller window or onto the Distiller icon. This means that there is no need to use the ‘Open...’ command.

When is the Distiller useful? The Distiller is the appropriate tool for creating PDFs from applications that do not offer a PDF export function or an option for saving files as PDFs. However, the Distiller has more features. Watched folders enable the creation of PDFs to be automated and standardized, a useful feature for many usage environments.

PDF/A document generation using the Distiller

With Acrobat 8, Adobe has implemented presettings for standard-compliant conversion to PDF/A. The Distiller can only be used to create PDF/A-1b-compliant documents; PDF/A-1a documents cannot be created for technical reasons, since the required structural information is not generated or passed using the PostScript method.

PDF/A using Distiller 8: There are two PDF/A settings, each with a different color space. This enables the creation of PDF/A files in RGB (for displaying on monitors) and in CMYK (for printing).

Users can choose between two default settings in the main window of Acrobat Distiller. PDF/A in color space RGB is mainly suited for use on computer screens. CMYK PDF/A is intended for printing out with either an office printer or with professional four color printing on an offset printer.

PDF/A settings: Users can change the compression level and resolution in the ‘Images’ section. The compression type can be changed to ‘ZIP’. The preset sRGB output intent in the Standards section is the generally recommended intent for RGB. In the case of CMYK, the preset US profile can be changed to a profile more suited to the European market.

PDF/A settings in detail

Changes to the preset default settings for the generation of PDF/A should only be made after due consideration to avoid creating non-compliant documents. These settings can be modified by choosing ‘Settings’ → ‘Edit Adobe PDF Settings’.

Settings that influence the resolution and compression of images can be made in the ‘Images’ section. Files with lower resolution and higher compression values are smaller, but this can worsen the display quality. However, the compression type can be changed to ZIP, which does not impair the image quality.

When creating PDF/A with the CMYK color space, European users should take a look at the ‘Standards’ section. The Output Intent presetting here is intended for the US market. (The term ‘output intent’ comes from the color management field and refers to the regulation of color settings for printing.) In this area, users can select an output intent that is more suited for use in Europe, such as the European ICC profile ‘ISO Coated FOGRA27’, which is contained in the Acrobat 8 scope of delivery.

If a change is made to a default profile, the changed profile is saved as a copy; Distiller default settings cannot be overwritten.
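The statement that ZIP compression does not impair image quality can be illustrated with a short round trip. ZIP compression in PDF is the lossless Flate (deflate) method, which Python exposes through its standard zlib module. The pixel data below is a made-up grayscale gradient used purely as a stand-in for real image samples. The sketch shows the general property of lossless compression, not what the Distiller does internally.

```python
import zlib

# Hypothetical stand-in for raw image samples: a 64 x 64 grayscale gradient.
pixels = bytes((x + y) % 256 for y in range(64) for x in range(64))

# Compress with the highest deflate level, then decompress again.
compressed = zlib.compress(pixels, 9)
restored = zlib.decompress(compressed)

print(len(pixels), "bytes raw,", len(compressed), "bytes compressed")
print("identical after round trip:", restored == pixels)
```

Because the decompressed bytes are identical to the input, no image information is lost. JPEG compression, by contrast, discards detail to reach smaller sizes, which is why higher JPEG compression values can reduce display quality.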
lio – photocase.com/de

Additional throughput with watched folders

PDF settings (the settings that specify how PDF files are to be ‘distilled’) can also be assigned to file-system folders. Such folders are called ‘watched folders’ or ‘hot folders’.

Hot folders can be set up in just a few steps in the Distiller. The user specifies which file-system folders are to be watched, selects the required post processing setting (in this case, the setting for PDF/A), and the Distiller then creates two new folders (‘In’ and ‘Out’) in each hot folder, as well as a ‘joboptions’ instruction file.

If a print file is now saved in this folder, the job specifications are implemented automatically without user intervention. Users can save multiple documents in a watched folder for processing. In addition to the fact that this process can be carried out automatically, another advantage is that the quality of the files remains constant.

However, as far as licensing is concerned, the Distiller is not intended to be used to enable entire departments to access watched folders on the server. Adobe markets a server version of the Distiller for the mass creation of PDFs in companies. This high-throughput solution has now been renamed and is marketed as the ‘LiveCycle PDF Generator PostScript’.

Watched folders: The Distiller enables watched folders to be set up. Why not create separate ‘hot folders’ for PDF/A (RGB) and PDF/A (CMYK) in order to create PDF files more effectively?

In and Out: PostScript print files are sent to the ‘In’ folder. The Distiller processes each file in accordance with the settings in the job options (in this case, the setting for PDF/A files in the RGB color space). The new PDF/A files are sent to the ‘Out’ folder. The process creates log files containing information on the process flow.
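The In/Out principle of a watched folder can be sketched in a few lines. This is not how the Distiller itself is implemented or controlled. It is a minimal illustration of the pattern, and the names convert_to_pdfa, process_hot_folder, watch, and hotfolder.log are hypothetical. The conversion step is a placeholder that only copies the file so that the sketch stays runnable.

```python
import shutil
import time
from pathlib import Path


def convert_to_pdfa(source: Path, target: Path) -> None:
    """Hypothetical placeholder for the real conversion step.

    A real setup would hand the print file to a converter here;
    this sketch only copies the bytes so that it stays runnable.
    """
    shutil.copyfile(source, target)


def process_hot_folder(hot_folder: Path) -> list[Path]:
    """Process every PostScript file waiting in 'In' once; return the results."""
    in_dir, out_dir = hot_folder / "In", hot_folder / "Out"
    in_dir.mkdir(parents=True, exist_ok=True)
    out_dir.mkdir(parents=True, exist_ok=True)

    results = []
    for job in sorted(in_dir.glob("*.ps")):
        target = out_dir / (job.stem + ".pdf")
        convert_to_pdfa(job, target)
        # Append one line per job to a log file in 'Out'.
        with open(out_dir / "hotfolder.log", "a", encoding="utf-8") as log:
            log.write(f"{job.name} -> {target.name}\n")
        job.unlink()  # remove the finished job so it is not processed twice
        results.append(target)
    return results


def watch(hot_folder: Path, interval_seconds: float = 5.0) -> None:
    """Poll the folder until interrupted (for example with Ctrl+C)."""
    while True:
        process_hot_folder(hot_folder)
        time.sleep(interval_seconds)
```

Calling watch(Path("hot_rgb")) would poll a folder named hot_rgb every five seconds. Because every job passes through the same conversion step with the same settings, the output quality stays constant, which is the advantage described above.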

PDF/A in a Nutshell 27
PDF/A creation

Only available as an add-in: Users can


only benefit from the function that en-
ables documents to be published as PDF
files in Office 2007 after downloading and
installing a free add-in. Internet:
www.microsoft.com

Office and administration
Many users all over the world use Microsoft Office programs to create their documents. Frequently, working files in DOC, PPT, or Excel format are used for internal and external communication and for storing files in archives. This process can sometimes cause problems for recipients and is not optimum for long-term storage. PDF (or, even better, PDF/A) minimizes if not totally eliminates problems that occur when exchanging and archiving files. The creation of PDFs in the new Microsoft Office 2007 is slightly different from the process in previous versions of Office.

XPS is a device-independent document format developed by Microsoft. The abbreviation stands for ‘XML Paper Specification’.

PDF/A in Office 2007
Unlike with the previous versions of Microsoft Office, Office 2007 enables the export of PDF/A files without requiring the use of Acrobat or the Distiller.
Before the Office 2007 package was rolled out at the end of 2006, discussions took place on PDF features. Adobe Systems and Microsoft disagreed about the integration of a direct PDF output function in Office 2007 programs. As a solution to the dispute, users must now download a separate ‘Save As PDF or XPS’ add-in from the Microsoft Web site and install it in the application package later on. The following Office 2007 programs benefit from the new export function: Access, Excel, InfoPath, OneNote, PowerPoint, Publisher, Visio, and Word.
Once the PDF export add-in has been installed, an extra option is added to the Save As command that enables Office documents to be saved as PDFs. The ‘Options’ dialog box in the ‘Publish As’ area is a useful feature. Users can select a checkbox here to create PDF files in accordance with the PDF/A standard: ‘ISO 19005-1 compliant (PDF/A)’. When the user clicks ‘OK’, the program creates a PDF/A-1b-compliant file.
If users wish to proceed as in Office 2003 and use a connection to the PDF/Distiller settings to export PDFs from Office 2007, they should expect to experience problems in conjunction with Acrobat 8.0. According to the manufacturers, this is due to the fact that the rollout dates of the two software solutions were so close together. An Acrobat update to version 8.1 should solve these incompatibility issues.

PDF/A from Office 2007: The Options in the PDF export dialog include the ‘ISO 19005-1 compliant (PDF/A)’ setting, which enables the generation of a level B PDF/A.

Office 2003 and the PDFMaker
It is only possible to generate PDF/A documents from Office 2003 using the PDFMaker add-in and a connection to Acrobat (or the Adobe Distiller). Acrobat 8 Professional provides current conversion settings for PDF/A. Users can create both PDF/A-1a-compliant and PDF/A-1b-compliant files from Office programs.

Acrobat 7 offered support for preliminary versions of the PDF/A standard. As of Acrobat 8, full support of the final PDF/A standard is offered.

Settings for PDF/A-1b
The Office application menu (for example, in Word) has an ‘Adobe PDF’ entry that enables the triggering of PDF generation and access to the presettings. The ‘Change Conversion Settings’ command opens a dialog box where users can select options and make additional settings.
The ‘Settings’ tab consists of a dropdown menu with various options delivered with Adobe Distiller. There are two PDF/A-1b variants here: one for four-color CMYK output, and one for RGB monitor output. In this example, the RGB variant is used.
Clicking the ‘Advanced Settings ...’ button opens detailed Adobe PDF settings. Users can change the image resolution and compression type here, but it is important to take care not to make changes that could endanger the PDF/A-compliance of files (for example, for Acrobat compatibility). However, let us return to the conversion settings tabs in the Microsoft application.

Be careful with the security settings
Because security settings (passwords for opening, printing, or changing PDF files) are not permitted in PDF/A files, users should not make any changes on the ‘Security’ tab. Users who wish to protect their PDF/A files must protect the storage location of these files. This can be achieved by implementing password protection for a folder or drive, for example.

Adobe PDF in Word 2003: The Conversion Settings are essential tools for successfully producing PDF/A files. The settings for PDF/A-1b produce files that are suitable for long-term archiving. PDF documents in RGB color mode are intended to be displayed on monitors; those in CMYK mode are primarily intended to be printed. The conversion setting for PDF/A-1a can be activated by selecting the relevant checkbox.


Options for Word
The settings on the ‘Word’ tab relate to issues such as comments, tables of contents, and the ‘Enable advanced tagging’ function.
All of these features help users to produce structured PDF files (tagged PDFs). However, it only makes sense to adopt tags when carrying out a PDF conversion if the source Word document is already completely and consistently structured using styles (formats). (For more information, see the ‘Accessible PDF files’ chapter on page 52.)
Nevertheless, it is possible to successfully create PDF/A-1b-compliant files without using such structural elements.

The ‘Word’ tab contains the ‘Enable advanced tagging’ checkbox, which is useful for users who want to generate structured PDFs.

Bookmarks
Users can choose to use Word formats for the generation of PDF bookmarks. Bookmarks are permitted for PDF/A. Users can make personal specifications for styles, headings, or Word bookmarks.

So how do you create a PDF/A-1a-compliant file?
The conversion setting for PDF/A-1a takes the form of a checkbox in the PDFMaker Settings. If this checkbox is activated, the settings in the ‘Advanced Settings’ pulldown menu are locked to prevent users from making conflicting settings.

PDF/A-1a: This PDF conversion setting is activated by selecting a checkbox. It activates a function that can convert the advanced features of the higher compliance level, such as fonts and structure, from Office documents into the resulting PDF files.

Starting the conversion: This button is used to trigger PDF conversion using PDFMaker. It uses the current default settings to do so.

This constitutes the entire setup procedure for the PDF/A generation. To create a future-proof PDF, the user now only has to click ‘Convert to Adobe PDF’.


Printer selection: Selecting the 3-Heights PDF Producer as the printer enables the generation of PDF documents from any Windows program.

PDF/A using the 3-Heights PDF Producer
Exporting PDF from Windows applications is not only a facility that is offered in more recent Office versions or in conjunction with the Adobe Distiller: there is a whole range of converters that can generate PDF documents. However, only a few products are capable of handling PDF/A.
PDF Tools AG’s 3-Heights PDF Producer produces PDF/A-compliant files for long-term archiving. This tool is capable of creating PDF documents that meet various specifications (including PDF/A-compliance) from any Windows program using GDI printer drivers. The 3-Heights PDF Producer offers both synchronous and parallel generation of PDF documents. The tool also supports both client-side and server-side PDF generation.
In addition to a software developer kit for application development, runtime packages are also available as installation kits for redistribution on clients and multi-user servers. Swiss-based PDF Tools AG provides a whole host of tools and libraries for the creation and processing of PDFs. The company’s products can be purchased directly or via OEM partners. A free test version of the 3-Heights Producer Developer Kit (SDK) is available on the manufacturer’s Web site: www.pdf-tools.com.

3-Heights PDF Producer: This solution latches on to Windows’ print functions to deliver different types of PDFs, including PDF/A.

[Diagram: Windows applications print through the GDI printer driver to the 3-Heights™ PDF Producer (API and 3-Heights™ PDF Kernel), which outputs PDF/A, PDF 1.4, or PDF 1.5.]


PDFlib for high-volume PDF/A generation


PDF/A ‘en masse’
In some cases, instead of needing to archive single documents or hundreds of documents per day as PDF/A, users need to archive large datasets consisting of tens of thousands of documents. The number of invoices, contractual documents, and receipts regularly generated by companies working in telecommunications, energy supply, or public administration can be extremely substantial. Since these documents are normally personalized (that is, addressed to certain recipients), databases or structured data often come into play when creating them.

PDF/A ‘from nothing’
This term refers to PDF files for which there is no fully-formed source document. Instead, they are generated ‘on-the-fly’ from variable elements. Example: An Internet supplier provides a password-protected area where customers can download current invoicing documents. Variable data such as names, addresses, customer numbers, and invoice details are delivered from a database. The page layout, company logo, and a current advertisement are often compiled from databases in accordance with the design specifications of the company’s designers. More rarely, fixed page backgrounds are used and the personalized specifications are added to them.
Solutions that are capable of generating PDF documents ‘en masse’ from database-supported content have been on the market for a long time. However, PDF/A-compliance is a relatively new feature. It was introduced immediately after the adoption of the PDF/A standard.

The Munich-based company PDFlib supplies tools for developers. The PDFlib program family is used to produce and process PDFs, and enables PDF documents to be generated from structured data (text from databases, XML) using a library (‘lib’ stands for ‘library’). The new PDF files that are created in this way can be filled with variable content if required. This might include adding different names for invoice forms or business cards.
PDFlib products for the automatic generation of PDFs in high-throughput conditions are used in business and prepress workflows and in the Web2Print field. The library has supported the important PDF/X prepress standard for years. As of PDFlib 7, it also supports the high-volume generation of PDF/A-1a and PDF/A-1b documents.
The PDFlib product range offers PDF/A support for various application areas.

■ PDF/A documents can be created from scratch. The process can draw on material stored in a database.

■ Scanned documents or other pixel-based image files can be converted to PDF/A.

■ Existing PDF/A documents can be subjected to further processing in an automated workflow. For instance, they can be merged or split.

■ In addition, PDFlib can create PDF/A-1a files that contain all required structural information.

For more information on PDFlib solutions, see www.pdflib.com on the Internet.
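
The ‘from nothing’ scenario, in which variable data from a database is merged into a fixed layout, can be illustrated without any particular PDF library. The sketch below uses an in-memory SQLite table and a text template. The table, column names, sample rows, and template are all invented for the example, and the step that would hand each record to a PDF/A-capable library is deliberately left out.

```python
import sqlite3
from string import Template

# Fixed "layout": in a real workflow this would be a page design, not plain text.
INVOICE_TEMPLATE = Template(
    "Invoice for $name (customer no. $customer_no)\n"
    "$address\n"
    "Amount due: $amount EUR\n"
)


def render_invoices(connection):
    """Yield (customer_no, text) pairs, one personalized document per row."""
    rows = connection.execute(
        "SELECT customer_no, name, address, amount "
        "FROM invoices ORDER BY customer_no"
    )
    for customer_no, name, address, amount in rows:
        text = INVOICE_TEMPLATE.substitute(
            name=name,
            customer_no=customer_no,
            address=address,
            amount=f"{amount:.2f}",
        )
        yield customer_no, text


def demo_connection():
    """Build a small in-memory database with hypothetical sample data."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE invoices "
        "(customer_no INTEGER, name TEXT, address TEXT, amount REAL)"
    )
    con.executemany(
        "INSERT INTO invoices VALUES (?, ?, ?, ?)",
        [
            (1002, "B. Example", "2 Sample Road", 19.5),
            (1001, "A. Example", "1 Sample Street", 42.0),
        ],
    )
    return con
```

Generating one document per database row keeps the loop simple to parallelize, which is the property that matters once volumes reach tens of thousands of documents.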


Creating PDF/A from print data streams
Structured data or databases do not constitute the only starting point for the high-volume generation of PDF/A documents: print data streams can also be used to create a large number of PDF documents for archiving. Print streams are used for batch printing output. The print data can be converted in order to generate formats that are suitable for archiving, such as TIFF or PDF/A.

Apfelholz – photocase.com/de

DocBridge, a modular solution constructed from several components, contains the DocBridge Mill, a tool for processing a whole range of file formats.

DocBridge by Compart
Compart, which is based in Böblingen, Germany, develops solutions for document management and high-volume printing. Medium-sized and large companies from various industries use this supplier’s programs and services to automatically process large amounts of data traffic.
PDF has been part of Compart’s development scope as both an input and output format for a long time. In the light of the adoption of PDF/A as a standard for long-term archiving, the company has added an option for PDF/A-compliant output to its products.
For more information on Compart, visit www.compart.net on the Internet.

Compart DocBridge Mill: As well as structuring, changing content, and creating indexes, this solution can convert input data streams to PDF/A.

[Diagram: DocBridge Mill sits between input data streams and output data streams and performs restructuring, classification and indexing, format conversion, and changes to page content. Formats named in the diagram include AFP/MO:DCA, AFP Mixed Mode, PDF, PCL, ASCII/EBCDIC Line Mode, LCDS/DJDE, Metacode/DJDE, IPDS, IJPDS, Lotus Notes CDR, RTF, SAP ALF + OTF, PostScript, SVG, WMF, DjVu, PC documents, raster formats, and PC printer output.]

3. From PDF to PDF/A: Converting
PDFs to archive PDFs
Many users already use PDF to store docu-
ments in digital archives in companies,
public authorities, or privately. Now that
the PDF/A standard has been adopted, they
have the opportunity to create archive doc-
uments from their existing files, thereby
ensuring that they can be used in the long
term. In addition, recipients of traditional
PDF files that need to be retained but are
not yet available as PDF/A can now convert
them to archive PDF documents. In order
to do so, they need to know the answer to
the following question: How do you create
PDF/A documents from PDF files?

Karoline Swiezynski – photocase.com/de

PDF/A generation with Preflight
When Acrobat Professional (Version 8 or higher) is used to convert PDF files to PDF/A, the ‘engine’ that carries out the actual conversion is the integrated Preflight plug-in. Even if the conversion is triggered using the Acrobat 8 export function or by clicking ‘Save As’, the Preflight module is responsible for converting the file.

Starting the Preflight tool: The command for opening the tool is located on the ‘Advanced’ menu.

The Preflight module is opened from the Acrobat ‘Advanced’ menu or by pressing Shift+Ctrl+X.
The lower section of the main Preflight window immediately provides information on the status of the opened PDF file with regard to the PDF standard: Is the document PDF/A and/or PDF/X-compliant? (PDF/X is a prepress standard.) If the PDF file was not created as a PDF/A, the user receives a message telling him or her that the file is ‘not a PDF/A file’. If the user now wants to trigger PDF/A conversion, he or she can simply click the PDF/A icon.
The Preflight tool uses a dialog box to ask the user whether the existing PDF files should be converted to PDF/A-1a or to a restricted PDF/A-1b version.

Preflight: The PDF/A icon is also a pushbutton that triggers conversion to PDF/A.

Conversion to PDF/A-1b
In the first scenario, the user selects the ‘PDF/A-1b’ standard and sets the output condition to ‘sRGB’ in the dialog box. This indicates that the PDF in question is destined to be displayed on a monitor. Since the PDF file quite possibly already contains an output intent, the tool provides a checkbox that specifies that the present intent is to be used. In addition, another checkbox prevents the embedding of the ICC color profile if it is not required. This reduces the resulting file size.
When the user clicks the ‘OK’ button, the Preflight tool searches the existing PDF document to see whether it meets the prerequisites for successful conversion to PDF/A. If the prerequisites are met, the conversion takes place. The green tick in this example shows that no problems occurred during the conversion. Details on the conversion process are shown in the Results window in the form of a list. The list contains information such as the fact that the tool added the file name suffix ‘_A1b’ to the source document.

Preflight: The PDF/A conversion options relating to the level of the PDF/A standard (1b in this example) and the output intent.

Following the conversion: The Results window shows the steps that were carried out and informs the user that the conversion was successful.

Conversion to PDF/A-1a
The second scenario describes the conversion of a PDF file to PDF/A-1a. The procedure is the same as for scenario 1 except that the compliance level ‘1a’ is chosen along with an output condition that is suitable for four color printing (for example, ISO Coated).
Again, the Preflight tool checks that the relevant document meets the prerequisites for the conversion.

PDF/A-1a: Conversion settings with an output intent for professional offset printing.


In this second example, a red X clearly indicates that the conversion cannot be carried out successfully. Preflight uses the Results window to inform the user of the problems that occurred. An additional area below the list explains why the problems that occurred prevent the document from being successfully converted to PDF/A-1a.

Conversion not possible: If the PDF file in question does not meet the prerequisites for conversion to PDF/A, Preflight terminates the conversion process and provides the user with detailed information on the reasons for the failure of the process. For an extensive overview of these error messages, see the appendix.

Profile list in Preflight: Both verification and conversion can be carried out using the delivered PDF/A profiles.

The file does not have the required MarkInfo entry. This error message is relatively common if the person generating the PDF has not structured the content of the document using tags beforehand. This structural information is one of the things required in order to define the text flow order for document layouts that have multiple columns, images, and captions. In this example, the source PDF document must either be re-exported from the source program using different preparation methods or settings, or be repaired.

For more information on integrating this structural information via tags either before or after conversion, see the ‘Accessibility’ chapter on page 50.
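
A missing MarkInfo entry can be spotted before a PDF/A-1a conversion is attempted. The following sketch is a heuristic, not a substitute for Preflight: it assumes the document catalog is stored uncompressed, so that a tagged file shows a /MarkInfo key, a ‘/Marked true’ entry, and a /StructTreeRoot key in its raw bytes. The function name is hypothetical.

```python
import re

# "/Marked true" with arbitrary whitespace between key and value.
_MARKED_TRUE = re.compile(rb"/Marked\s+true")


def looks_tagged(pdf_bytes):
    """Heuristic: True if the bytes suggest a tagged (structured) PDF.

    Assumption: the catalog and MarkInfo dictionary are not hidden inside
    compressed object streams. A real check parses the document catalog.
    """
    has_markinfo = (
        b"/MarkInfo" in pdf_bytes
        and _MARKED_TRUE.search(pdf_bytes) is not None
    )
    has_struct_tree = b"/StructTreeRoot" in pdf_bytes
    return has_markinfo and has_struct_tree
```

A file that fails this check is a candidate for PDF/A-1b, or for re-export with tagging enabled if level 1a is required.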

Direct selection of a profile


Experienced users can take advantage of a
more direct way of selecting the required
PDF/A test or conversion profile.
For example, they can choose to select
one of the PDF/A profiles from the list in
order to check a document for PDF/A suit-
ability or, if possible, to immediately con-
vert it to PDF/A-1a or PDF/A-1b using a
specified output condition. The conversion
profiles are all assigned one of the four
most common output intents.
If the output intent required for a special
workflow is not contained in the list, a new,
modified PDF/A profile can be set up on
the ‘Edit Profile’ screen.
The user selects the required profile for the verification or conversion from the list and clicks ‘Execute’. Processing can also be started by double-clicking the corresponding profile name.


First and last steps: pdfaPilot starts the conversion process when the user clicks on the orange pushbutton. When the conversion has finished, the info field shows that the document is now PDF/A-compliant.

For information on pdfaPilot, including a downloadable demo version, go to the following Internet address: www.callassoftware.com

Converting PDF to PDF/A with pdfaPilot
Thanks to its largely self-explanatory user interface, callas software’s pdfaPilot allows even inexperienced users with no prior knowledge to convert documents to PDF/A and verify them. This professional tool is a plug-in for Adobe Acrobat Standard and Professional Versions 6, 7, and 8. The conversion from existing PDF documents to PDF/A normally needs three steps and can be achieved in a maximum of four:

■ The PDF document to be converted is opened in Acrobat. pdfaPilot is called up from the tool icon or using the ‘Plug-Ins’ menu item.

■ Clicking on the ‘Convert to PDF/A-1b’ pushbutton causes pdfaPilot to start the conversion process.

■ If the conversion can be carried out without problems, a dialog box informs the user that the conversion was successful. If the tool found elements or settings for the PDF file that prevent it from being converted to a PDF/A-compliant document, it reports these elements instead. Users can open these error messages by clicking them. They then receive tips on how to solve the problems encountered in order to be able to carry out a successful conversion to PDF/A next time.

High-volume processing with pdfaPilot CLI
The pdfaPilot CLI (Command Line Interface) is designed for high-volume PDF/A conversion and validation. This solution enables the server-based, automated generation of PDF/A files in companies or administrative departments.

Automation: pdfaPilot is also available as a command-line (CLI) module. pdfaPilot Validator CLI is a pure validation tool and pdfaPilot Converter CLI can validate, correct, and convert files.

4. Is this really a PDF/A file?
PDF/A validation
Paul Schubert – PixelQuelle.de

A PDF/A document created with Adobe Acrobat can be easily recognized by the file name suffix ‘_A1a’ or ‘_A1b’. Other PDF/A generators use similar procedures.
So why is an additional check needed when you receive a PDF/A file by e-mail or open a document from an archive?
The answer is simple: Because PDF/A files cannot be protected from further editing by measures including encryption or passwords. Doing so would contradict the PDF/A regulations, since PDF/A content must be available in its entirety without security measures.
This means that a PDF/A file that was once standard-compliant can lose that compliance as a result of unintentional or deliberate changes without it being obvious that it is no longer compliant with the standard.
However, further investigation using tools such as Adobe Acrobat Preflight, callas software’s pdfaPilot, or PDFlib 7 by PDFlib, all of which are specially designed for PDF/A validation, can safely and reliably uncover this kind of problem.
Of course, even deception cannot be ruled out: it is quite possible for users to manually add a file suffix such as ‘_A1b’ to a PDF file before sending it even if the file in question has never actually been converted to PDF/A. This is why checks constitute a prerequisite for the successful use of PDF/A. PDF/A data should be validated for standard compliance at two places in the process flow: When PDF/A files are received and before (external or internal) PDF/A documents are transferred to a digital archive (data storage drive, CD-ROM, or DVD-ROM).
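
A file name suffix proves nothing, and neither does the file’s own metadata, but reading the claim a file makes about itself is a useful first step before a real validator is run. PDF/A files identify themselves in their XMP metadata through the pdfaid:part and pdfaid:conformance properties. The sketch below assumes the XMP packet is stored uncompressed and readable as plain bytes; the function name is invented, and a positive result only means that the file claims conformance, not that it has been validated.

```python
import re

# XMP may write the properties as elements (<pdfaid:part>1</pdfaid:part>)
# or as attributes (pdfaid:part="1"); both forms are accepted here.
_PART = re.compile(rb"""pdfaid:part(?:>|=["'])\s*(\d)""")
_CONF = re.compile(rb"""pdfaid:conformance(?:>|=["'])\s*([A-Za-z])""")


def claimed_pdfa_level(pdf_bytes):
    """Return e.g. 'PDF/A-1b' if the XMP metadata claims it, else None."""
    part = _PART.search(pdf_bytes)
    conf = _CONF.search(pdf_bytes)
    if part is None or conf is None:
        return None
    return "PDF/A-{}{}".format(
        part.group(1).decode(), conf.group(1).decode().lower()
    )
```

Comparing this claim with the file name suffix on receipt flags obvious mismatches cheaply; full validation with one of the tools named above is still required before archiving.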

Validation with Preflight
Acrobat 8 Professional’s Preflight tool is not designed only for the creation of PDF/A files; it can also be used to test and validate PDF/A documents for their actual compliance with the standard.

Calling up Preflight: This tool is called from the Acrobat menu (using the ‘Advanced’ menu item), by pressing Ctrl+Shift+X, or by clicking the tool icon.

The PDF/A icon at the bottom left of the Preflight window gives a quick overview of the PDF/A compliance of an open document. If a user opens a PDF/A file that has not yet been validated, the yellow question mark icon appears along with a message that names the output intent contained in the PDF document and informs the user that the file has not yet been validated.
If the PDF/A icon does not appear in the Preflight window, the status display may be deactivated in the Preflight preferences.

PDF/A status: The status icon has three possible states: A file can be not yet validated, successfully validated, or have failed the validation.

Clicking the icon starts the Preflight PDF/A check. The tool works through a list of conditions that the PDF document must fulfill in order to comply with the PDF/A standard. More than one hundred specifications must be observed in order for a document to be declared standard-compliant.
If the check finds no deviations from the standard, the software indicates that the PDF/A file is standard-compliant (indicated by the green tickmark) and names the output intent.

Successful validation: Clicking on the PDF/A icon with the yellow question mark starts the validation process. The result (in this case, successful) appears after a few seconds. Everything’s fine.


The PDF/A validation fails if the document being checked does not meet all of the specifications stipulated by the standard. If this is the case, the system informs the user that problems have occurred by means of a red X. The Preflight results window contains a list of the problems encountered. Users can click the entries for more information on the various error messages. Preflight can also highlight the places where these problems were found (if the elements allow it to do so). The detailed information can also be viewed by double-clicking an entry in the list.

No valid PDF/A file: In this example, the Preflight validation process has found a problem: The insertion of a watermark after the creation of the file added a PDF layer to the file. PDF layers are not permitted in accordance with the PDF/A standard.

Because these error messages are not always self-explanatory, this publication contains an appendix that lists detailed background information on all possible errors in alphabetical order. Preflight also gives the user tips on how to repair errors that have occurred or how to avoid them next time around (see information starting on page 68).
Following a failed validation attempt, the PDF/A status is also indicated by a red X in the main Preflight window.

Should be PDF/A-compliant, but isn’t: The Preflight main window uses a red ‘X’ to indicate a failed PDF/A validation.


pdfaPilot
As with the creation of PDF/A files, pdfaPilot can validate PDF/A files in just a few steps.
Clicking the ‘Check for PDF/A’ pushbutton causes the tool to examine the open file. If pdfaPilot does not find any problems, it reports a successful check by displaying an icon containing a green tickmark in the info area. It also gives the user further information on the PDF document, including the title, author, number of pages, page size, creation program, and program of origin.
If the validation fails, the system issues an error message informing the user of the problem and listing the ways in which the document in question deviates from the PDF/A standard.
In most cases, pdfaPilot can carry out a conversion to generate a PDF/A file that is suitable for long-term archiving even if problems occur during the validation process. To enable this, the developers of the tool integrated correction options in the tool that far exceed the functional scope of Acrobat Preflight. However, the application is no more difficult to use. If the problems are not corrected immediately in pdfaPilot, the software precisely explains the steps that need to be taken before the creation of the PDF in order to enable eventual conversion to PDF/A.

PDF/A Problem found: Layers are not permitted in PDF/A-compliant files. Clicking on the orange conversion pushbutton eliminates this problem and generates a valid PDF/A file.

The lower pdfaPilot pushbutton changes depending on whether the PDF file validation resulted in serious, minor, or no compliance issues.
- If no problems exist, the PDF is declared to be a valid PDF/A file.
- If there are only minor problems, the document can be converted to a standard-compliant PDF/A file by simply clicking the pushbutton.
- Serious problems must be eliminated in the original file.

Explanation: Detailed information explains the context of the error messages and helps users to eliminate problems.

Validation using pdfaPilot: The results show that the file is PDF/A-1b-compliant. In addition, the tool collects details on the existing file and presents them in an overview.

5. Archive PDFs in everyday life:
What issues might arise?
Markus Hein – PixelQuelle.de

PDF/A requirements can change according to the environment in which the PDF/A files are used and the task to be done. One user might produce PDF/A files that only contain text and no illustrations, another might require signatures, and a third might need to create PDF documents that can be archived and also conform with accessibility requirements. The information below provides details on several usage possibilities and areas where PDF/A can be used.

Images
All images contained in PDF/A files must be clearly reproducible. This can only be ensured by integrating them into the files. They must not be allowed to ‘go missing’ over the course of time, as can happen with other file formats that specify a link to an external storage location rather than integrating images into files. Most of us will, at some point, have called up a Web page only to find that the illustrations are missing and question marks or red crosses in frames are displayed instead. This cannot happen with PDF/A.
An image on a PDF/A page is also clearly reproducible because it exists once and only once. On rare occasions, and only in the prepress area, alternate images are used. These images contain a lower-resolution variant for the screen and a high-resolution variant for printing. PDF/A does not permit alternate images, partly because it cannot be guaranteed that the two variants have exactly the same content.

Resolution is not part of the PDF/A standard
Image resolution does not have a role to play when it comes to compliance with the PDF/A standard. This is because there is no single image resolution that is considered to be universally ‘correct’. For example, screenshots tend to have a resolution of 72 or 96 ppi (pixels per inch). A common resolution for printing is 300 ppi, but it would not make sense to increase the resolution of a screenshot to the normal printing resolution because it would not convey any additional information to the user. If the worst comes to the worst, ill-considered increases in resolution can cause fuzzy edges. The maximum sensible image resolution for a screenshot is therefore its native resolution (typically 72 or 96 ppi).

Color depths and grades
Black and white:
  Line art image: 1 bit, 2 grades
Continuous tone:
  Grayscale: 8 bits, 256 grades
  Color/RGB: 24 bits, 16.7 million grades
  Color/CMYK: 32 bits, 4.3 billion grades
Line art and black and white images only have two grades. Half-tone images have different numbers of grades depending on the color model (grayscale, RGB, or CMYK). The compression options differ for black and white images and half-tone images.
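
The grade counts in the ‘Color depths and grades’ overview follow directly from the bit depth: n bits can encode 2^n distinct values. The short Python check below reproduces the quoted figures; the dictionary of color models is just a convenience for this example.

```python
# Bits per pixel for each image type named in the overview.
BIT_DEPTHS = {
    "Line art": 1,
    "Grayscale": 8,
    "Color/RGB": 24,   # 3 channels x 8 bits
    "Color/CMYK": 32,  # 4 channels x 8 bits
}


def grades(bits):
    """Number of distinct values that `bits` bits per pixel can encode."""
    return 2 ** bits


for model, bits in BIT_DEPTHS.items():
    print(f"{model}: {bits} bit(s) -> {grades(bits):,} grades")
```

The exact values for RGB and CMYK are 16,777,216 and 4,294,967,296, which the overview rounds to 16.7 million and 4.3 billion.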
Another area that often deals in low res-
olutions is astronomical photography.
Some of the images transmitted to Earth by
telescopes such as the Hubble Space Tele- Permitted and prohibited compression
types
The choice of image compression type – the
procedure used to minimize the quantity of
image data – is not entirely down to the user.
There are two basic types of
There are two types of pixel image: Half- compression: Lossless com-
tone images (grayscale and color images) pression and techniques that
and line art images that consist of only two can damage the quality of im-
colors. Line art images can be compressed ages to a lesser or greater ex-
for use in PDF/A using ‘CCITT Group 4’, a tent (‘lossy’ compression).
technology that is effective and prevents loss
‘X-Ray Stars in M15’ – N. White & L. Angelini (LHEA), GSFC, CXO, NASA; www.nasa.gov of data. Programs that use this compression
Low image resolution is no problem for PDF/A: Since there are sub- type include Acrobat Distiller and Acrobat
jects where only low resolution can be achieved, the PDF/A standard Professional.
does not regulate image resolution. The choice of compression types for half-
tone images is greater, and not all types are
scope are extremely grainy images of dis- PDF/A-compliant.
tant stars or galaxies. These low-resolution Of the compression types that prevent
images are the best that can be achieved, loss of data, LZW, a rather old compression
and it is, of course, quite possible to create type, is prohibited. It was decided to pro-
valid PDF/A documents from them. hibit the use of this compression type be-
Image resolution is not regulated by the cause it was once protected by license. Since
PDF/A standard – it is left to the decision of the more modern ZIP compression type is
the creator of each PDF/A file. Users must both permitted and prevents loss of data, it
decide for themselves whether or not the is the recommended compression method
image resolution used is the best resolution to be used for the compression of half-tone
possible. images without data loss.  ➔
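The compression rules for PDF/A-1 can be sketched as a simple lookup. The following is only an illustrative sketch, not a validator: it assumes that the names of the compression filters of each image have already been read from the PDF file by some other tool. The filter names are the ones PDF uses internally (FlateDecode for ZIP, DCTDecode for JPEG, CCITTFaxDecode for CCITT, LZWDecode for LZW, JPXDecode for JPEG2000, which PDF/A-1 also prohibits); the function name and the wording of the messages are hypothetical.

```python
# Filters that PDF/A-1 does not permit, with the reason.
# (Wording of the reasons is illustrative only.)
PROHIBITED_IN_PDFA1 = {
    "LZWDecode": "LZW compression is prohibited",
    "JPXDecode": "JPEG2000 was introduced after PDF 1.4",
}

# Possible replacements: ZIP (Flate) is lossless, JPEG (DCT) is lossy.
REPLACEMENT = {
    "LZWDecode": "FlateDecode",
    "JPXDecode": "DCTDecode or FlateDecode",
}


def check_image_filters(filters):
    """Return (filter, reason, replacement) for every prohibited filter name."""
    problems = []
    for name in filters:
        if name in PROHIBITED_IN_PDFA1:
            problems.append((name, PROHIBITED_IN_PDFA1[name], REPLACEMENT[name]))
    return problems


# Hypothetical filter names collected from the images of one document.
found = ["CCITTFaxDecode", "FlateDecode", "DCTDecode", "LZWDecode"]
for name, reason, replacement in check_image_filters(found):
    print(f"{name}: {reason}; recompress with {replacement}")
```

A document whose images only use CCITT, ZIP, or JPEG compression produces no messages in this sketch.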

PDF/A in a Nutshell 43
PDF/A applications in everyday life

ZIP is not subject to license-related restrictions. LZW compression in a PDF can also be replaced by ZIP compression later on. The PDF Optimizer feature in Acrobat is designed for this purpose.

JPEG was the first procedure to achieve relatively high-quality results with comparatively small image files, despite being subject to data loss. For this reason, the triumph of PDF in some situations would not have been possible without JPEG. JPEG enables different file sizes. The image quality can be set in steps from ‘minimum’ to ‘maximum’. If a high level of compression is chosen, block artifacts form. Depending on the nature of the image, they can be clearly visible on the screen. In the case of images with sharp edges (such as text), high levels of compression can be particularly awkward. However, just as for image resolution, the user decides on the level of JPEG compression; the PDF/A standard does not make any specifications.

JPEG compressions (magnified), file sizes for a US Letter page:
JPEG minimum: 285 KB
JPEG medium: 325 KB
JPEG high: 405 KB
JPEG maximum: 509 KB
The JPEG compression rate affects the image quality. The compression rate is down to the decision of the user, and is not regulated by the PDF/A standard.

Blocks: Zoomed view of JPEG artifacts resulting from heavy compression.

However, the standard does prohibit the use of JPEG2000, a compression type that was developed by the same group as JPEG (the Joint Photographic Experts Group). JPEG2000 was introduced for PDF with Acrobat 6 (PDF 1.5). Because PDF/A-1a and -1b are based on PDF 1.4, JPEG2000 is prohibited simply because it was introduced too late. However, the version of the standard that is currently being compiled, PDF/A-2, will include JPEG2000. For the moment, if a PDF document contains images compressed using JPEG2000, the user can replace the prohibited compression type with JPEG or ZIP using the PDF Optimizer in order to achieve PDF/A compliance. This checklist lists specifications for images in PDF/A documents:

■ All images must be an integral part of the PDF file in which they appear.

■ Alternate images are not permitted.

■ The user must decide on the image resolution and compression level (both factors influence the image quality).

■ The compression types LZW and JPEG2000 are prohibited.

■ For more information on colors (including colors in images), see the ‘Colors’ section.

Transparency
Transparency: The upper image is transparent. The transparent object can be recognized easily, since the background is visible through the image. The lower image has an opaque foreground image.

Transparent objects are not allowed in PDF/A-compliant documents. At the point when the PDF/A standard was adopted,


Adobe had not yet completely formulated the algorithms for evaluating transparency in a completely clear manner. As a result, transparency is currently prohibited in the PDF/A standard. This will change in PDF/A-2.

Transparency can affect images, graphics, and text. Transparent objects are not 100% opaque – instead, their background shows through, as is the case for glass or thin parchment. Transparent objects cannot always be detected with the naked eye, especially since opacity can be as high as 99%.

Transparent elements do not only occur if they are explicitly created. Certain popular design functions such as drop shadows and soft edges can create ‘sneaky’ transparent objects. For example, many PowerPoint presentations contain transparent objects, even if they cannot be detected at a glance. If text or other elements are given drop shadows, transparent objects are created when they are converted to PDF.

Risk of transparent objects: Drop shadows and soft object edges can cause the creation of transparent elements in a PDF file. This example is from a PowerPoint presentation.

Another widely used aid in Office environments is the function that allows text to be highlighted with a digital marker pen. This function is also available in Acrobat Professional. However, this also causes transparent objects to be created in PDF files.

Transparency resulting from the use of a highlighter: Use of the Highlight Text Tool can cause the creation of transparent objects in a PDF document.

So how can such transparent objects be avoided or removed later on? This is normally done by transparency flattening (transparency reduction). This procedure involves merging the transparent area and the background in a way that retains the appearance of the image. In addition to certain professional layout programs that can carry out transparency flattening in advance, this process can also be carried out during PDF optimization in Acrobat Professional.
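The idea behind transparency flattening can be illustrated for a single pixel. This is a minimal sketch that assumes the simplest case only: one partially opaque RGB color placed over an opaque background, merged by weighting both colors with the opacity. Real flattening tools have to handle many more cases (overlapping vector objects, text, blend modes), and the function name is hypothetical.

```python
def flatten_pixel(foreground, background, opacity):
    """Merge a partially opaque RGB color with an opaque background color.

    opacity is a value between 0.0 (fully transparent) and 1.0 (opaque).
    The result is a single opaque color that looks the same on screen.
    """
    if not 0.0 <= opacity <= 1.0:
        raise ValueError("opacity must be between 0.0 and 1.0")
    return tuple(
        round(opacity * fg + (1.0 - opacity) * bg)
        for fg, bg in zip(foreground, background)
    )


# A gray shadow (100, 100, 100) at 50% opacity over a light background.
print(flatten_pixel((100, 100, 100), (200, 200, 200), 0.5))  # (150, 150, 150)
```

With an opacity of 1.0 the function simply returns the foreground color, which corresponds to an object that was never transparent in the first place.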


PDF Optimizer: This Acrobat function enables transparency flattening with different quality settings. To ensure PDF/A compliance, it is important to select a compatibility level no higher than Acrobat 5 (PDF 1.4).

When flattening transparency, the user can choose between different quality levels (from low resolution to high resolution), since this process generates new images out of overlapping graphic objects.

However, users must be careful when removing highlighted text. Instead of using transparency flattening, which would make the yellow highlighting opaque and hide the text, the Acrobat PDF Optimizer function ‘Discard all comments, forms and multimedia’ should be used. This function can be called from the ‘Discard User Data’ area.

Hidden text: Transparency flattening should not be used for highlighted text. It is better to use the ‘Discard all comments, forms and multimedia’ function, since the Highlight Text Tool is a comments tool.

Colors
The colors of illustrations and graphics in a document should always appear exactly the same – whether displayed on one’s own monitor, on a colleague’s monitor, or viewed as a printout. Nothing is more annoying than a company logo that, when used in a presentation or brochure, fails to depict the corporate identity because, for example, it appears orange rather than magenta. Thanks to PDF/A, such problems are a thing of the past, since the PDF/A standard guarantees the reliable reproduction of colors for text, image, and graphical elements.

Which color should it be? Without color management, the correct depiction of colors in company logos is a question of luck.

Color management
PDF/A uses color management to safely depict colors. Color management is based on the use of color profiles that are appended to image files, graphical documents, and PDF files to act as a kind of instruction manual.

The RGB color space is widespread in Office environments. sRGB (‘Standard RGB’) is now being used to enable colors to be displayed or printed as reliably as possible on different devices and printers. The sRGB profile is suitable for images, graphical elements, and text in Office documents. It was developed by Hewlett-Packard and Microsoft in 1996 to make printed pages as similar to those displayed on the screen as possible. Common modern monitors and printers support sRGB color adjustment.

Adobe RGB is another widespread RGB profile. It was published by Adobe Systems in 1998. This profile is most useful to people who work with digital photographs, since cyan and green tones appear to be more natural with Adobe RGB than with sRGB. For documents always intended for four-color printing (production or digital printing), the ISO Coated color profile constitutes a good choice.

The incorrect reproduction of colors can sometimes affect the message of an image: Was the evening spent at the lake depicted in these two photographs a warm evening or a cool one?

Output intent (output condition)
The profiles named above (and other profiles) can be passed to the conversion process along with each individual object placed in a document, but there is another, more practical procedure that is applied to the entire PDF/A file. An output intent (the intended output condition) can be specified for the PDF/A conversion process. For example, if a PDF/A file is to be archived for the purpose of being displayed on a monitor later on, the sRGB profile, which is a standard part of PDF/A converters such as Acrobat Preflight and pdfaPilot, is ideal. On the other hand, PDF/A files that are intended for printing can be given an ISO Coated profile.

Output intent: What is the purpose of the PDF/A document? In this case, sRGB is the chosen output intent. Acrobat (Preflight) and other converters deliver a range of profiles. In addition, users have access to other profiles stored on their computers.

If the output condition of a PDF/A file changes at any point in the future, color conversion processes can be triggered.

Safe depiction of colors in PDF/A

- If device-dependent colors are used, an output intent must be specified.
- If there is already a source profile for all colors used, there is no need to specify an output intent.
- If an output intent is used, it must have one output profile only.
- Objects such as images and graphics can exist in different color spaces (RGB, CMYK, spot colors, grayscale and Lab).
- It is also possible to use a single device-dependent color space (with no ICC profile).
- Device-dependent CMYK and device-dependent RGB may not be used together. If device-dependent colors are used, there must be an output intent for the same color space (RGB, CMYK, or Gray). However, only one output intent color space may be used.
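The rules in this box can be written down as a small check. The sketch below follows the wording of the box literally and is not a substitute for a PDF/A validator, since the standard itself contains further details. It assumes that the color spaces used in a document are already known; DeviceRGB, DeviceCMYK, and DeviceGray are the PDF names for device-dependent color spaces, while the function name, the parameter names, and the message texts are hypothetical.

```python
# PDF names of the device-dependent color spaces and their color model.
DEVICE_SPACES = {"DeviceRGB": "RGB", "DeviceCMYK": "CMYK", "DeviceGray": "Gray"}


def check_color_rules(used_spaces, output_intent=None):
    """Check the color rules of the box for one document.

    used_spaces:   names of the color spaces used by the objects
    output_intent: "RGB", "CMYK", "Gray", or None if there is none
    Returns a list of problem descriptions (empty if none were found).
    """
    problems = []
    # Color spaces with their own profile (e.g. ICCBased, Lab) are ignored.
    device = {DEVICE_SPACES[s] for s in used_spaces if s in DEVICE_SPACES}

    if {"RGB", "CMYK"} <= device:
        problems.append("device-dependent RGB and CMYK are used together")
    if device and output_intent is None:
        problems.append("device-dependent colors require an output intent")
    elif output_intent is not None:
        for model in sorted(device - {output_intent}):
            problems.append(f"no matching output intent for device-dependent {model}")
    return problems


print(check_color_rules(["ICCBased", "Lab"]))          # no output intent needed
print(check_color_rules(["DeviceRGB"], "RGB"))         # matching output intent
print(check_color_rules(["DeviceRGB", "DeviceCMYK"], "RGB"))
```

The first two calls print an empty list; the third reports both the mixture of RGB and CMYK and the missing CMYK output intent.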


Global depiction of characters: PDF/A ensures that international texts are displayed properly, since the embedding of fonts in documents means that all required characters are actually available within the document in which they are used.
photocase.com/de

Fonts
If a PDF file contains text that uses fonts – that is, text that has not been converted to paths/pixel-image text – there are a range of specifications for achieving PDF/A-compliance. The specifications for PDF/A-1a and PDF/A-1b are different. However, we shall first deal with the common specifications for both standards.

Because entire fonts can constitute large datasets, PDF/A permits the embedding of subsets. This means that only characters used in a particular PDF document are embedded. This limits the file size.

Embedding fonts
The following applies to both compliance levels – PDF/A-1a and -1b: All used fonts must be embedded into the PDF file in question. If this were not the case, text displayed on a computer that does not have the font used might not be displayed in its entirety. This is incompatible with the required visual reproducibility. The entire font does not need to be embedded; it is sufficient to embed only the characters that are used in the document. This is known as ‘embedding subsets’.

In the light of the international exchange of documents containing special characters that the recipient might well not have on his or her computer, the use of embedded fonts is a significant advantage. Modern operating systems increasingly provide Cyrillic, Asian, and Eastern-European fonts in order to enable the display of international Internet pages, but there is no guarantee that the fonts delivered will be the fonts used by the creator of a particular PDF document.

Missing characters: In this case, it is impossible to tell whether the transfer should be 100 €, £, or ¥. This cannot occur with PDF/A.

Unlike in the early days of PDF, embedding fonts with the current program versions of Acrobat and many other professional tools for creating PDFs is not difficult. However, even today there are solutions – even at industry level – that do not fulfill the font handling specifications stipulated by the PDF/A standard.

Fonts must be uniquely encoded
Problems with character set encodings can cause individual font glyphs to be displayed incorrectly or not at all in documents including Word documents and e-mails. ‘Glyphs’ are the graphical depictions of characters.

What might happen if character set encoding is inconsistent? When the euro was introduced, problems were often experienced with the € sign. The characters ‘ä’, ‘ü’, and ‘ö’ often cause difficulties in international communication. The PDF/A standard now requires glyphs that are used in documents to be uniquely encoded in order to guarantee correct reproducibility.

Tracking information: The information on tracking has been lost in the case of the overlapping letters.

Overlapping letters such as those that can occur when copying text are also eliminated by compliance with the PDF/A standard. The gobbledegook shown here is caused by missing tracking information. This problem cannot occur if PDF/A is used.

Unique characters with PDF/A-1a – thanks to Unicode
In addition to the points mentioned above, a further font requirement applies to PDF/A-1a. All of the characters embedded in a PDF/A-1a file must be uniquely identifiable by means of their Unicode name. Unicode is an international standard that assigns a unique ID number to every character and symbol that exists worldwide (even for historic script). The Unicode Consortium and ISO work together on this project. Unicode encodes only abstract characters, not glyphs (the various graphical depictions of letters).

The letter ‘O’ or the digit ‘0’? This example shows that it is not always possible to distinguish between certain characters with the naked eye. This is where Unicode comes into play, since it defines a Unicode name for each individual character. (Illustration: Linotype FontExplorer X)

U+0061: All of these letter ‘a’s have the same Unicode numbers, regardless of the font.

The use of Unicode encodings for PDF/A-1a brings the advantage of all character-based text being completely unique. This enables text to be searched precisely and reliably for content as well as allowing content to be reused. This is not completely guaranteed in the case of PDF/A-1b documents, although it should usually be the case.
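The Unicode numbers and names mentioned above can be looked up with a few lines of Python. The sketch uses the standard library module unicodedata; the helper name describe is chosen for illustration only. It shows that the letter ‘O’ and the digit ‘0’, which can look identical in some fonts, are two different characters as far as Unicode is concerned.

```python
import unicodedata


def describe(character):
    """Return the Unicode number and name of a single character."""
    return f"U+{ord(character):04X} {unicodedata.name(character)}"


for character in "O0a€":
    print(describe(character))
# U+004F LATIN CAPITAL LETTER O
# U+0030 DIGIT ZERO
# U+0061 LATIN SMALL LETTER A
# U+20AC EURO SIGN
```

The result for ‘a’ is the same regardless of the font in which the letter is later displayed, because Unicode identifies the abstract character and not its glyph.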


Metadata
Many current file formats allow the storage of metadata. Metadata is data stored with a document in addition to its actual content. This might include technical information (for example, a digital camera saves additional information called EXIF data for each image file, including the exposure, aperture, and focus). In addition, users can add descriptions to files later on, such as keywords or copyright notices. IPTC metadata, primarily used by professional photographers, has been around for years.

Many programs allow metadata on a file to be viewed or even changed in the ‘Properties’ area. This normally includes core information such as the title of the document, the author, and the program used to create it.

Metadata provides useful information that can simplify the organization of large numbers of digital documents, either in database solutions or using search functions. Metadata is particularly important for archiving, since it can provide information on a person or place depicted in an image, the author of a document, and any copyright restrictions.

PDF/A and metadata
Many issues relating to the use of metadata when creating valid PDF/A documents are left to the decision of the user. However, the following directives apply:

■ One metadata field is mandatory: The PDF/A identifier. This identifier is normally automatically written to the relevant field in its correct form by the PDF/A converter used to create the PDF/A document in question.

■ All metadata attached to a PDF file must exist in a certain form and must be encoded in an XMP-compliant manner. Although PDF/A only stipulates a single mandatory field, it makes sense to make the most of the possibilities of XMP to enable efficient archiving and powerful search and sorting functions.

XMP uses the RDF to embed meta information in binary data. RDF stands for Resource Description Framework and is a formal language for providing metadata on the Internet. To promote the use of XMP, Adobe provides the XMP specification and a software developer kit under an Open Source license for use by all.
Internet: www.adobe.com/products/xmp

What is XMP?
Metadata is another topic where standards are important. This type of data cannot be used effectively if every single user or user
group develops their own system for cre-
ating and managing additional informa-
tion. In the case of metadata systems that
already exist in parallel, a reliable method
for converting one to the other must be
provided at the very least.
To promote the standardization of
metadata systems, Adobe Systems is now
using the Extensible Metadata Platform
(XMP). XMP is a procedure that acts as a
kind of bracket that pulls together estab-
lished metadata formats such as IPTC and
EXIF. Acrobat Professional and Adobe
Reader are two of the applications that
display XMP metadata; Acrobat Profes-
sional also allows it to be edited. Other
manufacturers also use XMP – the tech-
nology is not limited to Adobe.
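To give an impression of what XMP metadata looks like, the following sketch reads the PDF/A identifier from a small, hand-written XMP fragment using Python's standard XML library. It is an illustration under several assumptions: the fragment is a shortened sample and not taken from a real file, the helper name is hypothetical, and only the element form of the two identifier properties (pdfaid:part and pdfaid:conformance) is handled. XMP also allows properties to be written as attributes, which this sketch ignores.

```python
import xml.etree.ElementTree as ET

# Namespace of the PDF/A identification schema.
PDFAID_NS = "http://www.aiim.org/pdfa/ns/id/"

# Shortened, hand-written sample of XMP metadata (RDF/XML).
XMP_SAMPLE = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about=""
        xmlns:pdfaid="http://www.aiim.org/pdfa/ns/id/">
      <pdfaid:part>1</pdfaid:part>
      <pdfaid:conformance>B</pdfaid:conformance>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>"""


def read_pdfa_identifier(xmp_text):
    """Return (part, conformance) from XMP text, or None if not present."""
    root = ET.fromstring(xmp_text)
    part = root.find(f".//{{{PDFAID_NS}}}part")
    conformance = root.find(f".//{{{PDFAID_NS}}}conformance")
    if part is None or conformance is None:
        return None
    return part.text.strip(), conformance.text.strip()


print(read_pdfa_identifier(XMP_SAMPLE))  # ('1', 'B'), i.e. PDF/A-1b
```

A file that claims PDF/A-1a compliance would carry the conformance value A instead of B.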
Analog metadata: Metadata also occurs in the analog world in the form of information such as imprints and mastheads for books and journals.

Viewing and editing PDF metadata in Acrobat
Metadata can be viewed at the following menu path: ‘File’ → ‘Properties’. The ‘Description’ tab contains fields that specify the title (which does not have to be the same as the file name), author, subject, and keywords (freely definable). The ‘Title’ field is normally filled with the file name of the original file. The other fields can contain metadata from the original file if the user gave them XMP-compliant data and as long as the PDF is not being created using the Distiller. Programs in the Adobe Creative Suite pass on XMP metadata to PDF documents that are created using the Export function. The extent to which metadata can be transferred from Word or Excel files to the corresponding PDFs depends on factors including the program version being used.

Document Properties: There are four basic metadata fields on the initial screen of this area: Title (this field is usually prefilled on the basis of the source document), Author, Subject, and Keywords. Note the ‘Additional Metadata...’ button. It calls the dialog box shown below.

Clicking the ‘Additional Metadata...’ button displays a whole range of further categories including options for copyright information, personal processing notes, and additional descriptions. Programs other than Acrobat (such as Adobe Bridge) and products and solutions offered by other suppliers are recommended for the mass allocation of metadata in PDFs.

Additional Metadata...: This section con-


tains additional fields for XMP-based
metadata. These fields can be filled in for
the PDF but may already contain entries
that were adopted from the metadata in
the source document.


Accessibility
Accessibility refers to the concept of providing technical aids that make it easier for people with disabilities to participate fully in everyday life. Examples of successful accessibility measures include wheelchair access to the metro and buttons with braille lettering in elevators. In today’s information society, it is important to ensure that nobody is prevented from accessing public information simply because he or she has a physical disability. Access to the increasing amount of digital information must also be ensured for members of the public who are visually impaired or have restricted motor skills.

PDF has a range of useful functions for enabling the accessibility of content: The free Adobe Reader can read text out loud. Magnification and contrast options enable text to be read by readers who have impaired vision.

Accessibility feature: Adobe Reader can read PDF files out loud – including form fields. However, operating systems only deliver English ‘voices’ – for other languages, such as German, digital voices must be purchased at additional cost.
Jens Goetzke – PixelQuelle.de

Structured: PDF/A-1a-compliant documents and accessible PDFs
Accessible PDFs and PDF/A-1a-compliant documents have many things in common, and it is perfectly possible to create files that are both. Both PDF/A-1a-compliant documents and accessible PDFs require a structure with well-defined content.

This structure is realized by means of ‘tagged PDF’. The tags used give each PDF element additional information on content, position, and type.

Tags also define the order of content, which is particularly important in the case of pages with multi-column layouts. In addition, tags can be used to distinguish between content and additional elements such as headers and footers or other background elements that do not directly belong to the content. Tags are also helpful for graphics and images on PDF pages. How do screen readers deal with images? If the creator has given the image ‘alternate text’ that explains the subject, the user is told not only that there is a graphic at the relevant point in the text but also that the graphic displays a guitar, for example.

Tagged PDF in Acrobat: Structured PDFs specify the exact sequence of their content. In the case of multi-column layouts, it might be impossible for software to automatically determine the order of the content. The authors of documents must specify this sequence.

Tags: Each element in a tagged PDF file has a tag that contains information on its type, position, and content. These tags are used to define the exact structure of the document.

It is relatively easy to generate a PDF/A document from an accessible PDF and vice versa. Note that the conversion to PDF/A takes place at the very end of this process. Once a valid PDF/A file has been created, it cannot be changed – otherwise, it loses its compliance status.

Advantages of accessible PDFs
Thanks to tagged PDF, there are tangible advantages of accessible PDF files. Structured PDF files are easier to reuse than traditional documents. This means that reliable results can be obtained when converting formats (for example, PDF to HTML, TXT, or RTF).

In the case of PDF documents that are to be displayed on the small screens provided by mobile devices such as handhelds and cellular phones, the reflow function enables better readability. This text function can only be carried out without errors if tagged PDF is used.
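Why a defined content order matters for multi-column pages can be shown with a small model. The sketch below is not how a PDF reader works internally; it is a simplified illustration in which each text block carries its position on the page and, as a stand-in for the tag structure, a hypothetical order number assigned by the author. All names in the example are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Block:
    text: str
    x: float    # left edge of the block on the page
    y: float    # distance of the block from the top of the page
    order: int  # position in the structure defined by the author


def naive_order(blocks):
    """Guess the order from the position only: top to bottom, left to right."""
    return [b.text for b in sorted(blocks, key=lambda b: (b.y, b.x))]


def tagged_order(blocks):
    """Use the order that the author defined in the structure."""
    return [b.text for b in sorted(blocks, key=lambda b: b.order)]


# A page with two columns of two lines each.
page = [
    Block("column 1, line 1", 50, 100, 0),
    Block("column 1, line 2", 50, 120, 1),
    Block("column 2, line 1", 320, 100, 2),
    Block("column 2, line 2", 320, 120, 3),
]

print(naive_order(page))   # lines of both columns are mixed
print(tagged_order(page))  # column 1 is read completely before column 2
```

The position-based guess jumps between the columns line by line, which is exactly the kind of ambiguity that reflow, screen readers, and text export avoid when the order is stored in the file.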


Reflow: PDFs can also be displayed comfortably on mobile devices following the successful ‘reflowing’ of content that is enabled by tagged PDF.

Accessible PDFs also permit safe full-text indexing and searching, since there are no ambiguities in the text flow.

Automatically meaningful structures?
Neither a check for accessible PDF nor a check for PDF/A-1a can make sure that the structures of a tagged PDF document are meaningful or correct. Both types of check can only determine whether structural information exists in the PDF file as specified – not whether any structural information found makes sense.
For this reason, the standard stipulates that structural information may not be added automatically later on. It must be imported during the creation of the PDF or added manually afterwards.
The automatic creation of structures might be possible without causing problems for very simple PDF files. However, if a user uses an automated process to reconstruct a structure, he or she must make sure that the process is validated.

Creating an accessible PDF file from Word
This example uses Word 2003 and the PDFMaker. The PDFMaker also accesses Acrobat 8 PDF settings. The following steps must be observed to obtain a successful result:

■ To ensure accessibility, a language must be defined for the Word document. This setting is made under ‘Tools’ → ‘Languages’ in the menu. However, PDFMaker does not always transfer this information to the PDF file.

■ The text in the Word document should be structured using styles (such as ‘Heading’, ‘Body Text’, and ‘List Bullet’).

■ Multi-column layouts must be defined using the ‘Format’ → ‘Columns...’ function, and not using the tab key.

■ The ‘Web’ tab reached by clicking the ‘Format Picture’ context-menu command must be used to give graphics and images alternate texts in Word. These image descriptions are transferred to the PDF document by PDFMaker during the creation of the PDF.

Formatting pictures in Word: Right-clicking opens the ‘Format Picture’ dialog. Alternative text can be entered on the ‘Web’ tab.


■ The user then selects the ‘Change Conversion Settings’ command from the ‘Adobe PDF’ Word menu. On the ‘Settings’ tab, the user can activate the PDF/A-1a checkbox or select one of two PDF/A-1b settings from a pulldown menu. The ‘Enable advanced tagging’ option in the ‘Word’ area is active by default and ensures that PDF/A-1b files meet the tagged PDF requirements.

■ Choosing ‘Convert to Adobe PDF’ from the ‘Adobe PDF’ menu now causes the PDFMaker to create a PDF/A-1a- or PDF/A-1b-compliant file that also meets the requirements for an accessible PDF file.

Accessibility and PDF/A: The user either selects the PDFMaker PDF/A-1a checkbox on the ‘Settings’ tab or chooses one of the two PDF/A-1b variants for RGB and CMYK from the menu. In the case of PDF/A-1b, the ‘Enable advanced tagging’ checkbox must be selected in the ‘Word’ section to allow the generation of accessible PDF files.

Last step: Finally, the PDF file is converted to PDF/A or tested to see whether it conforms to the standard.

Adjustments in Acrobat Professional
In some circumstances, certain additional steps may need to be taken in Acrobat Professional:

■ The file is modified in line with accessibility requirements. The user may need to set the language in the ‘Advanced’ area of the ‘Document Properties’ screen so that screenreader software functions correctly with the document.

■ The Acrobat ‘Advanced’ menu contains further accessibility options. The user should carry out the ‘Full Check’ function. If there are accessibility problems with the file being checked, Acrobat informs the user and proposes ways of repairing them.

■ The ‘TouchUp Reading Order...’ function can be used if repairs to the structure and alternate image text are required.

■ Once the PDF document has successfully passed the accessibility check, it can be validated in Acrobat’s Preflight tool to make sure that it is suitable for conversion to PDF/A. It can then be converted. The PDF/A conversion/validation is always the very last step.

Accessibility options: This illustration shows the location of tools for accessible PDF in Acrobat Professional.


Interactive PDF files
Interactive elements significantly enhance the functional scope and usage possibilities of PDF documents. Interactive elements create connections – whether in the form of navigation between documents or interaction between companies and clients or public authorities and citizens.

PDF documents can be given hyperlinks, comments, and form elements. But to what extent are these interactive functions compatible with PDF/A? The PDF/A standard stipulates that files may only contain future-proof elements that do not impede clarity.

Comments and annotations
PDF/A aims to make all of the content in PDF files reproducible and permanently accessible. This includes comments. They may not be hidden or set as ‘non-printing’. However, if a user wants to give a PDF file permanent comments – that is, he or she wishes to retain the comments in the PDF/A file – this is technically possible. It is not difficult to define a comment as a note in Acrobat and generate a valid PDF/A file from the document in question.

Sticky Note in a PDF/A file: As a rule, comments are permitted in PDF/A-compliant files. However, multimedia comment types such as audio comments are prohibited. Users must be especially careful with colors and transparency when using graphical annotations.

PixelQuelle.de


Since note icons and input masks work with RGB, the PDF/A file in question must have an RGB output intent such as ‘sRGB’.

There are also comment types that are not permitted. It is easy to understand why text edit comments are prohibited. If such annotations exist, it is to be assumed that a text correction that should have been made has actually been overlooked. Care should also be taken with comments that use transparency to mark a document. This includes the Highlight Text Tool and the stamps delivered with Acrobat, e.g. ‘Approved’.

Not all comments are suitable for PDF/A: Unsuitable annotations include text edit annotations and annotations with transparencies.

What can be done if these types of comment are important and need to be transferred to the PDF/A file being created? The Acrobat Preflight tool provides a solution. If you carry out corrections to flatten comments and transparencies before carrying out a PDF/A conversion, the visual nature of the comments is retained but the comment functionality is completely lost. For example, following flattening in the Preflight tool, stamp notes can no longer be opened by double-clicking on them.

Removing or flattening annotations: Preflight corrections can remove or flatten comments. In the latter case, the annotations are still visible but they lose their typical comment features.

Hyperlinks are comments
It might be surprising, but from a technical point of view hyperlinks are also comments. They may not be retained in their original form if PDF/A-compliance is to be achieved – instead, they must be flattened. If a user attempts to convert a PDF file that contains links into a PDF/A file, the system issues two error messages per hyperlink: ‘Annotation has no Flags entry’ and ‘Annotation not set to print’.

Hyperlinks are comments: The illustration below shows that this PDF/A conversion cannot be carried out in Preflight because of links in the document.

The Preflight correction profiles ‘Remove all annotations’ and ‘Flatten comments’ can be useful here. In this case, the result of both procedures is the same. Once the links have been discarded, Preflight can usually convert the PDF file into a PDF/A file without any difficulty.
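The two error messages quoted for hyperlinks refer to the flags of an annotation. In a PDF file, these flags are stored as a number under the key F, in which each bit stands for one property (for example, the value 4 means ‘print’). The following sketch assumes that an annotation has already been read into a plain Python dictionary by some other tool; it is an illustration of the flag logic, not a replacement for Preflight. The first two messages are the ones quoted above, the third message text and the function name are hypothetical.

```python
# Bit values of the annotation flags used in this check.
INVISIBLE = 1 << 0  # value 1
HIDDEN = 1 << 1     # value 2
PRINT = 1 << 2      # value 4
NO_VIEW = 1 << 5    # value 32


def check_annotation(annotation):
    """Return a list of problems for one annotation dictionary."""
    problems = []
    if "F" not in annotation:
        problems.append("Annotation has no Flags entry")
    flags = annotation.get("F", 0)
    if not flags & PRINT:
        problems.append("Annotation not set to print")
    if flags & (INVISIBLE | HIDDEN | NO_VIEW):
        problems.append("Annotation is hidden or invisible")
    return problems


# A typical hyperlink without any flags, and a note that is set to print.
print(check_annotation({"Subtype": "Link"}))
print(check_annotation({"Subtype": "Text", "F": PRINT}))
```

The link without an F entry produces both messages described in the text, while the note with the print flag set passes this check.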


Forms

If forms use interactive elements, it is not usually possible to depict them in a 1:1 manner in PDF/A. PDF/A-compliance is easiest to achieve for simple PDF forms with no calculation or validation functions, reports, and so on.

The PDF/A standard does not prohibit forms as such, but there are form field types that work with actions, and actions can prevent PDF files from being suitable for long-term archiving. Problems are to be expected in the following cases:

■■ Actions of the type ‘Submit a Form’, ‘Import Form Data’, and ‘Reset a Form’ are prohibited. This is understandable, since these actions change document content.

■■ Additional actions are not permitted because they too can change content.

■■ JavaScript actions are prohibited because they endanger the reproducibility of the actual state of a file.

An attempt to convert a form PDF to a PDF/A-1b-compliant document might give the result shown in the illustration below. Such a result is caused by critical events including ‘Reset Form’ and the ‘Send’ button.

Forms and PDF/A: It is not form fields themselves that cause problems when converting files to PDF/A – it is certain actions and JavaScript used in form fields. Problems can also occur with fonts and colors.

As is shown by the Preflight result in the illustration, problems with non-embedded fonts have also occurred in this example. This is not so easy to solve at a later stage (for more information, see below). In addition, the tool sometimes reports ‘DeviceRGB’ colors (device-dependent colors) that result from the colored background of form fields. (Acrobat Professional only provides device-dependent RGB for setting up form fields and buttons.) What can be done?

■■ First, actions and JavaScript must be removed from the source form. This causes certain restrictions in functionality.

■■ If the device-dependent RGB colors are not to prevent compliance with the PDF/A standard, the user has to select ‘sRGB’ as the output intent for the conversion. The trimmed down, pre-treated form PDF can then be converted to PDF/A-1b. However, an alternative solution must be found for non-embedded fonts in form fields.

Device-dependent: DeviceRGB requires ‘sRGB’ as the output intent.

Preflight: The conversion of the form to PDF/A cannot be carried out because the tool is unable to embed the required fonts in the file.

Embedding fonts for PDF/A forms
Many current tools do not enable the embedding of fonts in PDF form fields. However, these fonts must be contained within the PDF file in order to achieve PDF/A-compliance.
The Acrobat Preflight tool cannot embed fonts in form fields. Following a failed attempt to convert a document containing them into a PDF/A document, the tool issues the following error message: ‘Font not embedded’.
There is now a tool that can carry out this task – the Acrobat pdfaPilot plug-in from callas software. Among many other correction functions, it allows form PDF files to be converted into PDF/A-compliant documents. For the process to work, all of the fonts required for the PDF document being converted must be available and accessible on the computer. In addition to the function for embedding fonts, pdfaPilot also solves many of the common color problems that occur in forms.

PDF/A form: pdfaPilot can create valid PDF/A forms with embedded fonts.

Access to fonts: All used fonts must be available on the computer being used for the conversion in order for font embedding to be successful.
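
The form-related restrictions listed earlier in this section can be turned into a simple pre-check before a conversion is attempted. The sketch below, in Python, only illustrates the rule set named in the text. The dictionary layout for a form field and the function name `blocking_actions` are hypothetical, and no real PDF library is involved; `SubmitForm`, `ImportData`, `ResetForm`, and `JavaScript` are the action type names used inside PDF files for the actions mentioned above.

```python
# Action types that the text above names as obstacles to PDF/A conversion.
PROHIBITED_ACTION_TYPES = {"SubmitForm", "ImportData", "ResetForm", "JavaScript"}


def blocking_actions(field):
    """List what would have to be removed from one form field.

    `field` is a plain dict (hypothetical layout) with an optional
    'action' type name and an optional list of 'additional_actions'.
    """
    found = []
    action = field.get("action")
    if action in PROHIBITED_ACTION_TYPES:
        found.append(action)
    if field.get("additional_actions"):
        # Any additional action (AA) counts, whatever it does.
        found.append("AA")
    return found


form = [
    {"name": "City"},
    {"name": "Send", "action": "SubmitForm"},
    {"name": "Total", "additional_actions": ["Calculate"]},
]
for field in form:
    print(field["name"], blocking_actions(field))
```

Fields that return an empty list, such as a plain text field, can be carried over unchanged. The others lose the listed behavior when the form is prepared for PDF/A.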


Cahloc – PixelQuelle.de

PDF/A for design drawings
Drawings such as CAD plans and maps are ideal candidates for long-term archiving as PDF/A files. Drawings from the fields of architecture and structural engineering must often be kept for long periods of time. Technical blueprints frequently require long-term storage, too.
PDF 1.4 can handle the large page formats with lengths of several meters that often occur in technical drawings. PDF 1.4, the PDF specification on which PDF/A-1a and -1b are based, supports a maximum page size of 200 x 200 inches, which corresponds to 5.08 x 5.08 meters. As of PDF 1.7, virtual page sizes of up to 381 kilometers in length are permitted, but the PDF/A standard only permits the maximum page size for PDF 1.4.

Design plans: PDF/A is an appropriate format for the long-term archiving of circuit diagrams, construction drawings, street maps, and many other types of design plan.

Design drawings can be output to PDF or even to PDF/A directly from current CAD programs – the PDFMaker is often used for the PDF conversion process. However, the PDF/A conversion may take place in Adobe Acrobat or in another PDF/A converter. Older plans are often line scans in formats such as TIFF G4. Such plans can be converted to PDF and then to PDF/A using Acrobat Professional (or other PDF conversion solutions). It is often possible to use the text recognition function to give drawings searchable text during this process.

No 3D models in PDF/A
Designs created in 2D can be archived as PDF/A without any problems. This is not the case for 3D models. Three-dimensional designs have only been supported since Acrobat 7 (PDF 1.6). They are therefore not permitted in PDF/A-compliant files.

Electronic signatures
Our everyday life is now digital. Within the space of a few years, e-commerce has become much more widespread and business agreements are now often made online using e-mail. Digital communication between public authorities and citizens is no longer a thing of the future – just think of electronic tax return systems such as EFTPS.
Because parties taking part in such transactions are not in the presence of each other or witnesses, it is more important than ever to ensure that digital documents can be reliably checked for authenticity. Electronic signatures enable a completely digital flow of communication and transactions of a contractual nature.

The terms ‘digital signature’ and ‘electronic signature’ are often used interchangeably. However, the term ‘digital signature’ refers to a cryptographic, technical process, whereas ‘electronic signature’ is a legal term.

Proving authenticity by means of a mark or signature dates back nearly as far as the first written evidence of mankind. Even the Mesopotamians signed their records with a seal or stamp. The practice of signing documents with a stamp instead of by hand – which is still used in China and Japan today – has a history that dates back over millennia. Magnificent wax seals are known to have been used during the Middle Ages. Placing one’s own signature at the bottom of a contract is a relatively new procedure, just as general literacy is a relatively new achievement for our culture.
But now we are faced with another problem – how can we make digital files into legal documents?
The simplest way of electronically signing a file is to place a scanned signature on a page of the document in the form of an image file. This procedure can be legally recognized, as it is in the United States.

photocase.com/de


However, it is obvious that this solution provides no security against signature fraud. As a result, the development of powerful and reliable systems for digital signatures is extremely important.

Electronic signatures: Overview of signature types

■■ Simple electronic signatures. Example: Scanned signatures in the form of image files.

■■ Advanced electronic signatures. Signatures with cryptographic encryption.

■■ Qualified electronic signatures. Signatures with a certificate from a certification service provider and notification.

■■ Qualified electronic signatures with provider accreditation. Signatures with a certificate from a certification service provider and accreditation.

Identity, integrity, and time stamps
Nowadays, a great number of agreements and contracts are concluded digitally. To a great extent, the Internet has now replaced communication channels such as couriers and postal services. In today’s world, users carry out transactions including quoting, ordering, and invoicing in e-business or motions and rulings in e-government using only digital means.
Certain basic prerequisites must be fulfilled for these kinds of transactions. Recipients of data must be able to easily determine whether the person who sent the files really is the person specified. In addition, they must be able to ascertain that the content received has not been changed or falsified. These requirements therefore relate to identity (determination of the writer) and integrity (determination of intact content).
Advanced electronic signatures ensure that these requirements are met. They allow recipients to use cryptographic technology to detect any changes to content. Recipients can also use a cryptographic key to definitively identify the author of a signed message.
Time is often another criterion for ensuring legally valid transactions or agreements. Time stamps that specify the date and time of a content version are often used for this task. Electronic signatures and encryption have different purposes. Whereas electronic signatures ensure that the content, involved party, and time of digital transactions can be clearly identified and not changed, encryption protects confidential data from being viewed by unauthorized parties by, for example, only permitting documents to be opened with a password.

Security levels
There are various well-used procedures for signing digital documents. They differ in their scope and the security level that they provide for users in cases of uncertainty or contention.


Simple electronic signatures
Simple electronic signatures include image files of scanned signatures. They only provide low levels of authentication.

Advanced electronic signatures
These signatures are subject to more stringent requirements: They must enable the detection of content manipulation and it must be possible to determine the authenticity of the signatory using an electronic certificate. Advanced electronic signatures provide less authenticity in a court of law than qualified electronic signatures.

Qualified electronic signatures
These signatures provide the highest level of security. A qualified certificate is used to assign each electronic signature to its originator. Qualified certificates are signed by a certification service provider (CSP/CA). CAs are required to comply with the requirements of signature legislation, which stipulates that they must operate their certification services in a protected environment (trust center).

Common Criteria
‘Common Criteria’ refers to an international standard that defines the criteria for the evaluation and certification of the security of computer systems in relation to data integrity and protection. The Common Criteria standard avoids the need for components and systems to be certified more than once in different countries. Digital signature solutions can also be certified in accordance with the Common Criteria standard.

Digital signatures in PDF with Acrobat
Digital signatures in PDF documents have been supported since Acrobat 4.0 and Adobe Reader 5.1. Whereas older Reader versions only enable digital signatures to be viewed and checked, users of Adobe Reader 8 can actually sign documents as long as the Acrobat function for doing so has been activated.
With Acrobat, signatures are embedded into the PDF documents they authenticate. The maximum security level supported is that provided by ‘advanced electronic signatures’. Users who wish to use qualified electronic signatures must use specialist software such as Openlimit PDF Sign for Adobe. For more information, see www.openlimit.com on the Internet.

Adobe Reader 8 enables the digital signing of documents as long as the function has been activated in Acrobat.

PDF/A and digital signatures
It is not uncommon for users to be faced with a dilemma regarding the correct sequence of steps when using digital signatures in PDF/A documents: The creation of the PDF/A version and the process of digitally signing a document both take place at the end of the work process. A PDF is saved as a PDF/A-compliant document once it has the precise form in which it is to be archived – this means that it can no longer be changed once it has been converted to PDF/A. However, a digital signature certifies the authenticity of a PDF document at a specified moment in time – thereafter, no changes are permitted.

Strictly speaking, even the addition of a digital signature constitutes a change to a PDF document. However, this is one exception where the change does not affect the PDF/A status or content of the document in question. Digitally signing a PDF/A file is therefore both permitted and sensible.

However, in practice signed PDF/A files do not pose a problem. It is entirely possible to give a PDF/A file a digital signature without causing the document to lose its PDF/A-compliant validity. The PDF/A file is generated first and it is then given a digital signature. This is because a signature does not actually constitute a document change – it merely indicates that the document has exactly the same form at the point when it was signed. This means that a digital signature does not damage or impede the compliance of a PDF/A document.


In fact, it underwrites compliance as long as the signature itself meets the requirements of the PDF/A standard.

Challenges in practice
There are certain fields of application where digital signatures come up against brick walls for technical reasons or where users have to plan the flow of individual process steps in detail.

Multiple signatories
In business and politics, documents often have to be signed by several parties. This might be the case if all members of a board of directors have to sign a ruling or if several secretaries of state need to ratify a resolution. This causes a technical problem for digital signatures: Each time a signature is added to a PDF document, the validity of the previous digital signature is nullified, since the addition of the new signature constitutes a change to the PDF document. This is mainly a problem that is specific to PDF because signatures are embedded; other external types of signature are not subject to this issue due to the different procedures used to apply them to documents.

Signing documents again
For security reasons, digital signatures are only afforded restricted validity periods. Since we can assume that computer performance will continue to rise constantly in years to come, security codes that are extremely difficult if not impossible to crack today might be broken in a few hours in the future by simple trial and error. For this reason, documents have to be signed again after a certain period of time to keep the validity of the signature up-to-date. This makes the previous digital signature invalid since, as explained above, each newly added digital signature invalidates any previous signature.

Archiving signed PDFs as PDF/A files
Users who receive signed documents that are not yet PDF/A-compliant face practical problems if they wish to archive them as PDF/A documents. The only solution is to have such documents signed again following conversion to PDF/A.
Another complication arises when a PDF form (such as a contract) that needs to be signed in order to be valid then needs to be stored in an archive as a PDF/A file. This is another case where the document in question needs to be converted to PDF/A before it is signed. This means that JavaScript functions in the form need to be removed and fonts used in filled text fields need to be embedded before digital signing.

Tracking changes in Acrobat: To find out when and what changes have been made to a PDF document following the addition of a signature, users can go to the ‘Signature Properties’ area. This dialog box is opened by clicking on a signature. The ‘View Signed Version...’ option in the ‘Signature Properties’ dialog box can be used to display saved versions of a PDF file.
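
Why a converted file has to be signed again can be seen from the way change detection works. The following Python sketch covers only the integrity half of a digital signature: it compares a digest of the file contents recorded at signing time with a digest of the contents received later. It is an illustration under simplifying assumptions, not an implementation of PDF signatures: a real signature additionally binds the digest to the signatory's key and certificate, the byte strings stand in for file contents, and the function names are hypothetical.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return a SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def unchanged(data: bytes, recorded_digest: str) -> bool:
    """Check whether the data still matches the digest taken at signing time."""
    return digest(data) == recorded_digest


# Stand-in for the bytes of a document at the moment it is signed.
signed_bytes = b"contract text as signed"
recorded = digest(signed_bytes)

# The untouched document still matches.
print(unchanged(signed_bytes, recorded))

# Any later modification, such as a conversion step that rewrites
# the file, yields different bytes and therefore a different digest.
converted_bytes = signed_bytes + b" (converted)"
print(unchanged(converted_bytes, recorded))
```

Because a conversion to PDF/A rewrites the file contents, the digest recorded with the original signature no longer matches, which is why the sequence ‘convert first, sign afterwards’ is recommended above.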

6. The outlook: PDF/A in the future

PDFs are extremely practical and it is difficult to argue against their usefulness for many application areas. PDF as a format has ‘grown up’ over a period of 14 years, and today the format itself and the software required to use it take various mature forms. In addition, the adoption of the PDF/A standard has made PDF a highly reliable format, both for today and for the future. Does this make the issue of technical formats and procedures for the long-term, secure archiving of digital documents a closed topic? What has PDF/A achieved and what remains to be done?

Wichert – PixelQuelle.de

Enhancements in PDF/A-2
The PDF/A standard constitutes an extremely solid base, at least regarding the unambiguous and reliable visual reproduction of content. Without a doubt, the PDF/A standard will be developed further to incorporate technical fine-tuning of the PDF format in a manner that allows archiving.
The second part of the PDF/A standard – PDF/A-2 – is planned for 2009. It is important to note that the second part will not invalidate PDF/A-1; PDF/A-1-compliant documents will still be valid and reliable archive files. It will not be necessary to migrate existing PDF/A-1 archives to PDF/A-2 once the new PDF/A standard is published. Doing so would benefit nobody. However, in some cases it might make sense to archive new documents as PDF/A-2 files. For example, PDF/A-2 will support the JPEG2000 image compression format. If files contain image data in JPEG2000 format, it is clearly sensible to archive that data in JPEG2000, thereby avoiding the need to carry out a recompression into JPEG (which can cause albeit low levels of data loss) or ZIP (which increases the amount of memory required to save files).

Claudia Hautumm – PixelQuelle.de

Looking towards PDF/A-3
A third part to the standard is already being discussed – PDF/A-3. This part should deal with ‘dynamic’ PDF documents. PDF/A-1 deals exclusively with PDF documents whose content and depiction does not change and may not be modified (as is the case with paper documents). In the case of PDF files that contain audio or video data, self-playing animated presentations, ‘walkable’ 3D models, or complex form logic with database connections, it is only possible to preserve a snapshot of a certain display point or specific content form when printing them or archiving them as PDF/A-1. This is hardly an ideal solution. It is certain to take several years to complete and adopt the PDF/A-3 standard, since depicting dynamic content is far more difficult than capturing static, two-dimensional visual content.

PDF/A-1 developments
In any case, there are bound to be new developments for PDF/A-1 itself – not for the PDF/A-1 standard, but for related issues.

Digital signatures
One closely monitored issue is the interaction of PDF/A-1 and digital signatures. The PDF/A standard intentionally permits digital signatures, but just as deliberately refrains from stipulating actual implementation methods. One important reason why there is not yet an ISO standard on digitally signing PDF/A documents is that requirements and legislation for digital signatures differ from country to country. Despite the prevalence of economic processes that are generally globalized, this area is subject to an extremely high degree of disparity and uncertainty. In addition, digital signature technology is still a long way from being mature, and is not as widespread or easy to use as PDF technology, which is accessible to all and sundry.

Full-text searching
Another important aspect is full-text searching. This function usually works so well with traditional PDFs that we take its permanent availability for granted. However, there are always a couple of actual hits that are, in fact, missed – even if we do not notice it, precisely because the hits in question are not found by the text search. The failure of a full-text search to find certain hits can result from something as basic as a typing error (for example, if the name ‘Smith’ is typed as ‘Smiht’). However, perfectly legitimate differences in spelling or punctuation can also cause a hit to be missed: For example, the number one thousand point zero is written as 1,000.0 in the US, 1.000,0 in Germany, and 1’000,0 in Switzerland. There is also a great variety of ways to write down telephone numbers – various countries use spaces, brackets, or hyphens to improve readability or conform with different format rules. The PDF/A standard requires all characters to have unique Unicode names for the PDF/A-1a compliance level. However, it is not possible for the PDF/A standard to ensure that a unique Unicode character-to-code assignment is correct. Only a human operator can decide whether or not an X is being passed off as a U.

Structured content
There is also room for improvement in relation to the structure of content in PDF files: Countless documents that require archiving do not only contain structured content (a reading order and important specifications such as title, image caption, or sequential body text) but also include specific, uniquely identifiable data specifications. For example, telephone bills always contain fields such as ‘Customer Number’ and ‘Invoice Number’, and they always state the amount owed. Tagged PDF (which is also specified in PDF/A-1a for the structure of content) already helps a great deal in this area. However, it would be even more useful if this kind of data specification could be determined and read directly and uniquely, as data records are read from a database. Format-related ambiguities also need to be eliminated. In fact, the current state of technology enables these tasks to be accomplished, but a standard that makes documents and software interoperable is still required.

PDF/A in one hundred years’ time
The aspects mentioned above will probably be implemented during the next five or ten years. But what will the world of PDF/A be like in fifty or even one hundred years’ time? For example, what is the probability of a person interested in the beginnings of PDF/A being able to find and read a printout (or microfilm or TIFF version) or a PDF/A-1 version of this publication in the year 2107? This can partly be answered by a harsh truth: Money makes the world go round. As long as PDF/A is widely used, its popularity will create a market where providers of solutions and services can make a profit. There is no need to worry about the future of the format unless this market shrinks to a critical size. Unwanted stumbling blocks in the form of patents and other industrial property rights are less likely than for many other formats, even if they cannot be completely ruled out in the society in which we live. Consider, for example, Unisys and its LZW patent that has only recently expired, Forgent and its JPEG patent claims, or Microsoft, which had to pay over one and a half billion US dollars to Alcatel-Lucent in a dispute over the widely used MP3 format. In any case, one thing’s for sure: In fifty or one hundred years’ time, all of these patents will have expired.
However, one question remains: Have we already reached the critical PDF/A target required to ensure an enduring market, and, if not, when will the target be reached? As far as the rollout and practical implementation of PDF/A are concerned, we are still only beginning. However, in terms of the advantages offered by PDF/A, there cannot be any doubt that, by 2010, PDF/A will be so widely used that this critical target is certain to be met. This is why: No format other than PDF and/or PDF/A is so ideally suited, practical, and widespread when it comes to archiving the rapidly increasing number of digital documents. Moreover, no other format has been adopted as an ISO standard, which makes manufacturers far more likely to use it.
As mentioned previously, the PDF/A landscape will continue to develop in various ways in order to achieve technical progress and meet application-specific requirements. However, it is important to note that this will not result in a need to revise the basic principles of the standard: The ISO PDF/A-1 standard provides a very solid basis for the field – even in the light of planned additional parts to the PDF/A standard. It describes a strong foundation that is not subject to noteworthy change in the long term. This fact is sure to facilitate the strategic and economic justification of investment when implementing PDF/A-based archiving processes.

What the error messages mean
Preflight results and troubleshooting for PDF/A

During PDF/A conversion or validation in Acrobat 8 Professional, the Preflight tool informs the user of problems that prevent compliance with the standard. Although the user receives a short explanation of each error in the info window, not all descriptions are easy to understand. For this reason, this alphabetical list of all PDF/A error messages that might occur is intended to help provide an overview of why the tool has declared a document to be non-PDF/A-compliant. In addition, each error message is followed by a description of measures that users can carry out in advance or later on in order to enable the production of a valid PDF/A file.

Puzzling error messages: Not all explanations that the Preflight tool provides for PDF/A problems can be understood without further explanation. This list intends to shed some light on error messages users might see.

■■ 3D annotation used: Three-dimensional comments can contain complex 3D models that generally originate from CAD programs. They can be embedded as comments in PDF files and rotated interactively on a monitor, for example. Interactive rotation is obviously not possible when a document is printed out. Because the PDF/A standard stipulates that a PDF must look exactly the same (to the greatest extent possible) regardless of whether it is displayed on a monitor or printed out, PDF/A files may not contain 3D comments. Remedy: The Acrobat 8 Preflight module can be used to discard 3D comments.

■■ Additional actions (AA) used: PDF files can dynamically alter their content during visualization. Actions can be contained in the PDF file for this purpose. The PDF/A standard stipulates that the visualization of a document must be guaranteed and always the same. For this reason, active content is not permitted in PDF/A files. The only exception is elements for page navigation. Remedy: The Acrobat PDF Optimizer can be used to remove these actions.

■■ Alternate image present: An image in a PDF file can contain ‘alternate interpretations’. This allows monitor visualization to be accelerated if, for example, high-resolution images also have a low-resolution alternate interpretation. However, the visualization of the PDF is then no longer uniform but varies depending on the output device used. Because PDF/A demands the unambiguous visualization of all page objects, alternate interpretations of images are not permitted in PDF/A files. Remedy: The Acrobat PDF Optimizer contains an option called ‘Discard all alternate images’ in the ‘Discard Objects’ area.

■■ Annotation has C entry but no OutputIntent present: In this PDF file, at least one comment uses a DeviceRGB-defined color for drawing the outline of a comment field. PDF/A does not differentiate between page objects and comments for colors, so the requirements for text, graphics, and images also apply here. Remedy: Such files can normally be converted to PDF/A using the ‘sRGB’ output intent.

■■ Annotation has CA value that is not 1.0: A comment symbol in this PDF file is set to ‘transparent’ so that the background shows through the symbol when it is displayed on a monitor or printed out. The PDF/A standard stipulates that all features used in a PDF file must be displayed in a single unique way on a monitor or in a printout. Because this cannot be ensured in the case of transparent objects and their backgrounds, transparency is not permitted in PDF/A files. Remedy: The Acrobat PDF Optimizer enables comments to be discarded.

■■ Annotation has IC entry but no OutputIntent present: In this PDF file, at least one comment uses a DeviceRGB-defined color for drawing the background of a comment field.


PDF/A does not differentiate between page objects and comments for colors, so the requirements for text, graphics, and images also apply here. Remedy: Such files can normally be converted to PDF/A using the ‘sRGB’ output intent.

■■ Annotation has no Flags entry: A comment element in a PDF file must contain certain additional information that determines its appearance when displayed on a monitor or printed out. This information is missing for the comments in this PDF file. This means that it is unclear whether/how the comment will be rendered when the document is displayed or printed out. The PDF/A standard stipulates that all information required for visualizing a comment must be contained within the PDF file in which the comment appears. Remedy: These comments are normally invisible components of hyperlinks. They can be removed using the Acrobat PDF Optimizer (‘Discard all comments, forms and multimedia’).

■■ Annotation Hidden flag set: Comments can be set to ‘Hidden’ to prevent them from being displayed on the monitor. The ‘Hidden’ flag is used to do this. Because the PDF/A standard stipulates that the visualization of a document must be ensured and because it is impossible to guarantee that the ‘Hidden’ flag will be correctly evaluated, PDF/A documents may not use this flag for comments. Remedy: These comments can be discarded using the Acrobat PDF Optimizer (‘Discard all comments, forms and multimedia’).

■■ Annotation Invisible flag set: Comments in PDF files can be set to ‘invisible’ to prevent them from being displayed on a monitor. The ‘Invisible’ flag is used to do this. Because the PDF/A standard stipulates that the visualization of a document must be ensured and because it is impossible to guarantee that the ‘Invisible’ flag will be correctly evaluated, PDF/A documents may not use this flag for comments. Remedy: These comments can be discarded using the Acrobat PDF Optimizer (‘Discard all comments, forms and multimedia’).

■■ Annotation not set to print: Comments in PDF files can be defined as non-printing to prevent them being printed out. Because the PDF/A standard stipulates that the visualization of a document must be ensured and that a document must be identical whether displayed on a monitor or output using a printer, comments in a PDF/A document must not be defined as not to be printed. Remedy: Current PDF-to-PDF/A converters correct this error during the conversion process.

[...] document must be identical whether displayed on a monitor or output using a printer, comments in a PDF/A document must not be defined as not to be displayed. Remedy: These comments can be discarded using the Acrobat PDF Optimizer (‘Discard all comments, forms and multimedia’).

■■ Annotation’s AP (appearance) contains only N entry is not true: Comments in PDF files can contain differing visualization methods that are used, for example, depending on whether the mouse cursor is moved over a comment symbol or the comment symbol is clicked. These effects are, of course, not possible when a document is output on a printer. Because the PDF/A standard stipulates that the visualization of a document must be guaranteed and that it must appear identical when output on a printer and when displayed on a monitor, comments in a PDF/A document may not have different visualization variants for mouse effects. Remedy: The Acrobat Professional PDF Optimizer has an option called ‘Discard all comments, forms and multimedia’ in the ‘Discard User Data’ area. This option corrects this error.

■■ Author mismatch between Document Info and XMP metadata: In this PDF document, the data on the author in the XMP area does not match the data in the general document properties. The PDF/A standard stipulates that document information must exist in the XMP area. If this data is also contained in the document properties, it must be identical to the entries in the XMP area. Remedy: New PDF/A conversion.

■■ Belongs to transparency group: A group of page objects is defined as ‘transparent’. The PDF/A standard stipulates that all features used in a PDF file must be displayed in a single unique way on a monitor or in a printout. Because this cannot be ensured in the case of transparent objects and their backgrounds, transparency is not permitted in PDF/A files. Remedy: Adobe Acrobat Professional (Version 6, 7, or 8) includes a flattener module that can be used to remove transparencies.

■■ Bits per color component > 8: Images with a color depth other than 8 bits are used in this PDF. Color depths that are not 8 bits are not reliably supported by all visualization devices (monitors and printers). In addition to this, such fine nuances cannot be visualized technically on most devices in a way that ensures that differing color depths do not lead to differences in color or brightness when visualized. For this reason, only 8 bit images are permitted in PDF/A files. Rem-
■■ Annotation NoView flag set: Comments in PDF files can edy: The PDF must be regenerated using images that have an
be set to ‘NoView’ to prevent them from being displayed on a 8 bit color depth. Acrobat 8’s Preflight module also has a cor-
monitor. Because the PDF/A standard stipulates that the vi- rection option that reduces the color depth of images from 16
sualization of a document must be ensured and that a docu- bits to 8 bits.
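
The four annotation flag messages in this list (Hidden, Invisible, NoView, not set to print) all refer to individual bits of an annotation's flags value. As a rough illustration only, the following Python sketch tests those bits. It assumes the bit positions given in the PDF Reference (Invisible = bit 1, Hidden = bit 2, Print = bit 3, NoView = bit 6); the function name `annotation_flag_problems` is hypothetical and does not describe how Preflight itself works.

```python
from typing import List

# Bit values of an annotation's flags entry (/F). The positions are taken
# from the PDF Reference and are an assumption of this sketch.
INVISIBLE = 1 << 0  # bit 1
HIDDEN = 1 << 1     # bit 2
PRINT = 1 << 2      # bit 3
NO_VIEW = 1 << 5    # bit 6


def annotation_flag_problems(flags: int) -> List[str]:
    """Return the messages from this list that a flags value would trigger."""
    problems = []
    if flags & HIDDEN:
        problems.append("Annotation Hidden flag set")
    if flags & INVISIBLE:
        problems.append("Annotation Invisible flag set")
    if not flags & PRINT:
        # PDF/A requires comments to be printable, so a cleared bit is an error.
        problems.append("Annotation not set to print")
    if flags & NO_VIEW:
        problems.append("Annotation NoView flag set")
    return problems
```

A flags value in which only the Print bit is set produces no messages; every other combination of these four bits produces at least one.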

■■ CharSet missing or incomplete for Type 1 font: A font is not fully embedded and contains no list of embedded symbols (CharSet). If a font in Type 1 format is not fully embedded, it must contain a list of the embedded characters to enable conversion to PDF/A. The list must include all characters used in this font in the PDF file. In this case, a font is not fully embedded in this PDF file and its list of embedded symbols is missing or is incomplete. Remedy: In order to resolve this problem, the PDF file must be created again using a different font or with the same font but in its complete form. Alternatively, the incomplete font may be used, but only with the relevant CharSet.

■■ CIDset in subset font is incomplete: A font is not fully embedded and contains no list of embedded symbols (CharSet). If a font in CID format is not fully embedded, it must contain a list of the embedded characters to enable conversion to PDF/A. The list must include all characters used in this font in the PDF file. In this case, a font is not fully embedded in the PDF file and its list of embedded characters is incomplete. Remedy: In order to resolve this problem, the PDF file must be created again using a different font or with the same font but in its complete form. Alternatively, the incomplete font may be used, but only with the relevant CharSet.

■■ CIDset in subset font missing: A font is not fully embedded and contains no list of embedded symbols (CharSet). If a font in CID format is not fully embedded, it must contain a list of the embedded characters to enable conversion to PDF/A. In this case, a font is not fully embedded in the PDF file but the list of embedded characters is missing. Remedy: In order to resolve this problem, the PDF file must be created again using a different font or with the same font but in its complete form. Alternatively, the incomplete font may be used, but only with the relevant CharSet.

■■ CIDSystemInfo and CMap dict not compatible: A font is not fully embedded and contains no list of embedded symbols (CharSet). The characters stored in a font are allocated number codes in accordance with an allocation table. These number codes are used to display the characters in the PDF that uses them. No allocation table has been specified for a font in this PDF. Remedy: In order to resolve this problem, the PDF file must be created again using a different font or with the same font but in its complete form. Alternatively, the incomplete font may be used, but only with the relevant CharSet.

■■ CMap not embedded for custom CMap: A font is not fully embedded and contains no list of embedded symbols (CharSet). This font has no clear information regarding the assignment of characters to letters (the CMap is missing). The letters and other characters used in PDF texts require ‘fonts’ that determine their exact appearance when visualized. The characters stored in a font are allocated number codes in accordance with an allocation table. These number codes are used to display the characters in the PDF that uses them. These allocation tables are made up differently depending upon the font format (PostScript Type 1, Type 3, or TrueType) and are known as ‘encodings’. MacRoman (Macintosh) and WinAnsi (Windows) are standard encodings. ‘CID fonts’ can use encodings that deviate from these standards. The PDF/A standard stipulates that a font that uses its own encoding must get the encoding in question from a corresponding table (CMap). This PDF does not use standard encoding and does not contain an encoding table (CMap). This PDF therefore cannot be converted to PDF/A. Remedy: In order to resolve this problem, the PDF file must be created again using a different font or with the same font but in its complete form. Alternatively, the incomplete font may be used, but only with the relevant CharSet.

■■ CMYK used but PDF/A OutputIntent not CMYK: Device-dependent color (DeviceCMYK), but no CMYK output intent. Because the PDF/A standard stipulates that colors must appear the same (as far as is technically possible) regardless of the output device, a PDF/A document must either contain only device-neutral colors, or the color properties of the output device must be defined using an output intent profile. If a document contains DeviceRGB or DeviceCMYK colors, an output intent of the same type must therefore exist. Remedy: Preflight contains a correction option that converts the alternate visualization to CMYK (SWOP). This correction must be duplicated and an RGB color space such as sRGB must be used as the target. The correction can then be assigned to a profile. The alternate visualization of the spot color can then be modified. pdfaPilot also solves this problem.

■■ CMYK used for alt. color but PDF/A OutputIntent not CMYK: A spot color has been defined in DeviceCMYK but the output intent is not defined for CMYK. Because the PDF/A standard stipulates that colors must appear the same (as far as is technically possible) regardless of the output device, a PDF/A document must either contain only device-neutral colors, or the color properties of the output device must be defined using an output intent profile. If a document contains DeviceRGB or DeviceCMYK colors, an output intent of the same type must therefore exist. Remedy: Preflight contains a correction option that converts the alternate visualization to CMYK (SWOP). This correction must be duplicated and an RGB color space such as sRGB must be used as the target. The correction can then be assigned to a profile. The alternate visualization of the spot color can then be modified. pdfaPilot also solves this problem.
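
The rule behind the CMYK messages above (a document that uses DeviceRGB or DeviceCMYK colors needs an output intent of the same type) can be illustrated with a small Python sketch. It only encodes the rule as stated in this list. The function name `output_intent_problems` and the way color spaces and output intents are passed in as plain strings are assumptions made for the example, not part of any real tool.

```python
from typing import List, Optional, Set

# Output intent type that each device-dependent color space requires,
# following the rule stated in the text.
REQUIRED_INTENT = {"DeviceRGB": "RGB", "DeviceCMYK": "CMYK"}


def output_intent_problems(device_spaces: Set[str],
                           output_intent: Optional[str]) -> List[str]:
    """Check device color spaces against the type of the output intent.

    device_spaces: names of device-dependent color spaces used in the file.
    output_intent: "RGB", "CMYK", or None if the file has no output intent.
    """
    problems = []
    for space in sorted(device_spaces):
        needed = REQUIRED_INTENT.get(space)
        if needed is None:
            continue  # other color spaces are outside the scope of this sketch
        if output_intent is None:
            problems.append(f"{space} used but no PDF/A OutputIntent")
        elif output_intent != needed:
            problems.append(f"{space} used but PDF/A OutputIntent not {needed}")
    return problems
```

With this rule, a file that mixes DeviceRGB and DeviceCMYK colors always reports at least one problem, because a single output intent type cannot match both.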

■■ Compressed object streams used: Since PDF 1.5, which Adobe introduced with Acrobat 6, some objects in PDF files can be compressed as object streams. This technique is used in this PDF. The PDF/A standard only permits objects that are compatible with PDF 1.4. Cross-object compression is therefore not permitted in PDF/A files. Remedy: Use the Acrobat PDF Optimizer to save the file as a PDF 1.4 file.

■■ Contains action of type ImportData: Active content that imports data from an external file. PDF files can dynamically alter their content during visualization. Actions can be contained in the PDF file for this purpose. The PDF/A standard stipulates that the visualization of a document must be guaranteed and always the same. For this reason, active content is not permitted in PDF/A files. The only exception is elements for page navigation. Remedy: The PDF must be redesigned and generated again so that all content is present within the file itself.

■■ Contains action of type Launch: Active content that triggers another application. PDF files can dynamically alter their content during visualization. Actions can be contained in the PDF file for this purpose. This PDF file contains an action for launching another application. The PDF/A standard stipulates that the visualization of a document must be guaranteed and always the same. For this reason, active content is not permitted in PDF/A files. The only exception is elements for page navigation. Remedy: The PDF must be redesigned and generated again so that all content is present within the file itself.

■■ Contains action of type Movie: Active content that shows a movie in another window. PDF files can dynamically alter their content during visualization. Actions can be contained in the PDF file for this purpose. This PDF file contains an action for playing a movie. The PDF/A standard stipulates that the visualization of a document must be guaranteed and always the same. For this reason, active content is not permitted in PDF/A files. The only exception is elements for page navigation. Remedy: The Acrobat PDF Optimizer contains an option called ‘Discard all comments, forms and multimedia’ that can be used to remove movies.

■■ Contains action of type ResetForm: Active content that influences the content of form fields. PDF files can dynamically alter their content during visualization. Actions can be contained in the PDF file for this purpose. This PDF file contains an action for emptying form fields (ResetForm). The PDF/A standard stipulates that the visualization of a document must be guaranteed and always the same. For this reason, active content is not permitted in PDF/A files. The only exception is elements for page navigation. Remedy: The Acrobat PDF Optimizer contains an option called ‘Discard all form submission, import and reset actions’ that corrects this problem.

■■ Contains action of type Sound: Contains audio data (sound). PDF files can dynamically alter their content during visualization. Actions can be contained in the PDF file for this purpose. This PDF file contains an action for playing sound. The PDF/A standard stipulates that the visualization of a document must be guaranteed and always the same. For this reason, active content is not permitted in PDF/A files. The only exception is elements for page navigation. Remedy: The Acrobat PDF Optimizer contains an option called ‘Discard all comments, forms and multimedia’ that can be used to remove audio data.

■■ Creation Date mismatch between Document Info and XMP metadata: The ‘CreateDate’ entry in the XMP document information deviates from the ‘Created’ entry in the document properties. The PDF/A standard stipulates that document information must exist in the XMP area. If this data is also contained in the document properties, it must be identical to the entries in the XMP area. Remedy: A PDF file can contain descriptive document information including the author, creation date, title, and other details. This information can be opened using the ‘File’ menu and changed in the general Document Properties dialog. Otherwise, the PDF/A file can be recreated from scratch.

■■ Creator mismatch between Document Info and XMP metadata: The ‘PDF creator’ entry in the XMP document information deviates from the corresponding entry in the document properties. The PDF/A standard stipulates that document information must exist in the XMP area. If this data is also contained in the document properties, it must be identical to the entries in the XMP area. Remedy: New PDF/A conversion.

■■ Custom annotation used: A comment in the PDF document does not use a standard PDF comment type. The PDF specification allows PDFs to contain comments in custom formats. This function can be used in specialized applications to position special, additional elements in PDF files. These comments must have the comment type ‘Custom’. Use is seldom made of these options. Because custom comments can only be visualized on specialized output devices, they may not be used in PDF/A files. Remedy: These comments must be discarded.

■■ Destination profiles in OutputIntents differ: There are multiple output intents with different profiles. Because the PDF/A standard stipulates that colors must appear the same
(as far as is technically possible) regardless of the output device, a PDF/A document must either contain only device-neutral colors, or the color properties of the output device must be defined using an output intent profile. If a document contains DeviceRGB or DeviceCMYK colors, an output intent of the same type must therefore exist. The output intent must describe the color properties of the output device using an ICC profile. To ensure unambiguity, a PDF/A file may only have multiple output intents if all of them use the same ICC profile. Remedy: Preflight contains a correction option that converts the alternate visualization to CMYK (SWOP). This correction must be duplicated and an RGB color space such as sRGB must be used as the target. The correction can then be assigned to a profile. The alternate visualization of the spot color can then be modified. pdfaPilot can also carry out this task.

■■ Device process color used but no PDF/A OutputIntent: Device-dependent color exists, but no output intent. Because the PDF/A standard stipulates that colors must appear the same (as far as is technically possible) regardless of the output device, a PDF/A document must either contain only device-neutral colors, or the color properties of the output device must be defined using an output intent profile. If a document contains DeviceRGB or DeviceCMYK colors, an output intent of the same type must therefore exist. Remedy: Preflight contains a correction option that converts the alternate visualization to CMYK (SWOP). This correction must be duplicated and an RGB color space such as sRGB must be used as the target. The correction can then be assigned to a profile. The alternate visualization of the spot color can then be modified. pdfaPilot can also carry out this task.

■■ Device process color used in alt. color space but no PDF/A OutputIntent: A spot color has been defined as a device color, but no output intent exists. Because the PDF/A standard stipulates that colors must appear the same (as far as is technically possible) regardless of the output device, a PDF/A document must either contain only device-neutral colors, or the color properties of the output device must be defined using an output intent profile. If a document contains DeviceRGB or DeviceCMYK colors, an output intent of the same type must therefore exist. Remedy: Preflight contains a correction option that converts the alternate visualization to CMYK (SWOP). This correction must be duplicated and an RGB color space such as sRGB must be used as the target. The correction can then be assigned to a profile. The alternate visualization of the spot color can then be modified. pdfaPilot can also carry out this task.

■■ Document contains JavaScripts: Active content in the form of JavaScript changes the visualization of pages. PDF files can dynamically alter their content during visualization. This active content can be contained in PDF files in the form of JavaScript, for example. This is the case in this PDF. JavaScript is often used in connection with active form elements (for example, buttons). The PDF/A standard stipulates that the visualization of a document must be guaranteed and always the same. For this reason, active content is not permitted in PDF/A files. Remedy: The Acrobat PDF Optimizer contains the ‘Discard all JavaScript actions’ option in the ‘Discard Objects’ area. This option can be used to correct the error.

■■ Document is damaged and needs repair: The PDF file is incorrectly formatted. Every PDF/A file must be a fundamentally correct PDF file. The file that is currently open does not conform to the PDF specification and therefore cannot be converted to PDF/A. Remedy: The problem can possibly be resolved by opening the file in Adobe Acrobat and saving it once again using the ‘Save As’ option. Otherwise, it might be possible to clean it up using the ‘Save Optimized As’ option.

■■ Document is encrypted: The document is encrypted and cannot be analyzed. PDF files can be encrypted in order to password-protect certain functions. This means that a PDF can be displayed on the monitor without restrictions but a password is required in order to print or modify it. Encryption is not permitted in a PDF/A file, as its visualization would then be dependent upon information stored externally (i.e. a password). Remedy: The PDF file cannot be converted to PDF/A in this form. If the required password is known, the PDF file’s password protection can be removed in Adobe Acrobat and the file can then be saved.

■■ Embedded PostScript operator: The document uses PostScript code for the page description. PostScript code can also be used in PDF files. This option was primarily used at the beginning of the PDF format era by programs that did not offer full PDF support. However, there are very few programs that can visualize this PostScript code on a monitor. PostScript code is used in this PDF document to describe the page objects. Because the PDF/A standard stipulates that all components of a PDF file must be reliably visualized, the use of PostScript code is not permitted in PDF/A files. Remedy: This error occurs very rarely. Current PDF-to-PDF/A converters solve this problem during the conversion process by removing the PostScript entries.

■■ EmbeddedFiles entry in Names dictionary: The document contains an embedded file. In PDF files, other files can be embedded as an ‘attachment’ in a similar manner as with an e-mail. The corresponding program is required to view these files (for example, Microsoft Word if a Word file is embedded). Because the PDF/A standard stipulates that it must be possible to visualize all components of a PDF file without the aid of other software, file attachments are not permitted in PDF/A files. Remedy: The Acrobat PDF Optimizer contains an option called ‘Discard file attachments’ in the ‘Discard Objects’ area. This option corrects the error.

■■ Encoding entry prohibited for symbolic TrueType font: This symbol font contains an allocation table for ‘normal’ fonts. The PDF/A standard stipulates that a TrueType font that is also a symbol font may not use an entry for this type of standard encoding, since standard encoding only defines ‘normal’ characters and not the special characters contained in symbol fonts. This PDF therefore cannot be converted to PDF/A. Remedy: In order to resolve this problem, the PDF file must be created anew, using a different font.

■■ File header not compliant with PDF/A: The PDF file header (PDF version entry or binary digit string) is not compliant. The PDF/A standard stipulates that a PDF file must comply with the general file header regulations in the PDF specification (1.6). Remedy: The ‘Save As’ command in Acrobat can be used to solve this problem.

■■ File size is above 2GB: The file is too large (the maximum permitted size is 2 GB). Extremely large files may lead to rendering problems when the files in question are printed out or displayed on a monitor. The maximum file size is therefore limited to 2 GB in the PDF/A standard. Remedy: No repair is possible. It may be possible to recreate the PDF file with a more effective compression type.

■■ Font not embedded (and text rendering mode not 3): Text uses a non-embedded font. The letters and other characters used in PDF texts require ‘fonts’ that determine their exact appearance when visualized. A text in this PDF uses a font that is not embedded into the PDF. It is therefore only possible to visualize this PDF correctly if this font is installed on the computer or printer being used. Because PDF/A stipulates that a PDF may not require external dependencies in order to be visualized, PDF/A files must not contain any fonts that are not embedded. The only exception is text that is not displayed but is merely used for the full-text search instead (text rendering mode = 3). Remedy: The PDF must be regenerated with all used fonts embedded.

■■ Form field does not have appearance dict: Form field is ‘invisible’. A PDF file can contain form fields. These form fields must contain additional information to ensure that they can be visualized. This PDF contains form fields that do not contain the required additional information. Because the PDF/A standard stipulates that the visualization of a document must be guaranteed and that it must be identical when output on a printer and when displayed on a monitor, a PDF/A document must only contain form fields with a visual representation. Remedy: The Acrobat PDF Optimizer contains an option called ‘Discard all comments, forms and multimedia’ that can be used to remove form fields.

■■ Form field’s AP (appearance) contains only N entry is not true: Form fields in PDF files can contain differing visualization methods that are used, for example, depending on whether the mouse cursor is moved over a form field or the form field is clicked. These effects are, of course, not possible when a document is output on a printer. Because the PDF/A standard stipulates that the visualization of a document must be guaranteed and that it must appear identical when output on a printer and when displayed on a monitor, form fields in a PDF/A document may not have different visualization variants for mouse effects. Remedy: The Acrobat Professional PDF Optimizer has an option called ‘Discard all comments, forms and multimedia’ in the ‘Discard User Data’ area. This option corrects this error.

■■ Glyphs missing in embedded font: A font does not contain all of the characters required. The letters and other characters used in PDF texts require ‘fonts’ that determine their exact appearance when visualized. This PDF contains an embedded font that does not describe all of the symbols used in the texts set in this font. This means that there is no visual representation for the characters that are missing. The PDF/A standard stipulates that all fonts used must be embedded and that a visual representation for all used characters must exist. This PDF therefore cannot be converted to PDF/A. Remedy: To resolve this problem, the PDF file must be created anew.

■■ ICC profile version 4 or newer: This file uses an ICC profile for color definition that has a newer version than is permitted by PDF/A. The file therefore cannot be converted to PDF/A. The ICC profile may also be defective. Remedy: Note that it is very unusual for ICC profiles that are more recent than Version 3 to be used. Tools such as the Acrobat Preflight module can be used to find out which component uses the profile in question. This enables the error to be eliminated. The object in question must be recreated and the PDF file must then be regenerated.

■■ ID in file trailer missing or incomplete: No file ID entry available. Every PDF file should contain an internal ID that gives it a certain uniqueness and is altered each time the document is changed. The PDF/A standard requires the presence of this ID. Remedy: The ‘Save As’ command in Acrobat can be used to solve this problem.
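
Two of the file-level checks described above, the file header and the 2 GB size limit, are simple enough to sketch in a few lines of Python. The sketch below is an illustration under stated assumptions, not a validator. It assumes the commonly described PDF/A-1 header layout (a `%PDF-1.n` version line at byte offset 0, immediately followed by a comment line whose first four bytes after the `%` are all above 127) and reads ‘2 GB’ as 2 × 1024³ bytes. The names `header_problems` and `exceeds_size_limit` are hypothetical.

```python
import re
from typing import List

# Assumption: the "2 GB" limit from the text is interpreted as 2 GiB.
MAX_SIZE_BYTES = 2 * 1024 ** 3


def header_problems(data: bytes) -> List[str]:
    """Inspect the first bytes of a file for the two header lines."""
    version_line = re.match(rb"%PDF-1\.\d(?:\r\n|\r|\n)", data)
    if version_line is None:
        return ["version line '%PDF-1.n' not found at byte offset 0"]
    # The line directly after the version line should be the binary comment.
    following = data[version_line.end():].splitlines()
    comment = following[0] if following else b""
    has_marker = (
        comment.startswith(b"%")
        and len(comment) >= 5
        and all(byte > 127 for byte in comment[1:5])
    )
    if not has_marker:
        return ["binary comment line after the version line is missing"]
    return []


def exceeds_size_limit(size_in_bytes: int) -> bool:
    """True if the file is larger than the assumed 2 GB limit."""
    return size_in_bytes > MAX_SIZE_BYTES
```

In practice only the first few dozen bytes of the file need to be passed to `header_problems`, and the size can be taken from the file system (for example with `os.path.getsize`) without reading the file.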

■■ Image has OPI information: OPI (Open Prepress Interface) is a procedure used in prepress. It involves the replacement of images with alternate images when printing via an OPI server. Since PDF files that carry OPI information can give different results depending on the output chosen (display on a monitor, printing on a desktop printer, or printing via an OPI server), OPI comments are not permitted in PDF/A files. Remedy: OPI comments can be removed using the Acrobat PDF Optimizer.

■■ Inadequate namespace URI for PDF/A entry: The PDF/A entry in the document information is incorrectly formatted. A PDF/A file must have a corresponding entry in its document information. This entry can also be displayed with Adobe Acrobat. To display it, the user must choose ‘Properties’ from the ‘File’ menu in Adobe Acrobat and then click the ‘Additional Metadata...’ button. The entry appears in the ‘Advanced’ section under http://www.aiim.org/pdfa/ns/id/. The PDF/A entry is present in this document but it is incorrectly formatted. Remedy: New PDF/A conversion.

■■ Incorrect PDF/A version number (must be 1): The PDF/A entry in the document information is incorrectly formatted (version number is not ‘1’). A PDF/A file must have a corresponding entry in its document information. This entry can also be displayed with Adobe Acrobat. To display it, the user must choose ‘Properties’ from the ‘File’ menu in Adobe Acrobat and then click the ‘Additional Metadata...’ button. The entry appears in the ‘Advanced’ section under http://www.aiim.org/pdfa/ns/id/. This group must contain a ‘pdfaid:part: “1”’ entry. This entry specifies the version of the PDF/A standard. In this PDF file, the entry has a value that is not equal to ‘1’. Remedy: New PDF/A conversion.

■■ Incorrect PDF/A-1a conformance level (must be “A”): This message only appears when validating PDF/A-1a (and not when validating PDF/A-1b). The PDF/A entry does not have the compliance level PDF/A-1a. A PDF/A file must have a corresponding entry in its document information. This entry can also be displayed with Adobe Acrobat. To display it, the user must choose ‘Properties’ from the ‘File’ menu in Adobe Acrobat and then click the ‘Additional Metadata...’ button. The entry appears in the ‘Advanced’ section under http://www.aiim.org/pdfa/ns/id/. This group must contain a ‘pdfaid:conformance:’ entry that specifies ‘A’ and declares that the file must comply with PDF/A-1a (and not ‘only’ with PDF/A-1b). PDF/A-1a files are always compliant with PDF/A-1b. Remedy: Preflight can correct this entry if converting the file to PDF/A-1a.

■■ Incorrect PDF/A-1b conformance level (must be “B”): The PDF/A entry does not have the compliance level PDF/A-1b. A PDF/A file must have a corresponding entry in its document information. This entry can also be displayed with Adobe Acrobat. To display it, the user must choose ‘Properties’ from the ‘File’ menu in Adobe Acrobat and then click the ‘Additional Metadata...’ button. The entry appears in the ‘Advanced’ section under http://www.aiim.org/pdfa/ns/id/. This group must contain a ‘pdfaid:conformance:’ entry that specifies ‘A’ or ‘B’ and declares that the file must comply with either PDF/A-1a or PDF/A-1b. PDF/A-1a files are always compliant with PDF/A-1b. This PDF contains a conformity entry, but it is not B or A. Remedy: Preflight can correct this entry.

■■ Interpolate key for image not false: An image has an ‘interpolation key’ that is not supported by PDF/A viewers. The rendering or printout of an image is based upon the resolution of the output device. This means that it depends on the vertical or horizontal resolution if a monitor is being used and on the thickness of the lines with which the printing drum can be ‘imaged’ if a laser printer is being used. If the image resolution is significantly less than the resolution of the output device, additional pixels must be added. This process is known as interpolation. Interpolation is normally carried out in accordance with a standard procedure. However, an image in a PDF file can contain a key stating that a particular interpolation procedure must be used. Nevertheless, this option is rarely used these days and the key is ignored by most output devices. Because PDF/A files must appear the same regardless of the output device used, images in PDF/A files may not have interpolation keys. Remedy: PDF-to-PDF/A tools such as Preflight or pdfaPilot have correction functions that make files containing interpolation keys standard-compliant.

■■ Invalid rendering intent: Only the following standard rendering intents are permitted in PDF/A files: Relative Colorimetric, Absolute Colorimetric, Perceptual, and Saturation. It is very unusual for a different rendering intent to be specified in a PDF. However, this PDF uses a different rendering intent. Remedy: This is a very unusual error. It can be corrected using pdfaPilot.

■■ Invalid WMode: The stream direction is entered incorrectly in this font. The letters and other characters used in PDF texts require ‘fonts’ that determine their exact appearance when visualized. In addition to the appearance of the characters, a font must contain information regarding the font ‘stream direction’, since the characters of a font may not always be strung together horizontally from left to right as is the case with Latin fonts. For example, the characters of some Far Eastern fonts are strung together in a vertical direction (from top to bottom). This PDF uses a font with incorrect stream direction information. It is therefore impossible to ensure that the PDF will always be visualized in exactly the
same way regardless of the output device. This PDF cannot be converted to PDF/A. Remedy: This problem can be avoided by using a different font for the text in question.

■■ JPEG2000 compression used: An image in this document is compressed in JPEG2000. Images placed in a PDF are usually compressed to keep the size of the PDF file to a minimum. Various compression methods such as ZIP, LZW, and JPEG can be used to compress image data. Images compressed using JPEG2000 can be decompressed in stages during visualization, meaning that an image can be displayed even if it is not fully decompressed. However, this process is only supported in more recent PDF versions and must not be used in PDF/A files. The PDF/A standard does not support objects that were not permitted in the PDF 1.4 specification that was published by Adobe with Acrobat 5. Remedy: The Acrobat PDF Optimizer can be used to apply JPEG or ZIP compression (without downsampling) to all images. Alternatively, the file can be saved as a PDF 1.4 file.

■■ Keyword mismatch between Document Info and XMP metadata: The ‘Keywords’ entry in the XMP document information deviates from the corresponding entry in the document properties. The PDF/A standard stipulates that document information must exist in the XMP area. If this data is also contained in the document properties, it must be identical to the entries in the XMP area. Remedy: New PDF/A conversion.

■■ Last Modification Date mismatch between Document Info and XMP Metadata: The ‘ModifyDate’ entry in the XMP document information deviates from the corresponding entry in the document properties. The PDF/A standard stipulates that document information must exist in the XMP area. If this data is also contained in the document properties, it must be identical to the entries in the XMP area. Remedy: The ‘Save As’ command in Acrobat can be used to solve this problem.

■■ Layers used: The file contains layers that can be used to switch the visibility of objects on and off. Layers can be used in PDF files to define that certain page content should only be visualized under certain circumstances. Whether or not an object placed on a layer is visible depends on whether the viewer has set the layer in question to ‘visible’ or ‘invisible’. (It is also possible for visibility of a layer to be linked to other factors such as the zoom level with which a PDF is viewed – in this case, very small details in a drawing might only be visible when using a large zoom value.) Because PDF/A stipulates that the visual appearance of a PDF must always be exactly the same, layers cannot be used in PDF/A files. Remedy: The Acrobat PDF Optimizer ‘Discard hidden layer content and flatten visible layers’ option or the Preflight ‘Merge Layers’ option can be used to correct this error.

■■ LZW compression used: Objects used in a PDF are often compressed to keep the size of the PDF file to a minimum. Various compression methods are permitted for doing this, including ZIP, LZW, and JPEG (for images). This PDF file uses LZW compression. LZW compression is a lossless compression method that is patented. It can quite easily be replaced by ZIP compression, which is also lossless and uses a similar algorithm but is not patent-protected. The PDF/A standard only permits objects that can be visualized without restriction, even in the future. This also includes legal restrictions that might exist because of the LZW patent. For this reason, LZW compression is not permitted in PDF/A files. Remedy: The Acrobat PDF Optimizer contains an option that can be used to apply JPEG or ZIP compression (without downsampling) to all images in the ‘Images’ section.

■■ Marked entry in MarkInfo missing: This message only appears when validating PDF/A-1a (and not when validating PDF/A-1b). The document does not contain any information on its structure (in the document catalog). The stricter PDF/A-1a standard stipulates that a PDF file must contain structural information. Remedy: To resolve this problem, the PDF file must be given the relevant structural information. This information can be added when the PDF is generated. Some PDF export modules have a ‘Tagging’ option or an option with a similar name. This enables structural information to be transferred into a PDF. It is also possible to add structural information later on in Adobe Acrobat Professional. Alternatively, it might be possible to convert the file to PDF/A-1b instead.

■■ Marked entry in MarkInfo not boolean: This message only appears when validating PDF/A-1a (and not when validating PDF/A-1b). The document contains no correctly formatted information on its structure. The stricter PDF/A-1a standard stipulates that a PDF file must contain structural information. Remedy: To resolve this problem, the PDF file must be given the relevant structural information. This information can be added when the PDF is generated. Some PDF export modules have a ‘Tagging’ option or an option with a similar name. This enables structural information to be transferred into a PDF. It is also possible to add structural information later on in Adobe Acrobat Professional. Alternatively, it might be possible to convert the file to PDF/A-1b instead.

■■ Marked entry in MarkInfo not set to true: This message only appears when validating PDF/A-1a (and not when validating PDF/A-1b). The entry for structural information is defined


as ‘not available’ in the document. The stricter PDF/A-1a standard stipulates that a PDF file must contain structural information. Remedy: To resolve this problem, the PDF file must be given the relevant structural information. This information can be added when the PDF is generated. Some PDF export modules have a ‘Tagging’ option or an option with a similar name. This enables structural information to be transferred into a PDF. It is also possible to add structural information later on in Adobe Acrobat Professional. Alternatively, it might be possible to convert the file to PDF/A-1b instead.

■■ MarkInfo missing: This message only appears when validating PDF/A-1a (and not when validating PDF/A-1b). The document does not contain any information on its structure (in the structure info directory). The stricter PDF/A-1a standard stipulates that a PDF file must contain structural information. Remedy: To resolve this problem, the PDF file must be given the relevant structural information. This information can be added when the PDF is generated. Some PDF export modules have a ‘Tagging’ option or an option with a similar name. This enables structural information to be transferred into a PDF. It is also possible to add structural information later on in Adobe Acrobat Professional. Alternatively, it might be possible to convert the file to PDF/A-1b instead.

■■ Max. nesting level of graphic states exceeded: The PDF file contains very deeply nested page objects that can cause problems when it is printed out. Every PDF/A file must be a basically correct PDF file. The file currently open violates the restrictions of the PDF specification, which limits the degree of nesting for page objects. It is therefore not compliant with the PDF specification and cannot be converted to PDF/A. Remedy: The problem can possibly be resolved by opening the file in Adobe Acrobat and saving it once again using the ‘Save As’ option. Otherwise, it might be possible to clean it up using the ‘Save Optimized As’ option.

■■ Max. number of colorants for DeviceN exceeded: An object uses too many color channels in a DeviceN object. DeviceN is a multi-channel color space in which spot colors can also be used, for example. DeviceN objects can be used in PDF/A files but the number of channels is restricted to a maximum of 8. Remedy: It is extremely unusual for a DeviceN color space to use more than 8 channels. Tools such as the Acrobat Preflight module can be used to find out which component uses the color space in question. The error can then be corrected. The object in question must be recreated and the PDF file must then be regenerated.

■■ Metadata does not conform to XMP: The PDF/A standard stipulates that document information must exist in the XMP area. This document has XMP metadata but it is incorrectly formatted. Remedy: The file must be either converted to PDF/A again or the PDF document must be regenerated.

■■ Metadata entry missing: No document information for the PDF/A entry. The PDF/A standard stipulates that document information must exist in the XMP area. There is no XMP document information in this PDF file. Remedy: The ‘Save As’ command in Acrobat can be used to solve this problem. XMP metadata is created during the generation of PDF/A.

■■ Metadata not embedded as plain text: The PDF/A standard stipulates that document information must exist in the XMP area and must not be compressed. However, the XMP metadata in this PDF is compressed. Remedy: The file must be either converted to PDF/A again or the PDF document must be regenerated.

■■ More than one encoding in symbolic TrueType font’s cmap: A symbol font has more than one allocation table, which means that characters cannot be uniquely identified. The PDF/A standard stipulates that a TrueType font that is also a symbol font may only contain a single encoding entry. Otherwise, unique allocation is impossible. Remedy: To resolve this problem, the PDF file must be created anew.

■■ Named action with a value other than standard page navigation used: PDF files can dynamically alter their content during visualization. Actions can be contained in the PDF file for this purpose. The PDF/A standard stipulates that the visualization of a document must be guaranteed and always the same. For this reason, active content is not permitted in PDF/A files. The only exception is elements for page navigation. Remedy: This problem can be solved using the Acrobat PDF Optimizer or pdfaPilot.

■■ NeedAppearances flag present but not set to false: Form fields can be filled with variable content. These initially empty fields can have an entry that defines that they must be filled with variable content (either through user input or input that is determined dynamically such as the system environment or time). Because the PDF/A standard stipulates that the visualization of a document must always be identical whether it is displayed on a monitor or output on a printer, form fields in PDF/A documents must not contain this entry. Remedy: As long as the content in question is not adversely affected, the ‘Flatten form fields’ PDF Optimizer function can be used to correct this error.

■■ Number of PDF/A-1 OutputIntent entries > 1: There are multiple output intents for PDF/A. This is not compliant with the PDF/A standard. Output intents are used to define the colors used in a PDF in accordance with a specific output


procedure (for example, printing or display on a monitor). This uniquely defines color specifications. Consequently, only a single PDF/A output intent may be specified for a PDF/A file. Remedy: Acrobat 8 Preflight has a correction option that removes all output intents from a document. The user has to create a new profile and choose the option ‘Remove Output Intent’ from the list of predefined corrections. Since the correction removes all output intents, the document must then be converted to PDF/A again.

■■ OutputConditionIdentifier missing or empty in PDF/A OutputIntent: The output intent is incomplete. The output condition identifier is missing. Remedy: New PDF/A conversion.

■■ Page description contains invalid operator: The PDF file uses invalid commands for the page description. Every PDF/A file must be a basically correct PDF file. The file that is currently open uses a command in its page description that is not defined in the PDF specification. This file is not a valid PDF file. It is therefore not possible to convert it to PDF/A. Remedy: The problem can possibly be resolved by opening the file in Adobe Acrobat and saving it once again using the ‘Save As’ option. Otherwise, it might be possible to clean it up using the ‘Save Optimized As’ option. If the problem still exists, the PDF file must be regenerated.

■■ PDF contains data after end of file marker: Every PDF file should have an end of file marker. No further data should follow this marker. In this PDF, there is data after the end of file marker. Remedy: The ‘Save As’ command in Acrobat can be used to solve this problem.

■■ PDF contains EF (embedded file) entry: The document contains an entry for an embedded file. In PDF files, other files can be embedded as an attachment in a similar manner as with an e-mail. The corresponding program is required to view these files (for example, Microsoft Word if a Word file is embedded). Because the PDF/A standard stipulates that it must be possible to visualize all components of a PDF file without the aid of other software, file attachments are not permitted in PDF/A files. Remedy: The Acrobat PDF Optimizer contains an option called ‘Discard file attachments’ in the ‘Discard Objects’ area. This option corrects the error.

■■ PDF/A Destination profile version 4 or newer: The file has an output intent, but the ICC profile used is not compatible with PDF/A. Because the PDF/A standard stipulates that colors must appear the same (as far as is technically possible) regardless of the output device, either a PDF/A document must only contain device-neutral colors or the color properties of the output device must be defined using an output intent profile. If a document contains DeviceRGB or DeviceCMYK colors, an output intent of the same type must therefore exist. Remedy: New PDF/A conversion.

■■ PDF/A entry missing: No PDF/A entry in the document information. A PDF/A file must have a corresponding entry in its document information. This entry can also be displayed with Adobe Acrobat. To display it, the user must choose ‘Properties’ from the ‘File’ menu in Adobe Acrobat and then click the ‘Additional Metadata...’ button. The entry appears in the ‘Advanced’ section under http://www.aiim.org/pdfa/ns/id/. There must be a ‘pdfaid:part: 1’ entry for the PDF/A version (only version 1 at present) and a ‘pdfaid:conformance:’ entry for the conformity level (PDF/A-1a or PDF/A-1b). The conformity level must be ‘A’ or ‘B’. PDF/A-1a files are always compliant with PDF/A-1b. Remedy: The file can be converted to PDF/A again.

■■ PDF/A OutputIntent has no destination profile: Because the PDF/A standard stipulates that colors must appear the same (as far as is technically possible) regardless of the output device, either a PDF/A document must only contain device-neutral colors or the color properties of the output device must be defined using an output intent profile. If a document contains DeviceRGB or DeviceCMYK colors, an output intent of the same type must therefore exist, along with its destination profile. Remedy: Preflight contains a correction option that converts the alternate visualization to CMYK (SWOP). This correction must be duplicated and an RGB color space such as sRGB must be used as the target. The correction can then be assigned to a profile. The alternate visualization of the spot color can then be modified. pdfaPilot can also carry out this task.

■■ Producer mismatch between Document Info and XMP metadata: The PDF ‘Producer’ entry in the XMP document information deviates from the corresponding entry in the document properties. The PDF/A standard stipulates that document information must exist in the XMP area. If this data is also contained in the document properties, it must be identical to the entries in the XMP area. Remedy: New PDF/A conversion.

■■ Prohibited annotation type: A PDF can contain different types of comment. Some of these comment types are intended for multimedia content: Sound and movie comment types. These types of comment cannot be reproduced by printers. The FileAttachment comment type allows files in other formats to be embedded into a PDF. Only specialized visualization systems are able to render these file attachments. These types of comment are not permitted in PDF/A files as they cannot be visualized on all output devices. In ad-


dition, all types of comment that were not specified in PDF 1.4 (from Adobe Acrobat 5 onwards) are not permitted in PDF/A since PDF/A is based on PDF 1.4. Remedy: The Acrobat PDF Optimizer provides an option for removing file attachments in the ‘Discard User Data’ area.

■■ RGB used but PDF/A OutputIntent not RGB: Device-dependent color (DeviceRGB) is used but no RGB output intent exists. Because the PDF/A standard stipulates that colors must appear the same (as far as is technically possible) regardless of the output device, either a PDF/A document must only contain device-neutral colors or the color properties of the output device must be defined using an output intent profile. If a document contains DeviceRGB or DeviceCMYK colors, an output intent of the same type must therefore exist. Remedy: New PDF/A conversion.

■■ RGB used for alt. color but PDF/A OutputIntent not RGB: A spot color has been defined in DeviceRGB but the output intent is not defined for RGB. Because the PDF/A standard stipulates that colors must appear the same (as far as is technically possible) regardless of the output device, either a PDF/A document must only contain device-neutral colors or the color properties of the output device must be defined using an output intent profile. If a document contains DeviceRGB or DeviceCMYK colors, an output intent of the same type must therefore exist. Remedy: New PDF/A conversion.

■■ Scaling factor used: Page contains a zoom factor or downsizing factor. This PDF defines a change of image scale. This particular characteristic was first introduced with Adobe PDF 1.6 (as of Acrobat 7). The PDF/A standard only permits objects that are compatible with PDF 1.4. A change of image scale is therefore not permitted in PDF/A files. Remedy: This problem can be corrected using the Acrobat PDF Optimizer by selecting PDF 1.5 compatibility (Acrobat 6).

■■ SMask entry present with a value other than None: A partially transparent mask is used in this PDF file. Masks hide background objects. They can however be set to ‘transparent’ in PDF files so that objects positioned behind them still remain partially visible. You can set a percentage value on a scale of 0% to 100% to define the extent to which the background of a ‘transparent’ object should be visible. The color values of the foreground mask and background object must be offset against one another for the reproduction of such constructions when displaying files on a monitor or printing them out. However, this method of blending is not clearly defined. The PDF/A standard stipulates that all features used in a PDF file must be displayed in a single unique way on a monitor or in a printout. Because this cannot be ensured in the case of transparent objects and their backgrounds, transparency is not permitted in PDF/A files. Remedy: Adobe Acrobat Professional (Version 6, 7, or 8) includes a flattener module that can be used to remove transparencies.

■■ Stream object contains F entry: To be visualized in its entirety, this PDF requires additional files. Because the PDF/A standard stipulates that a PDF must be complete and must not require any other information for its visualization, this PDF cannot be converted to PDF/A. Remedy: It is unusual for external files to be required for the visualization process. In order to trace the cause, a file that produces this error must be checked – with Adobe Preflight, for example – to see which objects are to blame.

■■ Stream object contains FDecodeParams entry: External files are required for the visualization of the file in question (FDecodeParams entry). To be visualized in its entirety, this PDF requires additional files. Because the PDF/A standard stipulates that a PDF must be complete and must not require any other information for its visualization, this PDF cannot be converted to PDF/A. Remedy: It is unusual for external files to be required for the visualization process. In order to trace the cause, a file that produces this error must be checked – with Adobe Preflight, for example – to see which objects are to blame.

■■ Stream object contains FFilter entry: External files are required for the visualization of the file in question (FFilter entry). To be visualized in its entirety, this PDF requires additional files. Because the PDF/A standard stipulates that a PDF must be complete and must not require any other information for its visualization, this PDF cannot be converted to PDF/A. Remedy: It is unusual for external files to be required for the visualization process. In order to trace the cause, a file that produces this error must be checked – with Adobe Preflight, for example – to see which objects are to blame.

■■ Stream size is above 2 GB: A data stream in this file is too large (the maximum permitted size is 2 GB). Extremely large files may lead to rendering problems when the files in question are printed out or displayed on a monitor. The maximum file size and also the size of internal data objects in PDF/A files is therefore limited to 2 GB. Remedy: No repair possible. It may be possible to recreate the PDF file with a more effective compression type.

■■ Subject mismatch between Document Info and XMP metadata: The ‘Subject’ entry in the XMP document information deviates from the corresponding entry in the document properties. The PDF/A standard stipulates that document information must exist in the XMP area. If this data is also contained in the document properties, it must be identical to the entries in the XMP area. Remedy: New PDF/A conversion.
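
The various ‘mismatch between Document Info and XMP metadata’ messages all describe the same kind of comparison. The following Python sketch illustrates the idea under stated assumptions: the two dictionaries are assumed to have been extracted from the PDF by some other tool, the function name find_mismatches is hypothetical, and the key mapping reflects the PDF/A-1 correspondence between Document Info entries and XMP properties as commonly documented. Date entries are left out because the two areas use different date formats and would need to be normalized first.

```python
# Sketch only: the input dictionaries are assumed to come from another tool
# that has already read the Document Info dictionary and the XMP packet.

# Document Info key -> XMP property it is expected to agree with
# (assumed mapping; date entries are omitted on purpose).
INFO_TO_XMP = {
    "Title": "dc:title",
    "Author": "dc:creator",
    "Subject": "dc:description",
    "Keywords": "pdf:Keywords",
    "Producer": "pdf:Producer",
}


def find_mismatches(doc_info, xmp):
    """Return the Document Info keys whose values differ from the XMP values."""
    mismatches = []
    for info_key, xmp_key in INFO_TO_XMP.items():
        if info_key not in doc_info:
            # Only entries that exist in Document Info need an XMP counterpart.
            continue
        if doc_info[info_key] != xmp.get(xmp_key):
            mismatches.append(info_key)
    return mismatches


if __name__ == "__main__":
    info = {"Title": "Annual report", "Subject": "Finance"}
    xmp = {"dc:title": "Annual report", "dc:description": "Finances"}
    print(find_mismatches(info, xmp))
```

An empty result means the compared entries agree. Any key that is returned corresponds to one of the mismatch messages listed in this chapter.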


■■ Text cannot be mapped to Unicode: This message only appears when validating PDF/A-1a (and not when validating PDF/A-1b). This PDF file contains characters that could not be allocated to a Unicode ID since the relevant information is not available in the PDF file. Remedy: In order to resolve this problem, the PDF file must be created anew, using a different font. Alternatively, it might be possible to convert the file to PDF/A-1b instead.

■■ Title mismatch between Document Info and XMP metadata: The ‘Title’ entry in the XMP document information deviates from the corresponding entry in the document properties. The PDF/A standard stipulates that document information must exist in the XMP area. If this data is also contained in the document properties, it must be identical to the entries in the XMP area. Remedy: This information can be opened using the ‘File’ menu and changed in the general Document Properties dialog.

■■ TR2 entry used with value other than Default: Underlying gradation curve. The color values of a page object (text, images, graphics, and so on) can be changed with the aid of gradation curves (or ‘transfer curves’). On the basis of a ‘transfer curve’, a new value is determined for every color value when a document is displayed on a monitor or printed out. This value is then displayed or printed in place of the ‘original value’. Because not all devices can handle transfer curves, they are not permitted in PDF/A files. Remedy: The Preflight ‘Apply transfer curves’ function corrects this error.

■■ Transparency used: Objects in this PDF file are defined as ‘transparent’. The PDF/A standard stipulates that all features used in a PDF file must be displayed in a single unique way on a monitor or in a printout. Because this cannot be ensured in the case of transparent objects and their backgrounds, transparency is not permitted in PDF/A files. Remedy: Adobe Acrobat Professional (Version 6, 7, or 8) includes a flattener module that can be used to remove transparencies.

■■ Type 2 CID font: CIDToGIDMap invalid or missing: Not all characters (glyphs) can be allocated to this font (CIDToGIDMap is missing or incorrect). A font in this PDF does not have a complete allocation table for allocating character codes to character representations in the font. In PDF/A files, the reliable allocation of codes to character representations must be ensured. Remedy: In order to resolve this problem, the PDF file must be created anew, using a different font.

■■ Uses NChannel color: An object in this file uses the NChannel color space, which is not allowed in PDF/A. The document defines colors in the ‘NChannel’ color space, introduced with Acrobat 7 (PDF 1.6). NChannel is an extension of the DeviceN color space, permitted since PDF 1.3. Both color spaces allow the specification of color values in multi-channel color spaces in which spot colors, for example, may also be used. The PDF/A standard does not support objects that were not permitted in the PDF 1.4 specification that was published by Adobe with Acrobat 5. Remedy: Use the Acrobat PDF Optimizer to save the PDF file as a PDF 1.4 version.

■■ Uses OpenType font: OpenType fonts may not be used in PDF/A files. There are various common font formats (PostScript Type1, Type3 or TrueType). They can be embedded in all PDF versions. As of PDF 1.5 (Acrobat 6), OpenType format fonts can also be embedded. This font format is used by one of the fonts embedded in this PDF file. The PDF/A standard only permits the use of objects that are compatible with PDF 1.4. Fonts in OpenType format must therefore not be used or embedded. Remedy: In order to resolve this problem, the PDF file must be created anew, using a different font.

■■ Width information for glyphs incomplete: Width information is missing for some characters in a font used in this document. The letters and other characters used in PDF texts require ‘fonts’ that determine their exact appearance when visualized. The characters used in the text are then depicted and arranged in accordance with the representation stored in the font. The precise position of each character depends upon the tracking of the previous character. The PDF/A standard stipulates that width information must be available for every single character used in a document. Remedy: To resolve this problem, the PDF file must be created anew.

■■ Width information for glyphs is inconsistent: Deviating specifications exist for character width. The letters and other characters used in PDF texts require ‘fonts’ that determine their exact appearance when visualized. The characters used in the text are then depicted and arranged in accordance with the representation stored in the font. The precise position of each character depends upon the width of the preceding symbol. The width specification for any one character is defined both in the font that is embedded in the PDF and in the PDF itself. The PDF/A standard stipulates that the width specifications in the embedded font and in the PDF file must be identical. Remedy: To resolve this problem, the PDF file must be created anew.

■■ Wrong encoding for non-symbolic TrueType font: This font does not use standard character to symbol allocation (MacRoman or WinAnsi). This PDF can therefore not be converted to PDF/A. Remedy: In order to resolve this problem, the PDF file must be created anew, using a different font.
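
Some of the checks in this list are simple enough to illustrate in a few lines. As an example, the ‘PDF contains data after end of file marker’ message can be approximated with the following Python sketch. It is not how Preflight works internally. The function name bytes_after_eof is hypothetical, and the sketch assumes that a single end-of-line sequence directly after the last %%EOF marker is tolerated, which is how the PDF/A-1 rule is usually read.

```python
def bytes_after_eof(data: bytes) -> int:
    """Count the bytes that follow the last %%EOF marker in a PDF file.

    Assumption: one optional end-of-line sequence after the marker is allowed
    and therefore not counted.
    """
    marker = b"%%EOF"
    pos = data.rfind(marker)
    if pos == -1:
        raise ValueError("no %%EOF marker found")
    trailing = data[pos + len(marker):]
    # Strip at most one end-of-line sequence (CR LF, LF, or CR).
    for eol in (b"\r\n", b"\n", b"\r"):
        if trailing.startswith(eol):
            trailing = trailing[len(eol):]
            break
    return len(trailing)


if __name__ == "__main__":
    clean = b"%PDF-1.4\n...\n%%EOF\n"
    dirty = b"%PDF-1.4\n...\n%%EOF\nextra"
    print(bytes_after_eof(clean), bytes_after_eof(dirty))
```

A result of 0 means that nothing follows the marker. Any other value indicates trailing data, which according to the entry above can be removed with the ‘Save As’ command in Acrobat.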

Glossary
Explanation of terms relating to PDF/A
■■ Accessibility: In the digital world, accessibility aims to ensure that users with impaired vision, restricted neuromuscular skills, and other disabilities can also take part in the exchange of information. Web pages and other files must be designed in a way that provides a clear flow structure to help screen readers reproduce their content correctly. PDF files can be both accessible and PDF/A-compliant at the same time. Accessibility is increasingly regulated by legislation in the US and Europe.

■■ Adobe Acrobat: Program for creating and processing PDF files. Version 1 was introduced by → Adobe Systems in 1993. The current version is Version 8. Acrobat Standard, Professional, and Elements offer different features but all belong to the Acrobat family. Acrobat Professional includes Adobe → Distiller, which can create PDF documents from → PostScript and EPS data.

■■ Adobe Reader: Adobe Reader (previously called Acrobat Reader) is Adobe’s free → PDF viewer. This program runs on various computer and mobile-device platforms and has been downloaded from the company’s sites millions of times. The free distribution of this program has contributed to the success of the PDF format. Adobe Reader 8 also allows form files to be saved if the creator of the documents has enabled the function in Acrobat Professional.

Version 8 of the free Adobe Reader can also save form files if the files permit it.

■■ Adobe Systems: US software company founded in 1982 by John Warnock and Charles Geschke. Warnock and Geschke developed the → PostScript format for printing files. The name ‘Adobe’ refers to a type of clay or the brick made from it, and a river called Adobe Creek flows near the company’s headquarters. Their well-known products include Photoshop, Illustrator, InDesign, and Acrobat. PDF was developed by Adobe.

■■ CCITT Group 4: The ‘Comité Consultatif International Téléphonique et Télégraphique’ (International Telephone and Telegraph Consultative Committee) developed this lossless compression procedure for black and white images (line art) for use when sending faxes.

■■ CMYK: This abbreviation stands for Cyan, Magenta, Yellow, and Key (= black). Different sized dots in these four colors can be distributed in various ways to realistically depict most color images as well as graphics and text. However, fluorescent colors and other shades cannot be displayed well using CMYK. Spot colors are used to display these shades.

■■ Color management: This technology aims to enable the uniform display of colors regardless of whether they appear on a monitor or in proofs, newspaper printouts, or art printouts. Color profiles (usually → ICC profiles) are very important for color management – they enable the device-independent display of colors. Color management encompasses all production stages from digitalization using a scanner or digital camera to editing and displaying the result on a monitor or printing it out.

■■ Comments: Also called annotations. Text comments enable sophisticated correction workflows with PDF files such as those required in editorial departments. → Adobe Acrobat enables the exchange of comments with recipients who only have the free → Adobe Reader (current version) at their disposal. PDF/A supports text annotations but prohibits comment types such as Sound and Movie.

■■ Compression: Technical procedure for reducing file size. There are ‘lossy’ compression types such as → JPEG and ‘lossless’ procedures such as → ZIP. PDF can use compression on page description objects, such as embedded


images compressed in JPEG. However, other elements of a PDF file that are not components of the page description can also be compressed. As of PDF 1.6, these elements can even be merged together into a compressed object (cross-object compression).

■■ Conversion: The term ‘conversion’ refers to changing a file from one file format to another.

■■ Digital signature: Electronic signatures are important in many fields of business and administration. They are used to identify the originator of a document as well as the read and usage authorizations of its recipient. Digital signatures must be suitable for encryption. They must also be impossible to falsify. They can be managed and used in PDF documents using programs such as Acrobat and → Adobe Reader. The use of PDF/A with digital signatures requires a precisely planned process flow.

■■ Distiller: The Distiller is an auxiliary program for creating PDF files from the → PostScript print data format. The program has been delivered with → Adobe Acrobat since Version 1. The Distiller enables process flows to be automated by setting up watched folders.

The Distiller enables the creation of PDF/A-1b documents. It is not possible to convert files to PDF/A-1a because the structures required for compliance with the stricter compliance level cannot be adopted or generated.

■■ DMS: This abbreviation stands for Document Management System. It encompasses the management of documents that used to be printed out on paper in electronic systems. DMS is an important component of electronic document archiving systems.

■■ Document scanner: These special devices are for creating large document sets in the shortest amount of time possible. Document scanners enable entire batches of documents to be digitalized (both the front and back of pages). Scanned material is increasingly stored as PDF files. If the files in question are to be archived, it makes sense to save them as PDF/A files.

■■ Document properties: The document properties (document info) for PDF files contain four entries: Title, Author, Subject, and Keywords. These entries constitute basic metadata information. In → Adobe Acrobat and → Adobe Reader, document info can be displayed by pressing Ctrl+D.

Document Properties in Acrobat 8 Professional. This area displays information such as the title and author of a document.

■■ Font: Character set. A font contains all the letters in an alphabet as well as digits and sometimes graphical symbols.

■■ Glyph: A glyph is a graphical representation of a character. A character is an abstract concept of a letter or symbol. A glyph is the actual graphic used to represent it.

■■ ICC profile: ICC profiles are important features of → color management. An ICC profile is a data record that

describes the color space a device (a monitor, printer, scanner, or similar device) uses to specify or reproduce colors. ICC stands for International Color Consortium, a group that consists of manufacturers of graphics, image editing, and layout programs.

■■ Image resolution: A digital image consists of pixels (image elements). The number of pixels per inch determines the quality of an image. Because they carry more information, high-resolution images have larger file sizes. The usual screen resolution is 72 ppi (pixels per inch; 1 inch is 2.54 centimeters). 300 ppi is often used for printing.

The left-hand image above is rendered in 72 ppi; the right-hand image has a resolution of 300 ppi. Both images have been significantly magnified.

■■ ISO: The International Organization for Standardization. This organization compiles international standards including the PDF/A standard (ISO 19005-1:2005). It was founded in 1947 in Geneva. More than 150 countries are now represented within ISO. Its standards are drafted in committees and sub-committees and, once complete, are published in print and digitally.

■■ JBIG: JBIG is a standard for the lossless compression of digital images. Its name is taken from the first letter of each word in the name of its creators, the Joint Bi-level Image Experts Group. JBIG was specially developed for black and white images such as faxes.

■■ JPEG: This abbreviation stands for Joint Photographic Experts Group, the group that developed this image compression procedure. When it was initially developed, this procedure was the first to enable the high image resolutions required for printing while retaining relatively low data volumes. However, it is a ‘lossy’ compression procedure and is not suitable for black and white images. It offers low, medium, high, and maximum quality levels that users can select when they save a document. The lower the quality, the smaller the file. PDF/A supports JPEG compression. JPEG was later joined by → JBIG (for bi-level, that is, black and white images) and → JPEG2000 (improved compression).

■■ JPEG2000: An image compression standard that, like → JPEG, was developed by the Joint Photographic Experts Group. JPEG2000 supports both lossless and lossy compression. This image file format can include a range of metadata that facilitates file management and makes it easier to find images on the Internet. JPEG2000 is not permitted by PDF/A-1a and -1b, but PDF/A-2 will support it.

Acrobat offers a range of different comment types, not all of which are supported by PDF/A. The illustration above displays comment types in Acrobat.

■■ Library: Program library. Unlike programs, a program library does not constitute an independent executable unit.
Instead, it is an auxiliary module that contains elements required to make programs available.

■■ LZW: An older, lossless image compression procedure from the 1970s/1980s. It is named after its creators: Abraham Lempel, Jacob Ziv, and Terry A. Welch. This procedure is not supported by PDF/A because it was subject to licensing restrictions for a considerable period of time.

■■ Metadata: A digital document can have additional data on its properties. This information is called metadata. Metadata provides information on attributes such as the author of a file and the title of a document. It enables documents to be categorized by keywords and supports the addition of copyright information. Adobe products use modern → XMP metadata.

Metadata in a PDF file, displayed in Acrobat Professional.

■■ OCR: This abbreviation stands for Optical Character Recognition. It is an optical text recognition procedure. OCR allows the pixel-based data of scanned documents to be assigned searchable text later on.

■■ Output intent: An output condition or output intent is a component of color management. The output intent can be used to specify the intended output method for a (PDF) file. This is normally realized using an ICC output profile. For example, whereas a PDF file intended for offset printing might use the ‘ISO Coated’ output intent, a file for display on a monitor is better suited to → sRGB. The assignment of an output intent can modify colors in line with the requirements of a different output device. For example, since Version 6, Adobe Acrobat has displayed a PDF with an output intent for offset printing differently from how it would display the same PDF with an output intent for newspaper printing.

■■ PDF: This abbreviation stands for ‘Portable Document Format’. It is a platform-independent, open file format that has been developed by → Adobe Systems since 1993. Like a container, a PDF document can contain diverse elements: images, text, sound, movies, 3D objects, form elements, and many more. The functional scope of PDF is constantly being enhanced. The current version is PDF specification 1.7, which was introduced with → Adobe Acrobat 8.

■■ PDF viewer: Program for displaying PDF documents. In addition to the → Adobe Reader, such programs include the ‘Preview’ program that belongs to the current version of Apple’s operating system. There are both free PDF viewers for various platforms and viewers that must be purchased.

■■ PDF layer: PDF files can contain layers. A better term is ‘Optional Content Group’/OCG, since PDF layers are not layers like those in Photoshop. Instead, they enable content to be made visible or invisible depending on the context defined for a file. PDF layers/OCGs are not permitted in PDF/A files because they enable different representations of the same file.

Layers in PDF: PDF layers (OCGs) can be used to create language layers in PDF files, for example.

■■ PDF version: PDF is constantly being developed. With each new Acrobat version, Adobe publishes a new PDF specification. The document containing the specification is called a ‘PDF Reference’. PDF 1.7 has been available since the rollout of Acrobat 8 (tip: to determine the corresponding Acrobat version, add one to the digit after the decimal point in the PDF version number; for example, PDF 1.3 belongs to Acrobat 4).

■■ PDF/A: Standard developed by → ISO, the International Organization for Standardization, especially for the long-term archiving of PDF files. The PDF/A-1 standard was adopted under the name ISO 19005-1:2005 in 2005. This first version used the PDF 1.4 specification to define the elements that are permitted in PDF/A files. PDF components that were only introduced in later versions of the PDF specification are therefore prohibited in PDF/A files. Such components must be modified or removed. The PDF/A-2 standard, which is already being compiled, will be based on a more recent PDF specification.

■■ PDF/X: Standard developed by → ISO for the prepress field. PDF/X is defined in ISO standards 15929 and 15930. This standard enables the reliable reproduction of PDF print files without the need for protracted talks beforehand. PDF/X-1a and PDF/X-3 are already widespread. PDF/X-1a is intended only for use with → CMYK (and possibly spot colors), whereas PDF/X-3 also permits profiled → RGB.

■■ PDFMaker: The Adobe PDFMaker is a macro installed with → Adobe Acrobat. Among other things, it enables the creation of PDF files from Word. The PDFMaker works with the Acrobat → Distiller to create PDF files, meaning that it has access to all PDF settings for the Adobe program.

■■ Plug-in: A plug-in is an additional module for a main program. These additional modules are often marketed by third-party suppliers. → Adobe Acrobat can be enhanced by many additional functions with plug-ins.

■■ ppi: Unit of measure for image resolution. The abbreviation stands for ‘pixels per inch’. One inch is equal to 2.54 centimeters.

■■ PostScript: PostScript is a page description language that has been developed by the US company → Adobe Systems since 1984. Pages are described in PostScript format so that they can be output on different output devices in any size and resolution without loss. The functional scope of the format has been enhanced twice, and the current version is PostScript Level 3 (available since 1998).

■■ Preflight: This plug-in, which is delivered with Acrobat, is a tool for checking PDF files. It is developed by the Berlin-based company callas software. As of Acrobat 8, Preflight can carry out corrections as well as checking PDFs. Acrobat also uses Preflight to carry out all of its PDF/A validations and conversions. In addition to using the validation and verification profiles delivered with Preflight, users can also create their own profiles.

■■ RGB: This color space consists of the primary colors red, green, and blue and is used for displaying documents on color monitors. At 8 bits per channel, the additive color model has 256 levels (0 to 255) for each of these three basic colors. White is created if all three components have the value 255; black is formed if they all have the value 0.

■■ sRGB: sRGB (standard RGB) color space. It was jointly developed in 1996 by Hewlett-Packard and Microsoft.

The image on the left shows a schematic depiction of the sRGB color space; the image on the right displays the RGB color space.

■■ Tagged PDF: Structured PDF. The content structure of a PDF/A-1a file must be specified using tagged PDF. Tagged PDF is also a prerequisite for accessible PDF files. The corresponding structures can be created in the source document (for example, in InDesign) or can be added later on in the PDF file. Unlike PDF/A-1b files, PDF/A-1a files must contain structural information (tagging).

■■ Tags: Tags help to produce → tagged PDF. For example, an image can be given the tag ‘Figure’ and an alternate image description can be added for accessibility reasons.

■■ TIFF: The file format TIFF (Tagged Image File Format) was developed by Aldus (taken over by Adobe in 1994) and Microsoft for scanned raster graphics for color separation. TIFF can contain layers. → JPEG or → ZIP compression can be used to reduce the file size. The variant for black and white images, → TIFF G4, has been an important format for archiving scanned documents for a long period of time.
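
The relationship between pixel count, resolution in ppi, and physical size described in the ‘Image resolution’ and ‘ppi’ entries can be sketched in a few lines of Python. This is a minimal illustration only; the function names are hypothetical and are not taken from any product mentioned in this book.

```python
CM_PER_INCH = 2.54  # one inch is 2.54 centimeters, as stated in the glossary


def print_size_cm(pixels: int, ppi: float) -> float:
    """Return the physical length in centimeters of `pixels` output at `ppi`."""
    if ppi <= 0:
        raise ValueError("ppi must be positive")
    return pixels / ppi * CM_PER_INCH


def pixels_needed(length_cm: float, ppi: float) -> int:
    """Return the rounded number of pixels needed to cover `length_cm` at `ppi`."""
    if ppi <= 0:
        raise ValueError("ppi must be positive")
    return round(length_cm / CM_PER_INCH * ppi)


if __name__ == "__main__":
    # Width of an A4 page (21 cm) scanned at print and at screen resolution.
    print(pixels_needed(21.0, 300))
    print(pixels_needed(21.0, 72))
```

The same page therefore needs roughly four times as many pixels per side at 300 ppi as at 72 ppi, which is why high-resolution scans produce much larger files.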

Tags in Acrobat Professional: All elements of a tagged PDF file are given marks that clearly assign them to a content type and style. Tags also control their sequence.

■■ TIFF G4: TIFF G4 is a monochrome type of TIFF that is compressed using the → CCITT Group 4 procedure. This type of TIFF file combines high readability of text documents with relatively small document size. This is precisely what is required for archiving purposes.

■■ Transparency: Transparent objects can occur in PDF files. If the opacity of an element is less than 100 percent, the background is visible through the element in question. Transparencies have been supported in PDF documents since PDF 1.4 (Acrobat 5). Transparencies are not permitted by the PDF/A-1 standard.

■■ Unicode: International industry standard promoted by the Unicode Consortium since 1991. It aims to define a digital code for every single character in all known writing systems/character systems. It has different formats, but UTF-8 (8-bit Unicode Transformation Format) is the most common both on the Internet and in common operating systems.

■■ Validation: From the Latin ‘validus’ (‘strong’). A validation is the checking of a hypothesis. It concludes in verification (‘true’), falsification (‘not true’), or in no result. In the context of PDF/A, validation means checking a file that claims to be PDF/A-compliant to see whether this is actually the case.

Not yet validated: The Acrobat Preflight tool can be used to check the validity of PDF/A documents.

■■ XML: Extensible Markup Language (XML) can be used to structure document content.

XML can be used to store structured information in a kind of tree structure.

■■ XMP: This abbreviation stands for Extensible Metadata Platform. XMP is used to integrate → metadata into Adobe programs in a uniform manner.

XMP, developed by Adobe Systems, is used to integrate metadata into PDF/A.

■■ XPS: This abbreviation stands for XML Paper Specification, a Microsoft document format.

■■ ZIP: The ZIP format is an open format for compressing files. ZIP compression is ‘lossless’ and works well for images that contain large areas in one color or in repeating patterns. The use of ZIP compression is permitted for PDF/A documents.
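
The two properties named in the ZIP entry (lossless, and most effective on large uniform areas) can be illustrated with Python’s standard zlib module, which implements the deflate method that ZIP compression is based on. This is a sketch under stated assumptions: the two byte strings are made-up stand-ins for image data, not real image files.

```python
import random
import zlib

# Hypothetical sample data: 10,000 bytes of a single value stand in for a
# large area in one color; pseudo-random bytes stand in for a noisy image.
flat = bytes([255]) * 10_000
rng = random.Random(0)  # fixed seed so the example is reproducible
noisy = bytes(rng.getrandbits(8) for _ in range(10_000))

flat_packed = zlib.compress(flat)
noisy_packed = zlib.compress(noisy)

# Lossless: decompressing returns exactly the original bytes.
assert zlib.decompress(flat_packed) == flat
assert zlib.decompress(noisy_packed) == noisy

# The uniform data shrinks dramatically; the noisy data hardly at all.
print(len(flat), "->", len(flat_packed))
print(len(noisy), "->", len(noisy_packed))
```

Running the sketch shows the compressed size of the uniform data to be a small fraction of that of the noisy data, although both round-trip without any loss.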

About: The PDF/A Competence Center
Association for Digital Document Standards – ADDS

The PDF/A Competence Center is an initiative of the Association for Digital Document Standards (ADDS) e.V., founded in September 2006. A particularly important aim of the association is to promote the exchange of information and experience in the area of long-term archiving in accordance with ISO 19005 (PDF/A).

The new ISO standard for long-term archiving, PDF/A, is generating considerable interest in the market. In order to meet the high demand for information and exchange of ideas concerning PDF/A, callas software GmbH, Compart Systemhaus GmbH, LuraTech Europe GmbH, PDF Tools AG and PDFlib GmbH have founded the PDF/A Competence Center.

The executive chairman is Thomas Zellmann, a managing partner of LuraTech. Dr. Hans Baerfuss, CEO of PDF Tools AG, Switzerland, is the executive vice-chairman.

The association is geared towards developers of PDF solutions, companies that work with PDF/A in the area of DMS/ECM, interested individuals, and also users who want to implement PDF/A in their organizations. Although the months directly after the founding saw new members predominantly from German-speaking regions, the executive committee has been expanding its activities internationally since 2007.

Interested parties can thus benefit from the combined knowledge of competent PDF/A suppliers. The newly founded association offers numerous services including conducting events, working on further standardizations and serving as a central competent point of contact for answering all questions about PDF/A.

Work on the ISO Standard
Several members of the PDF/A Competence Center are technically oriented and actively participate in the further development of the PDF/A standard as members of the responsible ISO committee (ISO TC 171 – Document management applications).

Member companies test each other’s products for compliance with the ISO standard and compatibility in order to guarantee a high level of quality. It is planned to also offer test suites and compliance checks for products from other suppliers. This happens in the context of the Technical Working Group (TWG).

Events around the PDF/A Standard
In order to meet the high informational needs around PDF/A in the market, the PDF/A Competence Center organizes seminars and events in different locations on a regular basis.

For details about current activities, please check the Events page at pdfa.org on the Internet.

AIIM
The Enterprise Content Management Association

AIIM is the international authority on Enterprise Content Management (ECM). ECM comprises the technologies used to capture, manage, store, preserve, and deliver content and documents related to organizational processes. ECM tools and technologies provide solutions to help users with the four C’s of business: Continuity, Collaboration, Compliance, and Costs.

For over 60 years, AIIM has been the leading non-profit organization focused on helping users to understand the challenges associated with managing documents, content, records, and business processes. Today, AIIM is international in scope, independent, implementation-focused, and, as the representative of the entire ECM industry (including users, suppliers, and the channel), acts as the industry’s intermediary.

As a neutral and unbiased source of information, AIIM serves the needs of its members and the industry by providing educational opportunities, professional development, reference and knowledge resources, networking events, and industry advocacy. Information about AIIM can be found at www.aiim.org.

AIIM provides:

■■ Market Education: AIIM provides unbiased information through its ECM Solutions Seminar (held throughout the U.S. and Canada); the Managing Information and Documents Road Show (held throughout the UK); InfoIreland (held in Dublin); AIIM Webinars; AIIM E-DOC Magazine and M-iD (Managing Information and Documents Magazine), the leading industry print publications in North America and the UK; and our online Solution Centers for financial services, healthcare, and state & local government.

■■ Professional Development: AIIM’s industry education road map offers business and government professionals a variety of training opportunities. Our ECM & ERM Certificate Programs provide instruction on the Why?, What?, and How? of Enterprise Content Management and Electronic Records Management via Web-based and/or classroom courses.

■■ Peer Networking: Through chapters, networking groups, programs, partnerships, and the Web, AIIM creates opportunities that allow users, suppliers, consultants, and the channel to engage and connect with one another.

■■ Industry Advocacy: As an ANSI (American National Standards Institute) accredited standards development organization, AIIM acts as the voice of the ECM industry in key standards organizations, with the media, and with government decision makers. Our Industry Watch research reports provide intelligent information about user trends and perceptions.

PDF/A in a Nutshell – Long Term Archiving with PDF

PDF/A is the PDF for long-term archiving. PDF/A, which was adopted at the end of 2005, is the first file format which, since it is an ISO standard, guarantees that documents created today will also be able to be opened and used in the future. ‘PDF/A in a Nutshell’ allows the user to take a look behind the scenes of the standard and provides practical instructions on generating PDF/A that conforms with the stipulations of the standard in his or her working environment. This book also serves as a comprehensive introduction to a subject matter that is still very new as well as providing practical examples for different software tools and industry solutions able to generate and work with PDF/A.

Extracts from the content of the book:
■ Why PDF/A?
■ The PDF/A-1a and PDF/A-1b conformity levels
■ PDF/A with Acrobat 8 Professional
■ Archive PDFs from Microsoft Office 2003 and 2007
■ Scanning documents to create PDF/A and applying text recognition
■ High-volume PDF/A creation
■ Validating PDF/A
■ Accessible PDF/A documents
■ Future-proof contracts
■ Forms in PDF/A
■ Fonts and images in PDF/A
■ Reliable colors on monitors and when printing

The authors:

Olaf Drümmer:
Olaf Drümmer is the co-author of ‘Postscript- und PDF-Bibel’, and has played a crucial role in the standardization of PDF/X (since 1999) and PDF/A (since 2002). He is a member of several international institutions and associations: DIN, ECI, Ghent PDF Workgroup, PDF/A Competence Center, and PDF/X-ready. Olaf Drümmer is CEO of callas software GmbH. callas software develops the Preflight functions integrated in Acrobat since Version 6 (2003).

Alexandra Oettler:
Alexandra Oettler is a technical writer who has worked as a freelance journalist specializing in software for many years. She regularly has articles on DTP software in practice published in specialist prepress journals. She also writes user manuals for PDF and prepress programs and teaches software training courses. As the editor in chief of the pdfnews.de Web site, she provided German-speaking readers with daily information on new products, schedules, and tips between 2001 and 2004.

Dietrich von Seggern:
After completing his university studies in print technology, Dietrich von Seggern worked as a prepress manager. He worked on research projects related to the transmission of digital print data. Later, he became the manager of the digital advertisement transmission department at the marketing organization of the German newspaper publishers (ZMG). He has been working as the head of Product Management at callas software GmbH in Berlin for several years now.

ISBN 978-3-9811648-1-7
