Adobe's Portable Document Format can contain two types of metadata. The first is the Document Information Dictionary, a set of fields such as author, title, subject, and creation and modification dates, which has been a part of the PDF format for (almost) ever. When you view the document properties in Adobe Reader, these are the metadata fields you’re looking at. The second is XMP (Extensible Metadata Platform), an XML-based metadata packet embedded in the file.
Programs such as Microsoft Word, Adobe InDesign and Photoshop allow users to embed metadata when saving or exporting to the PDF format. But what about after the file is created? PDFs are "read only" by default, so you won't be able to edit this metadata in a viewer like Adobe Reader; you'd normally need a commercial editor such as Adobe Acrobat. And what about making changes to large batches?
Like the other formats, there are plenty of tools and workarounds (both free and not-so-free) that can help you with embedding PDF metadata. But as you go deeper into a metadata k-hole, you may find some degree of satisfaction in leveraging the Do-It-Yourself ethic by… uh, doing it yourself.
At our library, we use a great (and free!) command-line tool called ExifTool for reading and manually changing PDF file metadata. ExifTool is platform-independent, so it'll work on a range of operating systems -- we've used it on projects on both Mac and Windows, and the experience is (more or less) comparable.
ExifTool will be scary to some because, as we said, it's a command-line tool. But if you know what you’re doing -- and command-line skills aside, ExifTool is pretty easy to use -- you can take metadata editing into your own damn hands, either for one file or many. (If you want to jump in head first, try the University of Surrey's UNIX Tutorial For Beginners.)
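Let's say, for example, you wanted to add an author name and a title to a PDF that's on a PC. If you have ExifTool already installed, you could open a command window in the folder containing the PDF and type in the following command (the -overwrite_original option tells ExifTool to write the changes straight into the file instead of keeping a backup copy of the original):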
exiftool -Title="Proposal" -Author="Indie Preserves" -overwrite_original Proposal.pdf
We use this process on hundreds of PDFs at a time before they are ingested into our institutional repository. We start with a spreadsheet that automatically copies over file names and descriptive information from another sheet, creating an individual ExifTool command for each file.
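If you'd rather script the batch step than build it in a spreadsheet, here is a minimal Python sketch of the same idea. It is not our actual workflow, just one way to do it. It assumes the descriptive information has been exported to a CSV with columns named filename, title and author. Those column names, the function names and the sample rows are all made up for illustration.

```python
import csv
import io
import subprocess


def build_commands(csv_text):
    """Turn CSV rows into ExifTool argument lists, one per PDF.

    Assumes hypothetical columns named filename, title and author.
    """
    commands = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        commands.append([
            "exiftool",
            f"-Title={row['title']}",
            f"-Author={row['author']}",
            "-overwrite_original",
            row["filename"],
        ])
    return commands


def run_commands(commands):
    """Run each command in turn; needs exiftool installed and on the PATH."""
    for argv in commands:
        subprocess.run(argv, check=True)


# Hypothetical sample data standing in for a spreadsheet export.
SAMPLE = """filename,title,author
Proposal.pdf,Proposal,Indie Preserves
Report.pdf,"Annual Report, Draft",Indie Preserves
"""

if __name__ == "__main__":
    # Print the commands for review instead of running them.
    for argv in build_commands(SAMPLE):
        print(argv)
```

Each command is passed as a list of arguments, so titles with spaces or commas need no extra quoting. Once the printed commands look right, call run_commands(build_commands(...)) to apply them.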
PDFs aren't the only format ExifTool works with -- it will read metadata from over a hundred different file types. (Writing metadata to these files is a different story; you'll want to check out the supported types to see what you can do.)
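Reading works the same way whatever the file type, which makes it easy to script. The sketch below is a rough example rather than a finished tool. It relies on ExifTool's -json option, which prints an array with one object per file, each with a SourceFile field. The function names are our own, and the sample output is a hand-trimmed illustration; real output contains many more fields.

```python
import json
import subprocess


def parse_exiftool_json(output):
    """Map each SourceFile path to its dictionary of metadata tags."""
    records = json.loads(output)
    return {record["SourceFile"]: record for record in records}


def read_metadata(paths):
    """Run `exiftool -json` on the given files; needs exiftool on the PATH."""
    result = subprocess.run(
        ["exiftool", "-json", *paths],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_exiftool_json(result.stdout)


# Hand-written, trimmed example of the JSON shape (illustrative only).
SAMPLE_OUTPUT = (
    '[{"SourceFile": "Proposal.pdf", '
    '"Title": "Proposal", "Author": "Indie Preserves"}]'
)

if __name__ == "__main__":
    metadata = parse_exiftool_json(SAMPLE_OUTPUT)
    for path, tags in metadata.items():
        print(path, tags.get("Title"), tags.get("Author"))
```

Keeping the parsing separate from the ExifTool call means you can try the script on saved output before pointing it at real files.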
Beyond PDFs, ExifTool was born to run on image files, which means you can read and edit formats such as TIFF, JPEG, PSD, PNG, Panasonic RAW, and many others. AVPreserve has a great blog post on analyzing embedded image metadata using ExifTool. For those of you not ready for command-line prime time, two different GUI apps (ExifToolGUI and pyExiftoolGUI) exist to shepherd you through the editing process (though you will still need to have ExifTool downloaded somewhere on your computer).
Metadata enrichment takes a bit of planning and practice, but whatever your technical level, you have the power to make your files as detailed as you want them to be. Go get 'em, tigers!
Hat-tip to David Riecks for the AVPreserve tutorial -- be sure to visit his site Photometadata.org!