Open PDF File

Information, tips and instructions

PDF File Format

This article covers enough of the basics of the PDF format to understand how PDF files work internally and how that affects viewing them on different devices.

If you open a PDF file in a hex editor, or in a text editor such as Sublime Text or Notepad++, you will find that parts of the file are text while other parts are binary. The binary parts are typically either raster images or compressed parts of the PDF representation. The compressed parts can be decompressed with the “qpdf” tool, available from http://qpdf.sourceforge.net/. To decompress a file, type the following on the command line:

qpdf --stream-data=uncompress input_file.pdf output_file.pdf

This converts the compressed internal structures of the PDF file into uncompressed form, so you can read the structure of the file in a hex or text editor.
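To illustrate what "compressed parts" means, here is a minimal Python sketch that inflates Flate-compressed streams by hand. It is not how qpdf works internally. The function name `inflate_streams` and the sample object are made up for this illustration, and the sketch assumes every compressed stream uses the Flate (zlib) filter and that the stream body does not itself contain the word "endstream". A real parser would read the /Length and /Filter entries instead of searching for keywords.

```python
import re
import zlib

# Matches the raw bytes between the "stream" and "endstream" keywords.
# Simplification: a real parser would use the /Length entry instead.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\nendstream", re.DOTALL)


def inflate_streams(pdf_bytes: bytes) -> list[bytes]:
    """Return the decompressed body of every stream that inflates cleanly."""
    decoded = []
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            decoded.append(zlib.decompress(raw))
        except zlib.error:
            # Not Flate data (for example a JPEG image), so skip it.
            continue
    return decoded


# Build a tiny stand-in object so the sketch runs without a real file.
content = b"BT /F13 12 Tf 288 720 Td (ABC) Tj ET"
sample = (
    b"1 0 obj\n<< /Filter /FlateDecode >>\nstream\n"
    + zlib.compress(content)
    + b"\nendstream\nendobj\n"
)

print(inflate_streams(sample))
```

For a real file you would pass the result of `open(path, "rb").read()` instead of `sample`.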

Each page in a PDF is rendered using a two-dimensional, device-independent coordinate system. Each component on a page has coordinates that define its position within the page. Components have the following properties:

  • a transformation matrix, which defines how the component is scaled, rotated and positioned relative to the page
  • a transparency alpha constant, which defines how transparent the component should be
  • a clipping path, which specifies which parts of the component should be clipped during rendering
  • a color space, which defines which colors to use during rendering
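The transformation matrix is the least obvious of these properties, so here is a small Python sketch of how it acts on a point. PDF writes a matrix as six numbers [a b c d e f], and a point (x, y) maps to (a·x + c·y + e, b·x + d·y + f). The function name `apply_matrix` and the sample values are illustrative only.

```python
import math


def apply_matrix(matrix, x, y):
    """Transform point (x, y) by a PDF matrix given as [a, b, c, d, e, f]."""
    a, b, c, d, e, f = matrix
    return (a * x + c * y + e, b * x + d * y + f)


# Scale by 2 in both directions, then move 100 units right and 50 units up.
scale_and_move = [2, 0, 0, 2, 100, 50]
print(apply_matrix(scale_and_move, 10, 20))  # (120, 90)

# Rotate 90 degrees counter-clockwise around the origin.
angle = math.radians(90)
rotate = [math.cos(angle), math.sin(angle), -math.sin(angle), math.cos(angle), 0, 0]
print(apply_matrix(rotate, 1, 0))
```

The rotation result is only approximately (0, 1) because of floating-point rounding.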

Text in a PDF file is rendered by specifying its position and font. Fonts can either be embedded in the PDF file or selected from the 14 standard fonts that every PDF reader following the specification must provide.

BT
/F13 12 Tf
288 720 Td
(ABC) Tj
ET

In the example above, /F13 12 Tf selects the font resource named F13 (Helvetica in this example) at size 12 for displaying text. The operator name is case-sensitive, so it must be written Tf. 288 720 Td specifies the coordinates where the text should be displayed. (ABC) Tj specifies that ABC should be displayed with the properties defined above.
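As a rough illustration of how a reader interprets such a text object, the Python sketch below pulls the font name, size, position and string out of the snippet. It is a toy pattern match written for this one example, not a real content-stream parser, and the name `describe_text_block` is hypothetical. It assumes the operators are spelled with their exact case (Tf, Td, Tj) and that the page uses the default unit of 1/72 inch, so the position can also be shown in inches.

```python
import re

# Toy pattern for exactly this shape of text object: font, position, one string.
TEXT_BLOCK_RE = re.compile(
    r"BT\s+"
    r"/(?P<font>\S+)\s+(?P<size>[\d.]+)\s+Tf\s+"
    r"(?P<x>-?[\d.]+)\s+(?P<y>-?[\d.]+)\s+Td\s+"
    r"\((?P<text>[^)]*)\)\s+Tj\s+"
    r"ET"
)


def describe_text_block(snippet):
    """Return the font, size, position and text of a simple BT ... ET block."""
    m = TEXT_BLOCK_RE.search(snippet)
    if m is None:
        return None
    x, y = float(m["x"]), float(m["y"])
    return {
        "font": m["font"],
        "size": float(m["size"]),
        "position_pt": (x, y),
        # Assumes the default user space unit of 1/72 inch.
        "position_in": (x / 72, y / 72),
        "text": m["text"],
    }


snippet = "BT\n/F13 12 Tf\n288 720 Td\n(ABC) Tj\nET"
print(describe_text_block(snippet))
```

Under these assumptions the text starts 4 inches from the left edge and 10 inches from the bottom edge of the page.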

A PDF file can also include interactive elements, which are specified using the AcroForms or Adobe XML Forms Architecture formats. The PDF file format specification, ISO 32000-1:2008, can be found at https://www.adobe.com/content/dam/acom/en/devnet/pdf/pdfs/PDF32000_2008.pdf.

 


PDF Quick Info
Portable Document Format
MIME Types
  • application/pdf
  • application/x-pdf
  • text/x-pdf
Identifying Characters
Hex: 25 50 44 46 2D 31 2E 
ASCII: %PDF-1.
PDF File Opens with
  • Acrobat Reader
  • Sumatra PDF
  • Foxit Reader
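
The identifying characters listed above can be used to check whether a file looks like a PDF before trying to open it. The Python sketch below is a minimal example. The names `PDF_SIGNATURE`, `looks_like_pdf` and `file_looks_like_pdf` are made up for illustration, and the check only covers the `%PDF-1.` signature given in this article.

```python
# The seven identifying bytes from the quick info: 25 50 44 46 2D 31 2E.
PDF_SIGNATURE = bytes.fromhex("25 50 44 46 2D 31 2E")  # b"%PDF-1."


def looks_like_pdf(data: bytes) -> bool:
    """Return True if the data begins with the %PDF-1. signature."""
    return data.startswith(PDF_SIGNATURE)


def file_looks_like_pdf(path: str) -> bool:
    """Read only the first few bytes of a file and check the signature."""
    with open(path, "rb") as handle:
        return looks_like_pdf(handle.read(len(PDF_SIGNATURE)))


print(looks_like_pdf(b"%PDF-1.7\n1 0 obj\n"))  # True
print(looks_like_pdf(b"PK\x03\x04"))           # False
```

A matching signature only suggests that the file is a PDF. It does not guarantee that the rest of the file is valid.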