The Adobe Portable Document Format (PDF, specification v. 1.3) defines the root of a document's object hierarchy as the "catalog dictionary". The catalog contains the necessary references to objects and data that compose the document contents and their attributes. Also, it contains directives to define how the document should be rendered (that is, displayed to the user).

The catalog can hold the following key entries: Type, Pages, PageLabels, Names, Dests, ViewerPreferences, PageLayout, PageMode, Outlines, Threads, OpenAction, URI, AcroForm, StructTreeRoot and SpiderInfo.

The current specification is affected by a design flaw: a rogue Pages entry or malicious catalog dictionary leads to unexpected conditions. The specification apparently does not contemplate this case and assumes that the PDF contains valid references to its page tree node and other objects. Thus, when an invalid page tree node or object is referenced, the application behavior is undefined. Potential conditions include, but aren't limited to: memory corruption (dereferencing invalid pointers, stack overflow via recursion, heap-based overflow), memory leaks and denial of service (e.g. an infinite loop while parsing the page tree).

1 0 obj
<< /Type /Catalog
/Pages 2 0 R
/Outlines 3 0 R
/PageMode /UseOutlines
>>
endobj
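
A minimal sketch, in Python, of the kind of validation that the flaw described above makes necessary: resolve the catalog's Pages reference and refuse to continue if it does not lead to a page tree node. The in-memory model is an assumption made for illustration only (a hypothetical Ref tuple for indirect references, plain dicts for PDF dictionaries, and objects 2 and 3 filled in with made-up contents); it is not how any of the affected readers store objects.

```python
from typing import NamedTuple


class Ref(NamedTuple):
    """Indirect reference such as `2 0 R` (hypothetical in-memory form)."""
    num: int
    gen: int


def resolve_pages_root(objects, catalog_ref):
    """Return the page tree root dict, or raise ValueError for a rogue catalog."""
    catalog = objects.get(catalog_ref)
    if not isinstance(catalog, dict) or catalog.get("Type") != "Catalog":
        raise ValueError("catalog is missing or is not a /Catalog dictionary")

    pages_ref = catalog.get("Pages")
    if not isinstance(pages_ref, Ref):
        raise ValueError("/Pages is missing or is not an indirect reference")

    root = objects.get(pages_ref)
    if not isinstance(root, dict):
        raise ValueError("/Pages points to a missing or non-dictionary object")
    if root.get("Type") != "Pages":
        raise ValueError("/Pages target is not a page tree node")
    if not isinstance(root.get("Kids"), list):
        raise ValueError("page tree root has no /Kids array")
    return root


# The catalog from the example above; objects 2 and 3 are assumed contents.
objects = {
    Ref(1, 0): {"Type": "Catalog", "Pages": Ref(2, 0),
                "Outlines": Ref(3, 0), "PageMode": "UseOutlines"},
    Ref(2, 0): {"Type": "Pages", "Kids": [], "Count": 0},
    Ref(3, 0): {"Type": "Outlines", "Count": 0},
}
root = resolve_pages_root(objects, Ref(1, 0))
```

Insisting on /Type /Pages is a design choice of this sketch. Some real files omit the Type entry, so a reader that wants to stay tolerant would have to validate the node's shape in some other way instead of rejecting it.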

The following description of the Page Tree node is provided in the PDF format specification:

The pages of a document are accessed through a structure known as the page tree, which defines their ordering within the document. The tree structure allows PDF viewer applications to quickly open a document containing thousands of pages using only limited memory. The tree contains nodes of two types—intermediate nodes, called page tree nodes, and leaf nodes, called page objects—whose form is described in the sections below. Viewer applications should be prepared to handle any form of tree structure built of such nodes. The simplest structure would consist of a single page tree node that references all of the document’s page objects directly; however, to optimize the performance of viewer applications, the Acrobat Distiller and PDF Writer programs construct trees of a particular form, known as balanced trees. Further information on this form of tree can be found in Data Structures and Algorithms, by Aho, Hopcroft, and Ullman (see the Bibliography).

Note that memory footprint is one of the reasons the page tree is handled this way. At this point, it's worth noting that some notable vulnerabilities in the past have been related strictly to inconsistencies in the design of a particular file format. One example is TIFF, which became popular during the "TIFF Fuck-Up" due to the number of security flaws found in different implementations. TIFF is flawed by design.

Affected versions

This issue has been verified in different PDF reading applications:

- Apple Mac OS X 3.0.8 (409)
- Adobe Acrobat Reader 7.0 - 5.0 and previous.
  - 8.0.0 is apparently not affected.
  - GNU/Linux
  - Microsoft Windows
  - Mac OS X
- xpdf 3.0.1 (patch 2)
  - Note: Also affects software based on its source
    (gv, kpdf, poppler, etc.).

Note: other applications not listed here might be affected by this issue, due to the fact that this is a design flaw and not a particular vulnerability for a concrete application. Feel free to contact me if you have found another application and/or version affected.

Proof of concept, exploit or instructions to reproduce

The provided PDF proof of concept can be used to demonstrate this issue with a rogue Pages entry defined in the catalog dictionary. Any existing PDF can be modified to reproduce the issue on the different affected applications.

$ open MOAB-06-01-2007.pdf

The effects might differ depending on the application; please read the "Exploitation conditions" section for more information.

Debugging information

The following debugging information corresponds to the results of opening the malicious PDF in Apple (for Mac OS X, version 3.0.8):

Attaching to program: `/Applications/', process 4438.
Reading symbols for shared libraries .................. done
0x90009857 in mach_msg_trap ()
(gdb) c
Program received signal SIGINT, Interrupt.
0x90469c5e in is_page_tree_node ()
(gdb) info registers 
eax            0x1      1
ecx            0xbfffeeec       -1073746196
edx            0x31a559 3253593
ebx            0x90469c30       -1874420688
esp            0xbfffee80       0xbfffee80
ebp            0xbfffef08       0xbfffef08
esi            0x0      0
edi            0x1      1
eip            0x90469c5e       0x90469c5e <is_page_tree_node+60>
(gdb) back
#0  0x90469c5e in is_page_tree_node ()
#1  0x90469b8b in CGPDFReaderGetPageDictionary ()
#2  0x9046994d in CGPDFPageCreate ()
#3  0x9041d15e in CGPDFDocumentGetPage ()
#4  0x95ab1606 in -[PDFPage initWithDocument:index:] ()
#5  0x95ab1397 in -[PDFDocument setPDFRef:] ()
#6  0x95ab10cf in -[PDFDocument initWithURL:] ()
#7  0x0001979c in ?? ()
#8  0x0001961c in ?? ()
#9  0x9347cfe8 in -[NSDocument readFromURL:ofType:error:] ()
#10 0x9347c6bd in -[NSDocument initWithContentsOfURL:ofType:error:] ()
#11 0x000046c9 in ?? ()
#12 0x00003a05 in ?? ()
#13 0x9335bd88 in -[NSApplication sendAction:to:from:] ()
#14 0x93409ce7 in -[NSMenu performActionForItemAtIndex:] ()
#15 0x93409a29 in -[NSCarbonMenuImpl performActionWithHighlightingForItemAtIndex:] ()
#16 0x9333ae16 in _NSHandleCarbonMenuEvent ()
#17 0x9326e7fc in _DPSNextEvent ()
#18 0x9326e056 in -[NSApplication nextEventMatchingMask:untilDate:inMode:dequeue:] ()
#19 0x93267ddb in -[NSApplication run] ()
#20 0x9325bd2f in NSApplicationMain ()
#21 0x0003a152 in ?? ()
#22 0x0003a079 in ?? ()

(gdb) x/10x $ecx
0xbfffeeec:     0x0031a559      0x01800000      0x0000002c      0xbfffef78
0xbfffeefc:     0x90469a79      0x00000000      0x00000001      0xbfffef68
0xbfffef0c:     0x90469b8b      0x00399d20

(gdb) printf "%s\n", 0x0031a559
(gdb) x/10s 0x0031a559
0x31a559:        "Pages"
0x31a55f:        "c5Kids"
0x31a566:        "df\a\001"
0x31a56b:        ""
0x31a56c:        " ?9"
0x31a570:        "\025Count"
0x31a577:        ""
0x31a578:        "??1"
0x31a57c:        "\001"
0x31a57e:        ""

(gdb) x/i $ebx
0x90469c30 <is_page_tree_node+14>:      mov    %eax,-44(%ebp)
(gdb) x/x $ebp-44
0xbfffeedc:     0x003e23d0
(gdb) x/x 0x003e23d0
0x3e23d0:       0x00000001

(gdb) x/4i $eip
0x90469c5e <is_page_tree_node+60>:      lea    2870984(%ebx),%ecx
0x90469c64 <is_page_tree_node+66>:      mov    %ecx,%edi
0x90469c66 <is_page_tree_node+68>:      mov    %edx,%esi
0x90469c68 <is_page_tree_node+70>:      mov    $0x6,%ecx

See "Exploitation conditions" for examples of other affected software and other conditions caused by rogue catalog entries.


Exploitation conditions

Due to the nature of this issue, many different conditions take place, depending on the abused application and platform. The following debugging session shows xpdf's pdfinfo running on a Fedora Core 5 installation, x86_64, after reading the malicious PDF file:

[lmh@lab05 ~]$ gdb /usr/bin/pdfinfo 
GNU gdb Red Hat Linux (
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu"...
Using host libthread_db library "/lib64/".

(gdb) r -v
Starting program: /usr/bin/pdfinfo -v
pdfinfo version 3.00
Copyright 1996-2004 Glyph & Cog, LLC

Program exited with code 0143.
(gdb) r MOAB-06-01-2007.pdf
Starting program: /usr/bin/pdfinfo MOAB-06-01-2007.pdf

Program received signal SIGSEGV, Segmentation fault.
0x00000032b006eb78 in _int_malloc () from /lib64/

(gdb) back
#0  0x00000032b006eb78 in _int_malloc () from /lib64/
#1  0x00000032b007089d in malloc () from /lib64/
#2  0x00002aaaaaba0ea5 in gmalloc (size=Variable "size" is not available.
) at gmem.c:92
#3  0x00002aaaaaba0eff in copyString (s=0x52f6b4 "[") at gmem.c:227
#4  0x00002aaaaab6ea15 in Lexer::getObj (this=0x52f690, obj=0x4f87c20, 
    objNum=-1) at Object.h:102
#5  0x00002aaaaab77e90 in Parser::getObj (this=0x4f87c00, obj=0x7fffe094f220, 
    fileKey=0x0, keyLength=0, objNum=3, objGen=0) at
#6  0x00002aaaaab78210 in Parser::getObj (this=0x4f87c00, obj=0x7fffe094f330, 
    fileKey=0x0, keyLength=0, objNum=3, objGen=0) at
#7  0x00002aaaaab82d60 in XRef::fetch (this=0x528530, num=3, gen=0, 
    obj=0x7fffe094f330) at
#8  0x00002aaaaab22a6d in Catalog::readPageTree
    (this=0x529ad0, pagesDict=Variable "pagesDict" is not available.)
    at Object.h:228
#9  0x00002aaaaab22aa5 in Catalog::readPageTree
    (this=0x529ad0, pagesDict=Variable "pagesDict" is not available.)
#10 0x00002aaaaab22aa5 in Catalog::readPageTree
    (this=0x529ad0, pagesDict=Variable "pagesDict" is not available.)

(gdb) list gmem.c:227
222     #endif
224     char *copyString(char *s) {
225       char *s1;
227       s1 = (char *)gmalloc(strlen(s) + 1);
228       strcpy(s1, s);
229       return s1;
230     }

(gdb) list
260 ;
261           ++start;
262         // This should really be isDict("Pages"), but I've seen at least one
263         // PDF file where the /Type entry is missing.
264         } else if (kid.isDict()) {
265           if ((start = readPageTree(kid.getDict(), attrs1, start))
266               < 0)
267             goto err2;
268         } else {
269           error(-1, "Kid object (page %d) is wrong type (%s)",

(gdb) select-frame 3
(gdb) x/3i $rip
0x2aaaaaba0eff <copyString+31>: mov    %r12,%rsi
0x2aaaaaba0f02 <copyString+34>: mov    %rax,%rbx
0x2aaaaaba0f05 <copyString+37>: mov    %rax,%rdi
(gdb) x/1s $r12
0x52f6b4:        "R"

Actually, exploitation of this issue for arbitrary code execution is possible. Again, this depends on the application and the condition caused by the rogue catalog / page tree node.
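
The repeated Catalog::readPageTree frames in the backtrace above are consistent with a page tree walk that keeps descending into Kids without checking whether a node has already been visited. The following Python sketch shows one way such a walk could be bounded. It is an illustration under assumptions, not xpdf's code or any vendor's fix: objects are modelled as a dict keyed by object number (generation numbers ignored), Kids as lists of object numbers, and collect_pages and the depth cap of 64 are names and values chosen for this example.

```python
def collect_pages(objects, root_num, max_depth=64):
    """Return leaf page object numbers in document order.

    Raises ValueError on a missing node, a node reached twice
    (a cycle or a shared kid) or a tree nested deeper than max_depth.
    """
    pages = []
    seen = set()
    stack = [(root_num, 0)]  # explicit stack, so a rogue file cannot exhaust ours
    while stack:
        num, depth = stack.pop()
        if not isinstance(num, int):
            raise ValueError("kid is not an object number")
        if num in seen:
            raise ValueError(f"object {num} is referenced twice in the page tree")
        seen.add(num)
        if depth > max_depth:
            raise ValueError("page tree is nested too deeply")

        node = objects.get(num)
        if not isinstance(node, dict):
            raise ValueError(f"object {num} is missing or not a dictionary")
        if node.get("Type") == "Page":
            pages.append(num)  # leaf node
            continue

        kids = node.get("Kids")
        if not isinstance(kids, list):
            raise ValueError(f"object {num} has no /Kids array")
        # Push in reverse so that kids are popped in document order.
        for kid in reversed(kids):
            stack.append((kid, depth + 1))
    return pages


tree = {
    2: {"Type": "Pages", "Kids": [4, 5]},
    4: {"Type": "Pages", "Kids": [6, 7]},
    5: {"Type": "Page"},
    6: {"Type": "Page"},
    7: {"Type": "Page"},
}
print(collect_pages(tree, 2))  # [6, 7, 5]

# An assumed rogue node whose /Kids array points back at itself.
rogue = {2: {"Type": "Pages", "Kids": [2]}}
try:
    collect_pages(rogue, 2)
except ValueError as exc:
    print("rejected:", exc)
```

Because every node in a well-formed page tree has exactly one parent, a single "seen" set is enough here: any object reached a second time is either part of a cycle or shared between two parents, and both are rejected.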

Workaround or temporary solution

Don't open untrusted PDF files, and don't use any plug-in or browser extension that allows remote PDF files to be loaded automatically. A temporary solution might be to use Adobe Acrobat Reader 8.0.0, but it may be affected by other issues as well.
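
For files that must be handled anyway, a rough pre-screening pass can be sketched in Python. This is a heuristic under strong assumptions, not a PDF parser and not a complete defense: it only sees objects stored as plain text in the file body (nothing inside compressed object streams), it ignores generation numbers, and it flags only one assumed form of rogue node, namely /Kids arrays that form a cycle. It does not reproduce or depend on the contents of MOAB-06-01-2007.pdf, and has_kids_cycle is a name invented for this example.

```python
import re

# Object bodies, /Kids arrays and indirect references, matched on raw bytes.
OBJ_RE = re.compile(rb"(\d{1,10})\s+\d{1,10}\s+obj\b(.*?)\bendobj", re.DOTALL)
KIDS_RE = re.compile(rb"/Kids\s*\[(.*?)\]", re.DOTALL)
REF_RE = re.compile(rb"(\d{1,10})\s+\d{1,10}\s+R\b")


def has_kids_cycle(data: bytes) -> bool:
    """Return True if the /Kids references found in data form a cycle."""
    kids_of = {}
    for obj in OBJ_RE.finditer(data):
        kids = KIDS_RE.search(obj.group(2))
        if kids:
            kids_of[int(obj.group(1))] = [
                int(ref.group(1)) for ref in REF_RE.finditer(kids.group(1))
            ]

    done = set()  # nodes whose subtrees were fully explored without a cycle
    for start in kids_of:
        if start in done:
            continue
        on_path = {start}
        stack = [(start, iter(kids_of[start]))]  # iterative depth-first search
        while stack:
            node, remaining = stack[-1]
            kid = next(remaining, None)
            if kid is None:
                stack.pop()
                on_path.discard(node)
                done.add(node)
            elif kid in on_path:
                return True  # a kid points back at one of its ancestors
            elif kid in kids_of and kid not in done:
                on_path.add(kid)
                stack.append((kid, iter(kids_of[kid])))
    return False


sample = (
    b"%PDF-1.3\n"
    b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"
    b"2 0 obj\n<< /Type /Pages /Kids [3 0 R] /Count 1 >>\nendobj\n"
    b"3 0 obj\n<< /Type /Page /Parent 2 0 R >>\nendobj\n"
)
print(has_kids_cycle(sample))
print(has_kids_cycle(sample.replace(b"/Kids [3 0 R]", b"/Kids [2 0 R]")))
```

A clean result from a scan like this says nothing about other rogue catalog entries, so it complements the advice above rather than replacing it.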