DartConnect

After investigating the security of PDF signatures, we had a deeper look at PDF encryption. In cooperation with our friends from Münster University of Applied Sciences, we discovered severe weaknesses in the PDF encryption standard which lead to full plaintext exfiltration in an active-attacker scenario.

To guarantee confidentiality, PDF files can be encrypted. This enables the secure transfer and storage of sensitive documents without any further protection mechanisms.

The key management between the sender and recipient may be password based (the recipient must know the password used by the sender, or it must be transferred to them through a secure channel) or public key based (i.e., the sender knows the X.509 certificate of the recipient).

In this research, we analyze the security of encrypted PDF files and show how an attacker can exfiltrate the content without having the corresponding keys.

So what is the problem?

The security problems known as PDFex discovered by our research can be summarized as follows:

Even without knowing the corresponding password, the attacker possessing an encrypted PDF file can manipulate parts of it.
More precisely, the PDF specification allows the mixing of ciphertexts with plaintexts. In combination with further PDF features which allow the loading of external resources via HTTP, the attacker can run direct exfiltration attacks once a victim opens the file.
PDF encryption uses the Cipher Block Chaining (CBC) encryption mode with no integrity checks, which implies ciphertext malleability.
This allows us to create self-exfiltrating ciphertext parts using CBC malleability gadgets. We use this technique not only to modify existing plaintext but to construct entirely new encrypted objects.

Who uses PDF Encryption?

PDF encryption is widely used. Prominent companies like Canon and Samsung apply PDF encryption in document scanners to protect sensitive information.

Further providers like IBM offer PDF encryption services for PDF documents and other data (e.g., confidential images) by wrapping them into PDF. PDF encryption is also supported in different medical products to transfer health records, for example Innoport, Ricoh, Rimage.

Due to the shortcomings regarding the deployment and usability of S/MIME and OpenPGP email encryption, some organizations use special gateways to automatically encrypt email messages as encrypted PDF attachments, for example CipherMail, Encryptomatic, NoSpamProxy. The password to decrypt these PDFs can be transmitted over a second channel, such as a text message (i.e., SMS).

Technical details of the attacks

We developed two different attack classes on PDF Encryption: Direct Exfiltration and CBC Gadgets.

Attack 1: Direct Exfiltration (Attack A)

The idea of this attack is to abuse the partial encryption feature by modifying an encrypted PDF file. As soon as the file is opened and decrypted by the victim, sensitive content is sent to the attacker. Encrypted PDF files do not have integrity protection. Thus, an attacker can modify the structure of encrypted PDF documents, add unencrypted objects, or wrap encrypted parts into a context controlled by the attacker.

In the given example, the attacker abuses the flexibility of the PDF encryption standard to define certain objects as unencrypted. The attacker modifies the Encrypt dictionary (6 0 obj) in a way that the document is partially encrypted – all streams are left AES256 encrypted while strings are defined as unencrypted by setting the Identity filter. Thus, the attacker can freely modify strings in the document and add additional objects containing unencrypted strings.

The content to be exfiltrated is left encrypted, see Contents (4 0 obj) and EmbeddedFile (5 0 obj). The most relevant object for the attack is the definition of an Action, which can submit a form, invoke a URL, or execute JavaScript. The Action references the encrypted parts as content to be included in requests and can thereby be used to exfiltrate their plaintext to an arbitrary URL. The execution of the Action can be triggered automatically once the PDF file is opened (after the decryption) or via user interaction, for example, by clicking within the document.

This attack has three requirements to be successful. While all requirements are PDF standard compliant, they have not necessarily been implemented by every PDF application:

Partial encryption: Partially encrypted documents based on Crypt Filters like the Identity filter or based on other less supported methods like the None encryption algorithm.
Cross-object references: It must be possible to reference and access encrypted string or stream objects from unencrypted attacker-controlled parts of the PDF document.
Exfiltration channel: One of the interactive features allowing the PDF reader to communicate via the Internet must exist, with or without user interaction. Such features are PDF Forms, Hyperlinks, or JavaScript.

Please note that the attack does not abuse any cryptographic issues, so that there are no requirements to the underlying encryption algorithm (e.g., AES) or the encryption mode (e.g., CBC).
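The partial-encryption requirement can be illustrated from the defender's side. The following is a minimal sketch, not a complete PDF parser: it scans the raw bytes of a file for crypt filter entries that are explicitly set to Identity. The function name and the sample Encrypt dictionary are hypothetical, and the heuristic misses cases such as omitted entries or dictionaries stored inside compressed object streams.

```python
import re

# Entries of the Encrypt dictionary that select a crypt filter for
# streams (StmF), strings (StrF), and embedded files (EFF).
IDENTITY_FILTER = re.compile(rb"/(StmF|StrF|EFF)\s*/Identity\b")


def find_identity_crypt_filters(pdf_bytes: bytes) -> list:
    """Return the crypt filter entries that are explicitly set to Identity."""
    hits = {m.group(1).decode("ascii") for m in IDENTITY_FILTER.finditer(pdf_bytes)}
    return sorted(hits)


# Hypothetical Encrypt dictionary resembling the scenario described above:
# streams stay AES-encrypted, strings are declared unencrypted.
sample = (
    b"6 0 obj\n"
    b"<< /Filter /Standard /V 5 /CF << /StdCF << /CFM /AESV3 >> >>\n"
    b"   /StmF /StdCF /StrF /Identity >>\n"
    b"endobj\n"
)

suspicious = find_identity_crypt_filters(sample)
print(suspicious)
```

A non-empty result only signals that the document mixes encrypted and unencrypted parts. Whether that mix is malicious still has to be judged from the rest of the file.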

In the following, we show three techniques an attacker can use to exfiltrate the content.

Exfiltration via PDF Forms (A1)

The PDF standard allows a document's encrypted streams or strings to be defined as values of a PDF form to be submitted to an external server. This can be done by referencing their object numbers as the values of the form fields within the Catalog object, as shown in the example on the left side. The value of the PDF form points to the encrypted data stored in 2 0 obj.

To make the form auto-submit itself once the document is opened and decrypted, an OpenAction can be applied. Note that the object which contains the URL (http://p.df) for form submission is not encrypted and completely controlled by the attacker. As a result, as soon as the victim opens the PDF file and decrypts it, the OpenAction will be executed by sending the decrypted content of 2 0 obj to (http://p.df).

Exfiltration via Hyperlinks (A2)

If forms are not supported by the PDF viewer, there is a second method to achieve direct exfiltration of a plaintext. The PDF standard allows setting a "base" URI in the Catalog object used to resolve all relative URIs in the document.

This enables an attacker to define the encrypted part as a relative URI to be leaked to the attacker's web server. Therefore the base URI will be prepended to each URI called within the PDF file. In the given example, we set the base URI to (http://p.df).

The plaintext can be leaked by clicking on a visible element such as a link, or without user interaction by defining a URI Action to be automatically performed once the document is opened.

In the given example, we define the base URI within an Object Stream, which allows objects of arbitrary type to be embedded within a stream. This construct is a standard compliant method to put unencrypted and encrypted strings within the same document. Note that for this attack variant, only strings can be exfiltrated due to the specification, but not streams; (relative) URIs must be of type string. However, fortunately (from an attacker's point of view), all encrypted streams in a PDF document can be re-written and defined as hex-encoded strings using the hexadecimal string notation.

Nevertheless, the attack has some notable drawbacks compared to Exfiltration via PDF Forms:

The attack is not silent. While forms are usually submitted in the background (by the PDF viewer itself), to open hyperlinks, most applications launch an external web browser.
Compared to HTTP POST, the length of HTTP GET requests, as invoked by hyperlinks, is limited to a certain size.
PDF viewers do not necessarily URL-encode binary strings, making it difficult to leak compressed data.
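The mechanics behind these drawbacks can be sketched with Python's standard library. The example below resolves a relative URI against the base URI from the text, writes bytes in PDF hexadecimal string notation, and shows what correct URL-encoding of binary data would look like. The helper name pdf_hex_string and the sample byte values are assumptions made for illustration. Real viewers may resolve and encode differently, which is exactly the limitation described above.

```python
from urllib.parse import quote, urljoin


def pdf_hex_string(data: bytes) -> str:
    """Write bytes in PDF hexadecimal string notation, e.g. <789C>."""
    return "<" + data.hex().upper() + ">"


# A relative URI is resolved against the base URI, so the relative part
# (the decrypted string) ends up in the request to the attacker's server.
base_uri = "http://p.df/"
resolved = urljoin(base_uri, "secret-text")

# Arbitrary sample bytes standing in for compressed data.
binary = b"\x78\x9c\x00\xff"
hex_notation = pdf_hex_string(binary)

# A viewer that URL-encodes correctly would send the bytes like this.
# A viewer that does not may truncate or mangle them instead.
encoded = quote(binary)

print(resolved)
print(hex_notation)
print(encoded)
```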

Exfiltration via JavaScript (A3)

The PDF JavaScript reference allows JavaScript code within a PDF document to directly access arbitrary string/stream objects within the document and leak them with functions such as *getDataObjectContents* or *getAnnots*.

In the given example, the stream object 7 is given a Name (x), which is used to reference and leak it with a JavaScript action that is automatically triggered once the document is opened. The attack has some advantages compared to Exfiltration via PDF Forms and Exfiltration via Hyperlinks, such as the flexibility of an actual programming language.

It must, however, be noted that – while JavaScript actions are part of the PDF specification – various PDF applications have limited JavaScript support or disable it by default (e.g., Perfect PDF Reader).

Attack 2: CBC Gadgets (Attack B)

Not all PDF viewers support partially encrypted documents, which makes them immune to direct exfiltration attacks. However, because PDF encryption generally defines no authenticated encryption, attackers may use CBC gadgets to exfiltrate plaintext. The basic idea is to modify the plaintext data directly within an encrypted object, for example, by prefixing it with a URL. The CBC gadget attack thus does not necessarily require cross-object references.

Note that all gadget-based attacks modify existing encrypted content or create new content from CBC gadgets. This is possible due to the malleability property of the CBC encryption mode.
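The malleability property can be demonstrated in a few lines. The sketch below uses a toy Feistel cipher built from SHA-256 as a stand-in for AES, purely so that the example runs with the standard library. The toy cipher, the function names, and the sample plaintexts are assumptions and not part of the original research. The point is that the gadget is built from ciphertext and known plaintext only, without the key.

```python
import hashlib
import os

BLOCK = 16  # block size in bytes, as for AES


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def _round(key: bytes, rnd: int, half: bytes) -> bytes:
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:8]


def enc_block(key: bytes, block: bytes) -> bytes:
    # Toy 4-round Feistel permutation (stand-in for AES, NOT secure).
    l, r = block[:8], block[8:]
    for rnd in range(4):
        l, r = r, xor(l, _round(key, rnd, r))
    return l + r


def dec_block(key: bytes, block: bytes) -> bytes:
    l, r = block[:8], block[8:]
    for rnd in reversed(range(4)):
        l, r = xor(r, _round(key, rnd, l)), l
    return l + r


def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    out, prev = [iv], iv
    for i in range(0, len(plaintext), BLOCK):
        prev = enc_block(key, xor(plaintext[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)  # IV followed by the ciphertext blocks


def cbc_decrypt(key: bytes, data: bytes) -> bytes:
    blocks = [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]
    return b"".join(xor(dec_block(key, c), p) for p, c in zip(blocks, blocks[1:]))


# Victim side: encrypt a block whose plaintext the attacker knows.
key, iv = os.urandom(16), os.urandom(16)
known = b"known plaintext!"
ciphertext = cbc_encrypt(key, iv, known + b"top secret data".ljust(BLOCK, b"."))

# Attacker side: no key needed. XOR the known plaintext out of the
# preceding block (here the IV) and XOR the chosen plaintext in.
chosen = b"http://p.df/?x=x"
assert len(known) == len(chosen) == BLOCK
c0, c1 = ciphertext[:BLOCK], ciphertext[BLOCK:2 * BLOCK]
gadget = xor(xor(c0, known), chosen) + c1

decrypted_gadget = cbc_decrypt(key, gadget)
print(decrypted_gadget)
```

When such a gadget is placed in the middle of a longer ciphertext, the block that follows the modified block decrypts to unpredictable bytes, which is where the random data mentioned in the later sections comes from.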

This attack has two necessary preconditions:

Known plaintext: To manipulate an encrypted object using CBC gadgets, a known plaintext segment is necessary. For AESV3 – the most recent encryption algorithm – this plaintext is always given by the Perms entry. For older versions, known plaintext from the object to be exfiltrated is necessary.
Exfiltration channel: One of the interactive features: PDF Forms or Hyperlinks.

These requirements differ from those of the direct exfiltration attacks, because the attacks are applied "through" the encryption layer and not outside of it.

Exfiltration via PDF Forms (B1)

As described above, PDF allows the submission of string and stream objects to a web server. This can be used in conjunction with CBC gadgets to leak the plaintext to an attacker-controlled server, even if partial encryption is not allowed.

A CBC gadget constructed from the known plaintext can be used as the submission URL, as shown in the example on the left side. The construction of this particular URL gadget is challenging. As PDF encryption uses PKCS#5 padding, constructing the URL using a single gadget from the known Perms plaintext is difficult, as the last 4 bytes that would need to contain the padding are unknown.

However, we identified two techniques to solve this. On the one hand, we can take the last block of an unknown ciphertext and append it to our constructed URL, essentially reusing the correct PKCS#5 padding of the unknown plaintext. Unfortunately, this would introduce 20 bytes of random data from the gadgeting process and up to 15 bytes of the unknown plaintext to the end of our URL.

On the other hand, the PDF standard allows the execution of multiple OpenActions in a document, allowing us to essentially guess the last padding byte of the Perms value. This is possible by iterating over all 256 possible values of the last plaintext byte to get 0x01, resulting in a URL with as little random data as possible (3 bytes). As a limitation, if one of the 3 random bytes contains special characters, the form submission URL might break.
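The guessing step can be sketched at the plaintext level, since flipping a byte in the preceding ciphertext block flips the same byte in the decrypted block. The following is a simplified illustration under stated assumptions: the three "random" bytes and the unknown last byte are made-up values, and the padding check is a generic PKCS#5/PKCS#7 validator rather than any viewer's actual code. Each mask would correspond to one candidate OpenAction.

```python
BLOCK = 16


def padding_is_valid(block: bytes) -> bool:
    """Generic PKCS#5/PKCS#7 check: last n bytes must all equal n."""
    n = block[-1]
    return 1 <= n <= BLOCK and block[-n:] == bytes([n]) * n


# 12 chosen bytes, 3 bytes the attacker cannot control, and one unknown
# last byte (all sample values are hypothetical).
unknown_last_byte = 0x5A
block = b"http://p.df/" + b"\x11\x22\x33" + bytes([unknown_last_byte])

# Try all 256 masks on the last byte. Exactly one of them turns the
# unknown byte into 0x01, which is valid one-byte padding.
valid_masks = [
    mask
    for mask in range(256)
    if padding_is_valid(block[:-1] + bytes([block[-1] ^ mask]))
]

winning = block[:-1] + bytes([block[-1] ^ valid_masks[0]])
url_bytes = winning[:-1]  # padding stripped: URL plus 3 uncontrolled bytes
print(valid_masks, url_bytes)
```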

Exfiltration via Hyperlinks (B2)

Using CBC gadgets, encrypted plaintext can be prefixed with one or more chosen plaintext blocks. An attacker can construct URLs in the encrypted PDF document that contain the plaintext to exfiltrate. This attack is similar to the exfiltration hyperlink attack (A2). However, it does not require the setting of a "base" URI in plaintext to achieve exfiltration.

The same limitations described for direct exfiltration based on links (A2) apply. Additionally, the constructed URL contains random bytes from the gadgeting process, which may prevent the exfiltration in some cases.

Exfiltration via Half-Open Object Streams (B3)

While CBC gadgets are generally restricted to the block size of the underlying block cipher – and more specifically the length of the known plaintext, in this case, 12 bytes – longer chosen plaintexts can be constructed using compression. Deflate compression, which is available as a filter for PDF streams, allows writing both uncompressed and compressed segments into the same stream. The compressed segments can reference back to the uncompressed segments and achieve the repetition of byte strings from these segments. These backreferences allow us to construct longer continuous plaintext blocks than CBC gadgets would typically allow for. Naturally, the first uncompressed occurrence of a byte string still appears in the decompressed result. Additionally, if the compressed stream is constructed using gadgets, each gadget generates 20 random bytes that appear in the decompressed stream. A non-trivial obstacle is to keep the PDF viewer from interpreting these fragments in the decompressed stream. While hiding the fragments in comments is possible, PDF comments are single-line and are thus susceptible to newline characters in the random bytes. Therefore, in reality, the length of constructed compressed plaintexts is limited.
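The mix of uncompressed and compressed segments can be shown with a hand-built raw Deflate stream. This is a minimal sketch assuming the block layout from the Deflate specification (RFC 1951): a stored block carries the literal bytes, and a following fixed-Huffman block contains only a backreference that repeats them. The helper names are hypothetical, and the backreference helper only covers lengths 3 to 10 and distances 1 to 4, which need no extra bits.

```python
import struct
import zlib


class BitWriter:
    """Collects bits in Deflate order (least significant bit of each byte first)."""

    def __init__(self):
        self.bits = []

    def write_lsb(self, value, nbits):
        # Header fields are written starting with the least significant bit.
        for i in range(nbits):
            self.bits.append((value >> i) & 1)

    def write_code(self, code, nbits):
        # Huffman codes are written starting with the most significant bit.
        for i in reversed(range(nbits)):
            self.bits.append((code >> i) & 1)

    def to_bytes(self):
        out = bytearray()
        for i in range(0, len(self.bits), 8):
            chunk = self.bits[i:i + 8]
            out.append(sum(bit << j for j, bit in enumerate(chunk)))
        return bytes(out)


def stored_block(data: bytes, final: bool = False) -> bytes:
    # BFINAL bit, BTYPE=00, padding to the byte boundary, then LEN and ~LEN.
    header = bytes([1 if final else 0])
    return header + struct.pack("<HH", len(data), len(data) ^ 0xFFFF) + data


def backref_block(length: int, distance: int, final: bool = True) -> bytes:
    assert 3 <= length <= 10 and 1 <= distance <= 4
    w = BitWriter()
    w.write_lsb(1 if final else 0, 1)  # BFINAL
    w.write_lsb(1, 2)                  # BTYPE=01: fixed Huffman codes
    w.write_code(1 + (length - 3), 7)  # length symbols 257..264 (7-bit codes)
    w.write_code(distance - 1, 5)      # distance symbols 0..3 (5-bit codes)
    w.write_code(0, 7)                 # symbol 256: end of block
    return w.to_bytes()


# Uncompressed segment followed by a compressed segment that copies it.
raw = stored_block(b"p.df") + backref_block(length=4, distance=4)
result = zlib.decompress(raw, wbits=-15)  # negative wbits: raw Deflate
print(result)
```

The stored bytes appear once in the output and once more through the backreference, which matches the observation above that the first uncompressed occurrence still shows up in the decompressed result.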

To deal with this caveat, an attacker can use ObjectStreams which allow the storage of arbitrary objects inside a stream. The attacker uses an object stream to define new objects using CBC gadgets. An object stream always starts with a header of space-separated integers which define the object number and the byte offset of the object inside the stream. The dictionary of an object stream contains the key First which defines the byte offset of the first object inside the stream. An attacker can use this value to create a comment of arbitrary size by setting it to the first byte after their comment.
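The role of the First entry can be illustrated with a simplified object stream reader. This sketch assumes an already decrypted and decompressed stream, and it takes the number of objects and the First offset as plain arguments. The function name, the object numbers, and the junk bytes are hypothetical. Everything between the end of the integer header and the First offset is never interpreted, which is the space an attacker can use to hide gadget residue.

```python
def parse_object_stream(data: bytes, n: int, first: int) -> dict:
    """Split a decoded object stream into {object number: raw object bytes}."""
    # The header holds n pairs of integers: object number and byte offset.
    tokens = data[:first].split()[: 2 * n]
    numbers = [int(t) for t in tokens]
    pairs = list(zip(numbers[0::2], numbers[1::2]))

    # Offsets are relative to the first object, located at byte `first`.
    body = data[first:]
    objects = {}
    for i, (obj_num, offset) in enumerate(pairs):
        end = pairs[i + 1][1] if i + 1 < len(pairs) else len(body)
        objects[obj_num] = body[offset:end]
    return objects


header = b"11 0 12 5 "
junk = b"%\x8f\n\x02 leftover gadget bytes\n"  # skipped because of First
body = b"(abc)<< /K 1 >>"
stream = header + junk + body

objs = parse_object_stream(stream, n=2, first=len(header) + len(junk))
print(objs)
```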

Using compression has the additional advantage that compressed, encrypted plaintexts from the original document can be embedded into the modified object. As PDF applications often create compressed streams, these can be incorporated into the attacker-created compressed object and will therefore be decompressed by the PDF applications. This is a significant advantage over leaking the compressed plaintexts without decompression as the compressed bytes are often not URL-encoded correctly (or at all) by the PDF applications, leading to incomplete or incomprehensible plaintexts. However, due to the inner workings of the deflate algorithms, a complete compressed plaintext can only be prefixed with new segments, but not postfixed. Therefore, a string created using this technique cannot be terminated using a closing bracket, leading to a half-open string. This is not a standard compliant construction, and PDF viewers should not accept it. However, a majority of PDF viewers accept it anyway.

Evaluation

During our security analysis, we identified two standard compliant attack classes which break the confidentiality of encrypted PDF files. Our evaluation shows that among 27 widely-used PDF viewers, all of them are vulnerable to at least one of those attacks, including popular software such as Adobe Acrobat, Foxit Reader, Evince, Okular, Chrome, and Firefox.

You can find the detailed results of our evaluation here.

What is the root cause of the problem?

First, many data formats allow encrypting only parts of the content (e.g., XML, S/MIME, PDF). This encryption flexibility is difficult to handle and allows an attacker to include their own content, which can lead to exfiltration channels.

Second, when it comes to encryption, AES-CBC – or encryption without integrity protection in general – is still widely supported. Even the latest PDF 2.0 specification released in 2017 still relies on it. This must be fixed in future PDF specifications and any other format encryption standard, without enabling backward compatibility that would re-enable CBC gadgets.

A positive example is the JSON Web Encryption standard, which learned from the CBC attacks on XML and does not support any encryption algorithm without integrity protection.
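What integrity protection buys can be sketched with a generic encrypt-then-MAC check. This is an illustration of the principle only, not the construction used by JSON Web Encryption or by any PDF specification. The function names and the sample bytes are hypothetical, and the ciphertext is treated as an opaque byte string.

```python
import hashlib
import hmac
import os

TAG_LEN = 32  # length of an HMAC-SHA256 tag in bytes


def seal(mac_key: bytes, ciphertext: bytes) -> bytes:
    """Append an authentication tag computed over the ciphertext."""
    tag = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    return ciphertext + tag


def open_checked(mac_key: bytes, blob: bytes) -> bytes:
    """Return the ciphertext only if its tag verifies; decrypt afterwards."""
    ciphertext, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("ciphertext was modified")
    return ciphertext


mac_key = os.urandom(32)
blob = seal(mac_key, b"opaque ciphertext bytes")

# Flipping a single bit, as a CBC gadget would, is detected before decryption.
tampered = bytes([blob[0] ^ 0x01]) + blob[1:]
try:
    open_checked(mac_key, tampered)
    detected = False
except ValueError:
    detected = True
print(detected)
```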

Authors of this Post

Jens Müller
Fabian Ising
Vladislav Mladenov
Christian Mainka
Sebastian Schinzel
Jörg Schwenk

Acknowledgements

Many thanks to the CERT-Bund team for the great support during the responsible disclosure process.

Related articles

A list of must-read articles on the OWASP API Security Project:

10/3/19, APISecurity.IO (UMV: 1,510): Issue 51: Gartner Releases Full Report on API Security
10/2/19, ADT Magazine (UMV: 117,500): API Security Project Identifies Top 10 Vulnerabilities
9/26/19, Dark Reading (UMV: 57,800): Why You Need to Think About API Security
9/24/19, The Daily Swig (UMV: 30,500): OWASP Reveals Top 10 Security Threats Facing API Ecosystem
9/20/19, Security Boulevard (UMV: 29,100): New OWASP List Highlights API Security Holes
9/13/19, Security Boulevard (UMV: 29,100): Why You Need to Be Thinking About API Security
9/13/19, CyberWire (UMV: 49,380): Daily Briefing: OWASP API Security Project
9/12/19, Dark Reading (UMV: 57,800): APIs Get Their Own Top 10 Security list

Also included in Dark Reading's weekly newsletter on 9/19/19

All commands (A-Z) for Kali Linux here:
A
apropos Search Help manual pages (man -k)
apt-get Search for and install software packages (Debian/Ubuntu)
aptitude Search for and install software packages (Debian/Ubuntu)
aspell Spell Checker
awk Find and Replace text, database sort/validate/index
B
basename Strip directory and suffix from filenames
bash GNU Bourne-Again SHell
bc Arbitrary precision calculator language
bg Send to background
break Exit from a loop •
builtin Run a shell builtin
bzip2 Compress or decompress named file(s)
C
cal Display a calendar
case Conditionally perform a command
cat Concatenate and print (display) the content of files
cd Change Directory
cfdisk Partition table manipulator for Linux
chgrp Change group ownership
chmod Change access permissions
chown Change file owner and group
chroot Run a command with a different root directory
chkconfig System services (runlevel)
cksum Print CRC checksum and byte counts
clear Clear terminal screen
cmp Compare two files
comm Compare two sorted files line by line
command Run a command – ignoring shell functions •
continue Resume the next iteration of a loop •
cp Copy one or more files to another location
cron Daemon to execute scheduled commands
crontab Schedule a command to run at a later time
csplit Split a file into context-determined pieces
cut Divide a file into several parts
D
date Display or change the date & time
dc Desk Calculator
dd Convert and copy a file, write disk headers, boot records
ddrescue Data recovery tool
declare Declare variables and give them attributes •
df Display free disk space
diff Display the differences between two files
diff3 Show differences among three files
dig DNS lookup
dir Briefly list directory contents
dircolors Colour setup for `ls'
dirname Convert a full pathname to just a path
dirs Display list of remembered directories
dmesg Print kernel & driver messages
du Estimate file space usage
E
echo Display message on screen •
egrep Search file(s) for lines that match an extended expression
eject Eject removable media
enable Enable and disable builtin shell commands •
env Environment variables
ethtool Ethernet card settings
eval Evaluate several commands/arguments
exec Execute a command
exit Exit the shell
expect Automate arbitrary applications accessed over a terminal
expand Convert tabs to spaces
export Set an environment variable
expr Evaluate expressions
F
false Do nothing, unsuccessfully
fdformat Low-level format a floppy disk
fdisk Partition table manipulator for Linux
fg Send job to foreground
fgrep Search file(s) for lines that match a fixed string
file Determine file type
find Search for files that meet a desired criteria
fmt Reformat paragraph text
fold Wrap text to fit a specified width.
for Expand words, and execute commands
format Format disks or tapes
free Display memory usage
fsck File system consistency check and repair
ftp File Transfer Protocol
function Define Function Macros
fuser Identify/kill the process that is accessing a file
G
gawk Find and Replace text within file(s)
getopts Parse positional parameters
grep Search file(s) for lines that match a given pattern
groupadd Add a user security group
groupdel Delete a group
groupmod Modify a group
groups Print group names a user is in
gzip Compress or decompress named file(s)
H
hash Remember the full pathname of a name argument
head Output the first part of file(s)
help Display help for a built-in command •
history Command History
hostname Print or set system name
I
iconv Convert the character set of a file
id Print user and group id's
if Conditionally perform a command
ifconfig Configure a network interface
ifdown Stop a network interface
ifup Start a network interface up
import Capture an X server screen and save the image to file
install Copy files and set attributes
J
jobs List active jobs •
join Join lines on a common field
K
kill Stop a process from running
killall Kill processes by name
L
less Display output one screen at a time
let Perform arithmetic on shell variables •
ln Create a symbolic link to a file
local Create variables •
locate Find files
logname Print current login name
logout Exit a login shell •
look Display lines beginning with a given string
lpc Line printer control program
lpr Off line print
lprint Print a file
lprintd Abort a print job
lprintq List the print queue
lprm Remove jobs from the print queue
ls List information about file(s)
lsof List open files
M
make Recompile a group of programs
man Help manual
mkdir Create new folder(s)
mkfifo Make FIFOs (named pipes)
mkisofs Create an hybrid ISO9660/JOLIET/HFS filesystem
mknod Make block or character special files
more Display output one screen at a time
mount Mount a file system
mtools Manipulate MS-DOS files
mtr Network diagnostics (traceroute/ping)
mv Move or rename files or directories
mmv Mass Move and rename (files)
N
netstat Networking information
nice Set the priority of a command or job
nl Number lines and write files
nohup Run a command immune to hangups
notify-send Send desktop notifications
nslookup Query Internet name servers interactively
O
open Open a file in its default application
op Operator access
P
passwd Modify a user password
paste Merge lines of files
pathchk Check file name portability
ping Test a network connection
pkill Stop processes from running
popd Restore the previous value of the current directory
pr Prepare files for printing
printcap Printer capability database
printenv Print environment variables
printf Format and print data •
ps Process status
pushd Save and then change the current directory
pwd Print Working Directory
Q
quota Display disk usage and limits
quotacheck Scan a file system for disk usage
quotactl Set disk quotas
R
ram ram disk device
rcp Copy files between two machines
read Read a line from standard input •
readarray Read from stdin into an array variable •
readonly Mark variables/functions as readonly
reboot Reboot the system
rename Rename files
renice Alter priority of running processes
remsync Synchronize remote files via email
return Exit a shell function
rev Reverse lines of a file
rm Remove files
rmdir Remove folder(s)
rsync Remote file copy (Synchronize file trees)
S
screen Multiplex terminal, run remote shells via ssh
scp Secure copy (remote file copy)
sdiff Merge two files interactively
sed Stream Editor
select Accept keyboard input
seq Print numeric sequences
set Manipulate shell variables and functions
sftp Secure File Transfer Program
shift Shift positional parameters
shopt Shell Options
shutdown Shutdown or restart linux
sleep Delay for a specified time
slocate Find files
sort Sort text files
source Run commands from a file `.'
split Split a file into fixed-size pieces
ssh Secure Shell client (remote login program)
strace Trace system calls and signals
su Substitute user identity
sudo Execute a command as another user
sum Print a checksum for a file
suspend Suspend execution of this shell •
symlink Make a new name for a file
sync Synchronize data on disk with memory
T
tail Output the last part of file
tar Tape ARchiver
tee Redirect output to multiple files
test Evaluate a conditional expression
time Measure Program running time
times User and system times
touch Change file timestamps
top List processes running on the system
traceroute Trace Route to Host
trap Run a command when a signal is set(bourne)
tr Translate, squeeze, and/or delete characters
true Do nothing, successfully
tsort Topological sort
tty Print filename of terminal on stdin
type Describe a command •
U
ulimit Limit user resources •
umask Users file creation mask
umount Unmount a device
unalias Remove an alias •
uname Print system information
unexpand Convert spaces to tabs
uniq Uniquify files
units Convert units from one scale to another
unset Remove variable or function names
unshar Unpack shell archive scripts
until Execute commands (until error)
uptime Show uptime
useradd Create new user account
userdel Delete a user account
usermod Modify user account
users List users currently logged in
uuencode Encode a binary file
uudecode Decode a file created by uuencode
V
v Verbosely list directory contents (`ls -l -b')
vdir Verbosely list directory contents (`ls -l -b')
vi Text Editor
vmstat Report virtual memory statistics
W
wait Wait for a process to complete •
watch Execute/display a program periodically
wc Print byte, word, and line counts
whereis Search the user's $path, man pages and source files for a program
which Search the user's $path for a program file
while Execute commands
who Print all usernames currently logged in
whoami Print the current user id and name (`id -un')
wget Retrieve web pages or files via HTTP, HTTPS or FTP
write Send a message to another user
X
xargs Execute utility, passing constructed argument list(s)
xdg-open Open a file or URL in the user's preferred application.
Y
yes Print a string until interrupted
. Run a command script in the current shell
!! Run the last command again