6 Free Local Tools for Analyzing Malicious PDF Files
Malicious PDF files are frequently used as part of targeted and mass-scale computer attacks. Being able to analyze PDFs to understand the associated threats is an increasingly important skill for security incident responders and digital forensic analysts. Here are 6 free tools you can install on your system and use for this purpose.
Analyzing a PDF file involves examining, decoding and extracting the contents of suspicious PDF objects that may be used to exploit a vulnerability in Adobe Reader and execute a malicious payload. A growing number of tools are designed to assist with this process, including the following:
- PDF Tools by Didier Stevens is the classic toolkit that established the foundation for our understanding of the PDF analysis process. It includes pdfid.py to quickly scan the PDF for risky objects and, most usefully, pdf-parser.py to examine their contents.
- PDF Stream Dumper by “Dave” is a powerful Windows program that combines a number of PDF analysis tools under a unified GUI. It makes it possible to explore PDF contents, decode object contents, deobfuscate JavaScript, examine shellcode, etc.
- Jsunpack-n by Blake Hartstein is a command-line tool that emulates a browser when analyzing malicious websites. In addition to supporting numerous other features, the tool includes the pdf.py script for extracting JavaScript embedded in PDF files.
- Peepdf by Jose Miguel Esparza is an interactive command-line tool that allows users to explore and analyze contents of PDF files. Its features include examining the file’s structure, analyzing object contents, as well as decoding embedded JavaScript and shellcode.
- Origami by Guillaume Delugré and Fred Raynal is a Ruby framework for parsing, analyzing and creating PDF files. In addition to providing programmers with the ability to automate PDF interactions, it includes the pdfscan.rb script to scan the PDF for risky objects and the extractjs.rb script to extract JavaScript embedded in the file.
- MalObjClass by Brandon Dixon provides a Python framework for building a JSON object that represents components of a PDF file. This capability allows programmers to easily parse, examine and decode malicious PDF objects. The tool even includes the ability to scan the file with VirusTotal.
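To make the "scan for risky objects, then decode their contents" workflow more concrete, here is a minimal Python sketch of the idea. It is not how pdfid.py, pdf-parser.py or any of the tools above are implemented. The function names, the list of names to look for and the regular expressions are my own assumptions, chosen for illustration, and the stream decoder only handles a single FlateDecode filter.

```python
import re
import zlib

# Illustrative list of PDF names that often deserve a closer look during triage.
# This is an assumption for the sketch, not the exact set any specific tool checks.
RISKY_NAMES = [b"/JavaScript", b"/JS", b"/OpenAction", b"/AA", b"/Launch", b"/EmbeddedFile"]

# PDF names may hide characters behind #xx hex escapes (e.g. /J#61vaScript).
HEX_ESCAPE_RE = re.compile(rb"#([0-9A-Fa-f]{2})")

# Raw bytes between "stream" and "endstream"; the lookbehind skips the
# "stream" that is part of the word "endstream".
STREAM_RE = re.compile(rb"(?<!end)stream\r?\n(.*?)endstream", re.DOTALL)


def count_risky_names(data: bytes) -> dict:
    """Count occurrences of each name in RISKY_NAMES within raw PDF bytes."""
    # Undo #xx escapes so obfuscated names are counted too. Applying this to the
    # whole file is crude (it also touches stream data) but fine for a first pass.
    normalized = HEX_ESCAPE_RE.sub(lambda m: bytes([int(m.group(1), 16)]), data)
    counts = {}
    for name in RISKY_NAMES:
        # Require a non-alphanumeric character (or end of data) after the name
        # so that a longer name starting with the same letters is not counted.
        pattern = re.escape(name) + rb"(?![A-Za-z0-9])"
        counts[name.decode("ascii")] = len(re.findall(pattern, normalized))
    return counts


def inflate_streams(data: bytes) -> list:
    """Return the decompressed contents of streams that inflate with zlib."""
    decoded = []
    for match in STREAM_RE.finditer(data):
        try:
            # decompressobj tolerates the end-of-line bytes left before "endstream".
            decoded.append(zlib.decompressobj().decompress(match.group(1)))
        except zlib.error:
            # Not Flate-compressed, or encoded with another or chained filter.
            continue
    return decoded
```

To try it, read a sample as bytes in an isolated analysis environment, pass the bytes to count_risky_names() to decide whether the file deserves a closer look, then inspect the output of inflate_streams() for embedded JavaScript. Real samples often use chained filters, object streams or encryption, which is where the dedicated tools listed above earn their keep.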
If you know of other tools that work well for analyzing malicious PDF files and that can be installed locally, please leave a comment.
My other articles related to PDF file analysis:
- Analyzing Suspicious PDF Files With PDF Stream Dumper
- How to Extract Flash Objects from Malicious PDF Files
- Analyzing Malicious Documents Cheat Sheet
- 6 Hex Editors for Malware Analysis
If you’d like to learn how to analyze malicious PDFs, check out the Reverse-Engineering Malware course I teach at SANS Institute.
Update: For another excellent free PDF analysis tool, take a look at my follow-up post Analyzing Suspicious PDF Files With Peepdf.
Updated May 10, 2011
Lenny Zeltser
About the Author
I transform ideas into successful outcomes, building on my 25 years of experience in cybersecurity. As the CISO at Axonius, I lead the security program to earn customers' trust. I'm also a Faculty Fellow at SANS Institute, where I author and deliver training for incident responders. The diversity of cybersecurity roles I've held over the years and the expertise I've accumulated allow me to create practical solutions that drive business growth.