How to scan a PDF for malware? [closed]

Can anyone suggest an automated tool to scan a PDF file to determine whether it might contain malware or other "bad stuff"? Or, alternatively, one that assigns a risk level to the PDF? I would prefer a free tool. It must be suitable for programmatic use, e.g., from the Unix command line, so that it is possible to scan PDFs automatically and take action based on the result. A web-based solution might also be OK if it is scriptable.

asked Apr 5, 2011 at 6:52

5 Answers

Didier Stevens has provided two open-source, Python-based scripts to perform PDF malware analysis. There are a few others that I will also highlight.

The primary ones you want to run first are PDFiD (available along with Didier's other PDF Tools) and Pyew.

Here is an article on how to run pdfid.py and see the expected results; here is another for pyew.

Finally, after identifying possible /JS, /JavaScript, /AA, /OpenAction, and /AcroForm entries, you will want to dump those objects, filter the JavaScript, and produce raw output. This is possible with pdf-parser.py.
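To make the triage step concrete, here is a minimal Python sketch of the kind of keyword counting described above. It is not pdfid.py's actual implementation, just a simplified stand-in: the function name `count_suspicious_names` and the keyword list are assumptions chosen for illustration. It decodes `#xx` hex escapes in PDF names (a common obfuscation trick) but does not look inside compressed streams or object streams, so a clean result here proves nothing on its own.

```python
import re

# Assumed list of names worth flagging during triage (illustrative only).
SUSPICIOUS = ("/JS", "/JavaScript", "/AA", "/OpenAction", "/AcroForm")

# A PDF name token: a slash followed by characters that are neither
# whitespace nor PDF delimiters.
_NAME = re.compile(rb"/[^\s/<>\[\]()%{}]+")
# A "#xx" hex escape inside a name, e.g. /J#61vaScript for /JavaScript.
_HEX = re.compile(rb"#([0-9A-Fa-f]{2})")


def count_suspicious_names(data: bytes) -> dict:
    """Count occurrences of each suspicious name in raw PDF bytes."""
    counts = {name: 0 for name in SUSPICIOUS}
    for token in _NAME.findall(data):
        # Undo hex escapes so obfuscated names are still matched.
        decoded = _HEX.sub(lambda m: bytes([int(m.group(1), 16)]), token)
        name = decoded.decode("latin-1")
        if name in counts:
            counts[name] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python count_names.py suspect.pdf
    with open(sys.argv[1], "rb") as fh:
        for name, n in count_suspicious_names(fh.read()).items():
            print(f"{name:12} {n}")
```

Any nonzero count is only a reason to look closer with pdf-parser.py, not a verdict.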

Additionally, Brandon Dixon maintains some extremely elite blog posts on his research with PDF malware, including a post about scoring PDFs based on malicious filters just like you describe.
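As a rough illustration of what scoring could look like in a script, here is a small Python sketch that turns keyword counts into a risk level. The weights, thresholds, and function names are all hypothetical and are not taken from Brandon Dixon's posts or any other published scheme; treat it as a starting point to tune against your own samples.

```python
# Hypothetical weights per flagged name; illustrative values only.
WEIGHTS = {
    "/JS": 3,
    "/JavaScript": 3,
    "/OpenAction": 2,
    "/AA": 2,
    "/AcroForm": 1,
}


def risk_score(counts: dict) -> int:
    """Sum the weight of every flagged name that appears at least once."""
    return sum(WEIGHTS.get(name, 0) for name, n in counts.items() if n > 0)


def risk_level(score: int, medium: int = 3, high: int = 6) -> str:
    """Map a numeric score to a coarse label (thresholds are assumptions)."""
    if score >= high:
        return "high"
    if score >= medium:
        return "medium"
    return "low"


if __name__ == "__main__":
    # Example counts, e.g. as produced by a keyword scan of one PDF.
    example = {"/JS": 1, "/JavaScript": 2, "/OpenAction": 1}
    score = risk_score(example)
    print(score, risk_level(score))
```

From the command line, a wrapper could exit with a nonzero status when the level is "high" so that a shell script can quarantine the file.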

I, personally, run all of these tools!

answered Apr 5, 2011 at 7:12

+1 good reply, Didier Stevens is the main focus when looking at PDF based malware. He also has tools such as 'make-pdf-javascript.py' which enable you to create proof of concept malicious PDFs to understand their structure more fully and why they are an issue.

Commented Apr 5, 2011 at 7:55

Here is a newer tool with newer techniques from early 2016 -- honeynet.org/node/1304

Commented Feb 19, 2016 at 16:22

Updated techniques -- itsjack.cc/blog/2017/08/…

Commented Aug 22, 2017 at 20:25

@atdre Using pdf-parser.py, how to find out if ObjStms are malicious?

Commented Apr 22, 2018 at 16:29

There are a few starting points here -- zeltser.com/analyzing-malicious-documents

Commented Apr 22, 2018 at 16:49

Just came across this very recent blog post by Lenny Zeltser, which is pretty much right on the money:

6 Free Tools for analyzing Malicious PDF Files

The tools he mentions are:

There are details about each one and links to other pdf analysis documents at the blog post.

answered May 14, 2011 at 0:26

For the past few months I have been doing research on PDF analysis and how it could be improved. While doing the research I found myself writing tools and scripts to help me get the job done, and decided it was time to put something more useful together. PDF X-RAY is a static analysis tool that allows you to analyze PDF files through a web interface or API. The tool uses multiple open source tools and custom code to take a PDF and turn it into a sharable format. The goal with this tool is to centralize PDF analysis and begin sharing comments on files that are seen.

PDF X-RAY differs from other tools because it doesn't focus on a single file. Instead, it compares the file you upload against thousands of malicious PDF files in our repository. These checks look for data structures in the PDF you upload that are similar to those in files already reviewed by analysts. Using this feature we can begin to see code shared among malicious files, or trends due to malware authors' coding styles. The tool is still in beta, but I wanted to release it to the public to see what users thought. In my opinion the API is the most useful part, as you can begin to integrate rich PDF analysis into other tools and services at little or no cost.

Current features include: