Introducing gfx2gfx-pdftext, a fork of gfx2gfx designed to convert SWF files to PDF files while preserving text! I am not aware of any other freely available tool that can do this.

The original gfx2gfx from SWFTools unfortunately converts text to paths. That is an improvement on other tools, which simply convert everything to bitmaps (eugh), but it still makes the resulting PDF file several times larger than the original SWF file, particularly for text-heavy files. Not any more!

To celebrate, here's the 2016 version of my ‘converting book-style SWFs to PDFs’ guide:

0. Preamble

So you've downloaded (legally?) digital copies of a textbook, but they're individual pages in SWF format. (Okay, it needn't be this specific.) What now?

Just to be clear before we continue, the kind of SWF files I'm talking about are static, single-frame SWF files consisting of bitmap images and text. While swfrender from SWFTools could be used to convert these into PNGs/PDFs, the text wouldn't be scalable. This guide will take these static SWF files and turn them into PDF files with scalable text.

This process is applicable, for example, to some Pearson eBooks, which are stored by the Pearson eBook client as ZIP files containing SWFs. Oxford textbook ORB files are ZIP files containing lots of ‘GAU’ files; renaming these GAU files so that they have an SWF extension allows this process to work on them too.

1. gfx2gfx-pdftext

This guide will be using the undocumented tool gfx2gfx from SWFTools. As this tool is undocumented, it must be compiled from source.

The original gfx2gfx causes text to be converted to PDF paths. While this works, it means you wouldn't be able to select or search the text. Thus I've created a fork, gfx2gfx-pdftext, which causes text to be saved as text, resulting in smaller and more user-friendly PDF files. I've also made the necessary changes to compile gfx2gfx, to save you time.

  1. Install the dependencies pdflib-lite, giflib, freeglut, lame, t1lib and fontconfig. The details in the Requirements section of these instructions may be helpful.

  2. Open a terminal window and cd to a directory of your choice.

  3. Run git clone https://github.com/RunasSudo/gfx2gfx-pdftext.git.

  4. From the folder with the configure file, run ./configure then make. If make stops with an error, running make again sometimes fixes it.

  5. If all went well, there should be a gfx2gfx executable file inside the src/swftools/src directory. Copy it to the folder where your SWF files are.
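Steps 2 to 5 can be strung together roughly like this. It is only a sketch: retry_once and build_gfx2gfx are names made up for this example, and because the exact location of the configure file and of the finished binary inside the clone may differ on your machine, the script searches for both instead of hard-coding a path.

```shell
#!/usr/bin/env bash
# Run a command; if it fails, run it one more time
# (mirrors "run make, and if it errors, run make again").
retry_once() {
    "$@" || "$@"
}

# Hypothetical helper: clone, build, and copy gfx2gfx into the folder given as $1.
build_gfx2gfx() {
    local dest="$1" cfg bin
    git clone https://github.com/RunasSudo/gfx2gfx-pdftext.git || return 1

    # Assumption: the configure file sits within a few levels of the clone root.
    cfg="$(find gfx2gfx-pdftext -maxdepth 3 -type f -name configure | head -n 1)"
    [ -n "$cfg" ] || { echo "no configure file found" >&2; return 1; }

    # Build in a subshell so the caller's working directory is untouched.
    (cd "$(dirname "$cfg")" && ./configure && retry_once make) || return 1

    # Look for the compiled gfx2gfx executable rather than assuming its path.
    bin="$(find gfx2gfx-pdftext -type f -name gfx2gfx | head -n 1)"
    [ -n "$bin" ] || { echo "gfx2gfx was not built" >&2; return 1; }
    cp "$bin" "$dest"/
}

# Usage (run from a directory of your choice):
#   build_gfx2gfx /path/to/your/swf/folder
```

Nothing is downloaded or built until you call build_gfx2gfx yourself, so the file is safe to source and inspect first.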

If you're on Arch Linux, I also have a PKGBUILD.

2. Convert Time!

The complex part is over!

./gfx2gfx page_1.swf -o page_1.pdf

It's that easy! Or, if for loops are your thing,

for i in {1..350}; do ./gfx2gfx "chapter0/page_$i.swf" -o "pdf/page_$i.pdf"; done

gfx2gfx imposes a maximum DPI of 72 by default. If you think this is ridiculous, you can add an -r 300 parameter to raise the DPI to 300, or set -r 0 to disable the restriction.
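If you'd rather not hard-code the page count, the loop and the -r option can be combined into something like the sketch below. GFX2GFX, DPI and convert_all are names invented for this example, and putting -r at the end of the command line is my assumption about where the parameter goes.

```shell
#!/usr/bin/env bash
# Hypothetical settings for this sketch; override them before calling convert_all.
GFX2GFX="${GFX2GFX:-./gfx2gfx}"   # path to the compiled gfx2gfx binary
DPI="${DPI:-300}"                 # value passed to -r; 0 disables the limit

# Convert every .swf in folder $1 to a .pdf of the same name in folder $2.
convert_all() {
    local in_dir="$1" out_dir="$2" swf name
    mkdir -p "$out_dir"
    for swf in "$in_dir"/*.swf; do
        [ -e "$swf" ] || continue            # the glob matched nothing
        name="$(basename "$swf" .swf)"
        "$GFX2GFX" "$swf" -o "$out_dir/$name.pdf" -r "$DPI" \
            || echo "failed: $swf" >&2       # report and carry on with the rest
    done
}

# Usage:
#   convert_all chapter0 pdf
```

Using a glob instead of {1..350} means gaps in the page numbering don't cause errors, and a failure on one page doesn't stop the others from converting.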

Note that if the extension is not SWF (for example, for Oxford GAU files), you must first rename the files to SWF files, or else gfx2gfx will become confused:

for i in {1..350}; do mv "$i.gau" "$i.swf" && ./gfx2gfx "$i.swf" -o "$i.pdf"; done
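If you'd like to do the renaming as a separate pass first, something along these lines should work. rename_gau is a made-up helper name, and refusing to overwrite an existing SWF file is my own precaution rather than anything gfx2gfx requires.

```shell
#!/usr/bin/env bash
# Rename every .gau file in folder $1 to .swf, skipping names already taken.
rename_gau() {
    local dir="$1" f target
    for f in "$dir"/*.gau; do
        [ -e "$f" ] || continue              # the glob matched nothing
        target="${f%.gau}.swf"               # strip .gau, append .swf
        if [ -e "$target" ]; then
            echo "skipping $f: $target already exists" >&2
        else
            mv -- "$f" "$target"
        fi
    done
}

# Usage:
#   rename_gau .
```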