Snippet: Converting SVG with embedded CSS fonts to PDF
This Python snippet can be used as a framework for converting SVG files with embedded CSS fonts (@font-face) to PDF while keeping the fonts embedded in the output.
import base64
import os
import re
import subprocess
import uuid

#from fontTools.ttLib import TTFont
#from io import BytesIO

# Directory that fontconfig scans for fonts; adjust for your own system
FONT_DIR = '/home/runassudo/.fonts/tmp'

os.makedirs('tmp', exist_ok=True)
os.makedirs(FONT_DIR, exist_ok=True)

with open('svgfile.svg', 'r') as f:
    svg = f.read()

# Generated TTF files, remembered so they can be deleted afterwards
font_files = []

# re.DOTALL lets a declaration span several lines
fonts = re.findall(r'@font-face \{.*?\}', svg, re.DOTALL)
for font in fonts:
    name = re.search(r'font-family: "(.*?)"', font).group(1)
    data = re.search(r'src: url\(data:font/woff;base64,(.*?)\);', font).group(1)
    payload = base64.b64decode(data.encode('utf-8'))

    woff_path = 'tmp/{}.woff'.format(name)
    ttf_path = '{}/{}.ttf'.format(FONT_DIR, name)
    with open(woff_path, 'wb') as outf:
        outf.write(payload)

    #fontf = BytesIO(payload)
    #tfont = TTFont(fontf)
    #tname = tfont['name'].names[1].string.decode('utf-8')

    # Generate a random name to bypass caching
    tname = str(uuid.uuid4())
    #print('Extracting font {}'.format(tname))

    # Drop the @font-face declaration and point the SVG at the renamed font
    svg = svg.replace(font, '')
    svg = svg.replace('"' + name + '"', '"' + tname + '"')

    # Convert to TTF
    # Cairo(?) has difficulty embedding OTF fonts
    subprocess.run(['fontforge', '-script', 'ffconvert.pe', woff_path, ttf_path, tname])
    font_files.append(ttf_path)
    os.remove(woff_path)

with open('tmp/svgfile.svg', 'w') as f:
    print(svg, file=f)

# -A is the PDF export option of Inkscape 0.92; Inkscape 1.x uses --export-filename instead
subprocess.run(['inkscape', '-A', 'svgfile.pdf', 'tmp/svgfile.svg'])

# Clean up the intermediate files
for font_file in font_files:
    os.remove(font_file)
os.remove('tmp/svgfile.svg')
The ffconvert.pe file has the following contents:
Open($1,1)
SetFontNames($3,$3,$3)
SetTTFName(0x409,3,$3)
Generate($2)
The script parses (very badly) the SVG file, looking for @font-face declarations. For each one, it extracts the embedded font (assumed here to be WOFF) and calls FontForge to produce a TTF version under ~/.fonts, giving the font a random name (to avoid caching issues). It then removes the @font-face declaration from the SVG and replaces references to the original font name with the random one. Finally, it performs the PDF conversion and deletes the intermediate files.
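The parsing and renaming step can be separated from the file handling, which makes it easier to try out on its own. The following is only a sketch under the same assumptions as the script (WOFF fonts embedded as base64 data URLs, double-quoted font names, no nested braces); the function name extract_embedded_fonts and its make_name parameter are made up for this illustration.

```python
import base64
import re
import uuid

# Assumed shape of a declaration: a single @font-face block without nested braces.
FONT_FACE_RE = re.compile(r'@font-face\s*\{.*?\}', re.DOTALL)
FAMILY_RE = re.compile(r'font-family:\s*"(.*?)"')
SRC_RE = re.compile(r'src:\s*url\(data:font/woff;base64,(.*?)\)', re.DOTALL)


def extract_embedded_fonts(svg, make_name=lambda: str(uuid.uuid4())):
    """Return (rewritten_svg, fonts).

    fonts maps each new random name to (original name, decoded WOFF bytes).
    """
    fonts = {}
    for block in FONT_FACE_RE.findall(svg):
        family = FAMILY_RE.search(block)
        src = SRC_RE.search(block)
        if family is None or src is None:
            continue  # not an embedded WOFF font, so leave the block untouched
        old_name = family.group(1)
        new_name = make_name()
        fonts[new_name] = (old_name, base64.b64decode(src.group(1)))
        # Remove the declaration, then point references at the new name
        svg = svg.replace(block, '')
        svg = svg.replace('"' + old_name + '"', '"' + new_name + '"')
    return svg, fonts
```

Each entry in fonts can then be written to disk and passed to FontForge as in the script above. Unlike the original loop, this version skips declarations it cannot parse instead of raising an AttributeError.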
Notes
Dear people from the future, here's what we've figured out so far…
- librsvg doesn't like hotswapping fonts for some reason.
- Inkscape (Cairo?) doesn't like embedding OTF fonts in PDF. The text renders properly in Inkscape, but none of it appears in the PDF.
- Something somewhere caches font glyphs, so if two different font files share the same font name, glyphs present in the second but not the first do not render properly. This is why the script randomises font names.
- It turns out you can override fontconfig on a per-application basis using the FONTCONFIG_FILE environment variable. This could provide a better way of loading fonts without cluttering the global font directory.
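
The FONTCONFIG_FILE idea from the last note might look something like the sketch below. It has not been tested against Inkscape; the helper names are invented for this example, and /etc/fonts/fonts.conf is assumed to be the system-wide configuration, which is typical on Linux but not guaranteed.

```python
import os
import subprocess
import tempfile
from xml.sax.saxutils import escape

# Assumption: location of the system-wide fontconfig configuration.
SYSTEM_FONTS_CONF = '/etc/fonts/fonts.conf'


def make_fontconfig(font_dir):
    """Return the text of a fontconfig file that adds font_dir to the system fonts."""
    return (
        '<?xml version="1.0"?>\n'
        '<!DOCTYPE fontconfig SYSTEM "fonts.dtd">\n'
        '<fontconfig>\n'
        '  <include ignore_missing="yes">{}</include>\n'
        '  <dir>{}</dir>\n'
        '</fontconfig>\n'
    ).format(escape(SYSTEM_FONTS_CONF), escape(font_dir))


def run_with_private_fonts(command, font_dir):
    """Run command with FONTCONFIG_FILE pointing at a throwaway configuration."""
    with tempfile.TemporaryDirectory() as work_dir:
        conf_path = os.path.join(work_dir, 'fonts.conf')
        with open(conf_path, 'w') as f:
            f.write(make_fontconfig(font_dir))
        # Only the child process sees the override; the global setup is untouched
        env = dict(os.environ, FONTCONFIG_FILE=conf_path)
        return subprocess.run(command, env=env)
```

With this, the converted TTF files could live in a temporary directory instead of ~/.fonts, and the Inkscape call would become something like run_with_private_fonts(['inkscape', '-A', 'svgfile.pdf', 'tmp/svgfile.svg'], font_dir). Whether this also sidesteps the glyph caching problem is unknown, so keeping the random font names is probably still wise.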