Background

This post concerns a DRM system used in an online ebook platform, released circa 2018. Users of the platform can purchase ebooks and either view them online, or download them for offline viewing using a proprietary Android/iOS app.

As usual, the particular DRM system is not relevant and will not be identified. (Last time, some clever readers were able to identify the system. This time I've made it extra challenging!)

Intercepting SSL requests

The app downloads the ebooks using SSL, so to intercept these requests we will need to be able to decrypt them. Android packet capture apps did not work well, so I used mitmproxy:

$ mitmdump --set block_global=false --set stream_large_bodies=1
Proxy server listening at http://*:8080

After configuring Android to use this as a proxy server, we can navigate to http://mitm.it and install the mitmproxy SSL certificate authority (CA).

Even after doing this, however, we face an additional hurdle: custom CAs are usually installed on Android as a ‘user’ credential rather than a ‘system’ credential, and since Android N (7.0), ‘user’ CAs are not trusted unless an app explicitly opts in.

There are some Xposed modules that claim to be able to bypass certificate validation, but I had very mixed success with these. Instead, I copied the ‘user’ CAs into the ‘system’ store through MagiskTrustUserCerts.

Intercepting the downloads

Now we can proceed to intercepting the download. Opening the ebook reader app and attempting to download a book, we see the following request:

10.0.0.2:48666: clientconnect
Streaming response from us-east-1-foobar-live-media.s3.amazonaws.com
10.0.0.2:48666: GET https://us-east-1-foobar-live-media.s3.amazonaws.com/mobile/v3/foo.zip?AWSAccessKeyId=FK4DQ9D3COS4SCMZK2WF&Expires=1553758482&x-amz-security-token=AgoJb3...
                << 200 OK (content missing)
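
As an aside, the Expires parameter suggests this is a pre-signed S3 URL that stops working after a deadline, so the file is best fetched promptly. Below is a minimal sketch for reading that deadline, assuming Expires is a Unix timestamp (as in S3's query-string authentication). The helper name presigned_expiry is made up for illustration, and the URL is cut off before the elided security token:

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit


def presigned_expiry(url: str) -> datetime:
    """Return the expiry time encoded in the URL's 'Expires' query parameter."""
    query = parse_qs(urlsplit(url).query)
    # Assumption: 'Expires' holds seconds since the Unix epoch
    return datetime.fromtimestamp(int(query["Expires"][0]), tz=timezone.utc)


# URL from the capture above, truncated before the security token
url = (
    "https://us-east-1-foobar-live-media.s3.amazonaws.com/mobile/v3/foo.zip"
    "?AWSAccessKeyId=FK4DQ9D3COS4SCMZK2WF&Expires=1553758482"
)
print(presigned_expiry(url).isoformat())  # 2019-03-28T07:34:42+00:00
```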

We can now download this file, foo.zip, ourselves. It appears to be a simple ZIP file, just over 100 MB in size, so we can extract the contents:

$ mkdir foo; cd foo
$ unzip ../foo.zip
Archive:  ../foo.zip
  inflating: 169573_part3.png
  inflating: 636460165358463957.png
  inflating: 636460698359428493.png
  ...

Examining the contents, the archive appears to contain mostly assets – it has two folders, css and images, plus a large number of loose images strewn about the root directory. There is no text in sight!

There is, however, a suspiciously large foo.db3 file in the root directory, about 25 MB in size, whose file type is not recognised:

$ file foo.db3
foo.db3: data

Opening the file in a hex editor reveals a random assortment of bytes, with no apparent order. More investigation is required.
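
The same observation can be made a little more rigorous. A plaintext SQLite 3 file always begins with the 16-byte string "SQLite format 3\0", and encrypted data tends to have close to 8 bits of entropy per byte. The following is a rough sketch of both checks; the function names are my own, and high entropy alone cannot distinguish encryption from compression:

```python
import math
from collections import Counter

# First 16 bytes of every plaintext SQLite 3 database file
SQLITE_MAGIC = b"SQLite format 3\x00"


def looks_like_plain_sqlite(header: bytes) -> bool:
    """True if the data starts with the standard SQLite 3 header string."""
    return header.startswith(SQLITE_MAGIC)


def shannon_entropy(data: bytes) -> float:
    """Bits of entropy per byte; close to 8.0 for encrypted or compressed data."""
    if not data:
        return 0.0
    total = len(data)
    return -sum(n / total * math.log2(n / total) for n in Counter(data).values())


def describe(path: str, sample_size: int = 1 << 20) -> str:
    """Summarise the first sample_size bytes of a file."""
    with open(path, "rb") as f:
        sample = f.read(sample_size)
    kind = "plain SQLite" if looks_like_plain_sqlite(sample) else "unknown"
    return f"{path}: {kind}, entropy {shannon_entropy(sample):.2f} bits/byte"
```

Calling describe("foo.db3") on a file like ours would be expected to report an unknown type with entropy near 8.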

Inspecting the application Java code

We can obtain a copy of the Android app APK file by copying it from /data/app/com.example.ereader/base.apk, or by obtaining it from a third-party APK website. The APK file is simply another ZIP file, so we can extract it too.

The compiled Java bytecode is packaged in a classes.dex file inside the APK. We can use dex2jar to convert this to a normal JAR file, then use a Java decompiler like JD-GUI to decompile it.

Looking at the decompiled source code, however, there does not seem to be anything interesting. There is a little application boilerplate and some networking-related code, but nothing that seems to help with making sense of this ‘db3’ file.

Inspecting the APK native libraries

Looking elsewhere in the APK, we see a number of bundled libraries. Inside an assemblies folder, we find some interesting files: FooBar.Common.dll, FooBar.Common.dll.config, Microsoft.CSharp.dll, Mono.Android.dll, Xamarin.Android.Support.v4.dll, etc.

What? DLL files on a Linux-based platform?!

These files indicate a Xamarin.Android (Mono for Android) application: the DLLs are .NET assemblies, and Xamarin is a framework designed for cross-platform development. One imagines that if one looked into the iOS application, there would be similar files.

Looking instead into the lib/armeabi-v7a folder, we see some more usual-looking files: a few libmono*.so libraries, libe_sqlite3.so and libsqlcipher.so.
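
Rather than browsing the extracted tree by hand, the same inventory can be produced straight from the APK. This is a small sketch using Python's zipfile module; the function name is hypothetical, and it simply groups .so and .dll entries by folder:

```python
import zipfile
from collections import defaultdict


def native_and_managed_libs(apk) -> dict:
    """Group .so and .dll entries in an APK by their containing folder.

    `apk` may be a filesystem path or a binary file-like object.
    """
    groups = defaultdict(list)
    with zipfile.ZipFile(apk) as archive:
        for name in archive.namelist():
            if name.endswith((".so", ".dll")):
                folder, _, filename = name.rpartition("/")
                groups[folder].append(filename)
    return {folder: sorted(files) for folder, files in groups.items()}
```

For an app like this one, native_and_managed_libs("base.apk") should return keys such as "assemblies" and "lib/armeabi-v7a".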

With a little Googling, we find that libsqlcipher.so is from SQLCipher, ‘an open source extension to SQLite that provides transparent 256-bit AES encryption of database files’. Aha! So could foo.db3 be an SQLite 3 database which has been encrypted?

Inspecting the symbols in libsqlcipher.so

The usual tools for listing the symbols in a .so file, nm and objdump, had a little trouble on this file, but we can use readelf to successfully inspect the contents:

$ arm-none-eabi-readelf -Ws libsqlcipher.so
Symbol table '.dynsym' contains 2563 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 00000000     0 FUNC    GLOBAL DEFAULT  UND __cxa_finalize@LIBC (2)
     2: 00000000     0 FUNC    GLOBAL DEFAULT  UND __cxa_atexit@LIBC (2)
     3: 0002d4f1  1916 FUNC    GLOBAL DEFAULT   12 sqlcipher_codec_pragma
     4: 0002dc6d    92 FUNC    GLOBAL DEFAULT   12 sqlite3_stricmp
     5: 0002dcc9    84 FUNC    GLOBAL DEFAULT   12 sqlite3_mprintf
     6: 0002dd85   132 FUNC    GLOBAL DEFAULT   12 sqlite3_free
     ...

If the Android application is using this library to access the encrypted database, then we should be able to attach a debugger and set breakpoints in these library functions to intercept the encryption parameters!

From the SQLite and SQLCipher API documentation, we identify a few functions that may be interesting: sqlite3_open, sqlite3_open_v2, sqlite3_exec, sqlite3_key and sqlite3_key_v2.
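
With over 2,500 symbols, it helps to filter the readelf output for just these names. The sketch below parses rows in the `readelf -Ws` layout shown above; the regular expression and function name are my own. It also strips bit 0 of each value, which on 32-bit ARM marks a Thumb function rather than forming part of the address (this is why every Value above is odd):

```python
import re

# One row of `readelf -Ws`: Num, Value, Size, Type, Bind, Vis, Ndx, Name
ROW = re.compile(
    r"^\s*\d+:\s+([0-9a-fA-F]+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)\s+(\S+)"
)


def find_functions(readelf_output: str, wanted: set) -> dict:
    """Map each wanted, defined FUNC symbol to (address, is_thumb)."""
    found = {}
    for line in readelf_output.splitlines():
        match = ROW.match(line)
        if not match:
            continue  # header lines, or rows without a name
        value, _size, sym_type, _bind, _vis, ndx, name = match.groups()
        name = name.split("@")[0]  # drop version suffixes such as @LIBC
        if sym_type == "FUNC" and ndx != "UND" and name in wanted:
            raw = int(value, 16)
            # Bit 0 set means Thumb code; the real address has it cleared
            found[name] = (raw & ~1, bool(raw & 1))
    return found
```

Feeding it the full readelf output together with the five function names above gives the library-relative offsets of each candidate breakpoint.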

Debugging with GDB

We can now prepare to attach a debugger to the Android application. The Android NDK comes with prebuilt GDB server binaries for Android, so we can push these over to the phone, start the ebook reader app, and attach the debugger:

$ adb push /opt/android-ndk/prebuilt/android-arm/gdbserver/gdbserver /data/local/tmp
/opt/android-ndk/prebuilt/android-arm/gdbserver/gdbserver: 1 file pushed. 12.9 MB/s (596448 bytes in 0.044s)
$ adb shell
jasmine_sprout:/ $ su
jasmine_sprout:/ # ps -A | grep ereader
u0_a92       14020   702 1545124 280320 SyS_epoll_wait f2ebe8cc S com.example.ereader
jasmine_sprout:/ # /data/local/tmp/gdbserver --attach localhost:1234 14020
Attached; pid = 14020
Listening on port 1234
$ adb forward tcp:1234 tcp:1234
$ aarch64-linux-gnu-gdb
GNU gdb (GDB) 8.2.1
Copyright (C) 2018 Free Software Foundation, Inc.
...
(gdb) target remote :1234
Remote debugging using :1234
Reading /system/bin/app_process32 from remote target...
warning: File transfers from remote targets can be slow. Use "set sysroot" to access files locally instead.
Reading /system/bin/app_process32 from remote target...
Reading symbols from target:/system/bin/app_process32...Reading symbols from .gnu_debugdata for target:/system/bin/app_process32...(no debugging symbols found)...done.
(no debugging symbols found)...done.
Reading /system/bin/linker from remote target...
Reading /system/lib/libandroid_runtime.so from remote target...
Reading /system/lib/libbase.so from remote target...
...
Reading /system/bin/linker from remote target...
0xf2ebe8cc in __epoll_pwait () from target:/system/lib/libc.so
(gdb)

Now we set some breakpoints for the interesting functions we identified earlier:

(gdb) b sqlite3_open
Breakpoint 1 at 0xce700df0 (2 locations)
(gdb) b sqlite3_open_v2
Breakpoint 2 at 0xce701674 (2 locations)
(gdb) b sqlite3_exec
Breakpoint 3 at 0xce6f58fc (2 locations)
(gdb) b sqlite3_key
Breakpoint 4 at 0xce6f3996
(gdb) b sqlite3_key_v2
Breakpoint 5 at 0xce6f39ac
(gdb)

Finally, ‘Mono support libraries use a couple of signals internally that confuse gdb’, so we need to instruct GDB to ignore them:

(gdb) handle SIGXCPU SIG33 SIG35 SIG36 SIG37 SIG38 SIGPWR nostop noprint
Signal        Stop      Print   Pass to program Description
SIGXCPU       No        No      Yes             CPU time limit exceeded
SIGPWR        No        No      Yes             Power fail/restart
SIG33         No        No      Yes             Real-time event 33
SIG35         No        No      Yes             Real-time event 35
SIG36         No        No      Yes             Real-time event 36
SIG37         No        No      Yes             Real-time event 37
SIG38         No        No      Yes             Real-time event 38
(gdb)

Now we can resume execution, tap into the ebook, and wait for our breakpoints to be triggered:

(gdb) cont
Continuing.
[New Thread 15281.15528]

Thread 1 "example.ereader" hit Breakpoint 2, 0xce701674 in sqlite3_open_v2 ()
   from target:/data/app/com.example.ereader/lib/arm/libsqlcipher.so
(gdb)

Bingo, we've hit sqlite3_open_v2. From the SQLite documentation, the signature of this function is:

int sqlite3_open_v2(
  const char *filename,   /* Database filename (UTF-8) */
  sqlite3 **ppDb,         /* OUT: SQLite db handle */
  int flags,              /* Flags */
  const char *zVfs        /* Name of VFS module to use */
);

Being unfamiliar with ARM, at this point I initially went on a wild goose chase poking around the stack looking for the arguments to this function (as one would on x86). The 32-bit ARM calling convention, however, passes the first four arguments in registers r0 to r3:

(gdb) info reg
r0             0xd23f9bd8          3527384024
r1             0xffccdd70          4291616112
r2             0x1                 1
r3             0x0                 0
r4             0xd239d7e0          3527006176
r5             0xd23f9a78          3527383672
r6             0xd23f4698          3527362200
r7             0x0                 0
r8             0xec7bd990          3967539600
r9             0xf16cc000          4050436096
r10            0x1                 1
r11            0xd23f9bd8          3527384024
r12            0xce701675          3463452277
sp             0xffccdc50          0xffccdc50
lr             0xec657920          -328894176
pc             0xce701674          0xce701674 <sqlite3_open_v2>
cpsr           0x600a0030          1611268144
fpscr          0x20000013          536870931
(gdb)

Comparing with the function signature, it looks like r0 to r3 could plausibly be a match to what we expect: two pointers (filename = 0xd23f9bd8, ppDb = 0xffccdd70), a bitmask flag (flags = 0x1), and a null pointer (zVfs = NULL).
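
To make the mapping explicit, here is a short sketch that pairs r0 to r3 with the parameter names and decodes the flags bitmask. The flag values are a subset of the SQLITE_OPEN_* constants from sqlite3.h; the helper names are hypothetical:

```python
# A subset of the SQLITE_OPEN_* flag values defined in sqlite3.h
SQLITE_OPEN_FLAGS = {
    0x00000001: "SQLITE_OPEN_READONLY",
    0x00000002: "SQLITE_OPEN_READWRITE",
    0x00000004: "SQLITE_OPEN_CREATE",
    0x00000040: "SQLITE_OPEN_URI",
    0x00000080: "SQLITE_OPEN_MEMORY",
    0x00008000: "SQLITE_OPEN_NOMUTEX",
    0x00010000: "SQLITE_OPEN_FULLMUTEX",
    0x00020000: "SQLITE_OPEN_SHAREDCACHE",
    0x00040000: "SQLITE_OPEN_PRIVATECACHE",
}


def decode_open_flags(flags: int) -> list:
    """Return the names of the flag bits set, plus any leftover bits in hex."""
    names = [name for bit, name in SQLITE_OPEN_FLAGS.items() if flags & bit]
    unknown = flags & ~sum(SQLITE_OPEN_FLAGS)
    if unknown:
        names.append(hex(unknown))
    return names


def map_arguments(registers: dict, params: list) -> dict:
    """Pair up to four integer or pointer parameters with r0-r3."""
    if len(params) > 4:
        raise ValueError("only the first four arguments are passed in registers")
    return {param: registers[f"r{i}"] for i, param in enumerate(params)}


# Register values from the `info reg` output above
regs = {"r0": 0xD23F9BD8, "r1": 0xFFCCDD70, "r2": 0x1, "r3": 0x0}
args = map_arguments(regs, ["filename", "ppDb", "flags", "zVfs"])
print(decode_open_flags(args["flags"]))  # ['SQLITE_OPEN_READONLY']
```

So the app opens the database read-only, which fits an ebook that is downloaded once and never modified.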

Examining the data at 0xd23f9bd8:

(gdb) x/s 0xd23f9bd8
0xd23f9bd8:     "/data/user/0/com.example.ereader/files/sources/foo/foo.db3"
(gdb)

Yes, we are indeed in the right place! Now let's see if we can find something related to encryption:

(gdb) cont
Continuing.
[New Thread 15281.15607]
[New Thread 15281.15608]
[New Thread 15281.15609]
[New Thread 15281.15610]
[New Thread 15281.15611]
[New Thread 15281.15612]
[New Thread 15281.15613]
[New Thread 15281.15614]
[New Thread 15281.15615]
[New Thread 15281.15616]
[New Thread 15281.15617]

Thread 1 "example.ereader" hit Breakpoint 5, 0xce6f39ac in sqlite3_key_v2 ()
   from target:/data/app/com.example.ereader/lib/arm/libsqlcipher.so
(gdb) info reg
r0             0xceac9188          3467415944
r1             0x0                 0
r2             0xcd5af608          3445290504
r3             0x41                65
r4             0x41                65
r5             0xcd5af608          3445290504
r6             0xceac9188          3467415944
r7             0xffccd8e8          4291614952
r8             0xcd283058          3441963096
r9             0x1f                31
r10            0xcfb26a48          3484576328
r11            0x25                37
r12            0xce7eeb18          3464424216
sp             0xffccd8d8          0xffccd8d8
lr             0xce72d987          -831334009
pc             0xce6f39ac          0xce6f39ac <sqlite3_key_v2+12>
cpsr           0x200d0030          537722928
fpscr          0x80000013          -2147483629
(gdb)

Now we've hit sqlite3_key_v2, a function responsible for telling SQLCipher the encryption/decryption key for the database! From the documentation, the relevant function signature is:

int sqlite3_key_v2(
  sqlite3 *db,                   /* Database to be rekeyed */
  const char *zDbName,           /* Name of the database */
  const void *pKey, int nKey     /* The key */
);

That means the third argument, stored in r2, should be a pointer to the key, and the fourth, in r3, its length (0x41, or 65 bytes):

(gdb) x/s 0xcd5af608
0xcd5af608:     "correct_horse_battery_staple_blah_other_random_words_here_foo_bar"
(gdb)

Well, that's certainly one way of constructing an encryption key. (This is a placeholder example, but the real key was of this format.)
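
Strictly speaking, this string is a passphrase rather than the AES key itself. By default, SQLCipher derives the actual key from it with PBKDF2, using a salt stored in the first 16 bytes of the database file. The sketch below assumes SQLCipher 3 defaults (PBKDF2-HMAC-SHA1, 64,000 iterations, 32-byte key); I have not confirmed which SQLCipher version the app ships, so treat the parameters as an assumption. The function names are my own:

```python
import hashlib


def read_salt(path: str) -> bytes:
    """Read the 16-byte salt from the start of a SQLCipher database file."""
    with open(path, "rb") as f:
        return f.read(16)


def derive_sqlcipher3_key(passphrase: bytes, salt: bytes,
                          iterations: int = 64000) -> bytes:
    """Derive a 256-bit key, assuming SQLCipher 3 default KDF settings."""
    if len(salt) != 16:
        raise ValueError("expected a 16-byte salt")
    return hashlib.pbkdf2_hmac("sha1", passphrase, salt, iterations, dklen=32)


# Placeholder passphrase from the GDB session above
passphrase = b"correct_horse_battery_staple_blah_other_random_words_here_foo_bar"
print(len(passphrase))  # 65, matching the 0x41 seen in r3
```

In practice there is no need to do this by hand, since the sqlcipher tool performs the derivation itself when given the passphrase.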

Putting it all together

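Having identified the encryption key, we should be able to access the foo.db3 SQLite database using the SQLCipher command line tools. After supplying the key, PRAGMA cipher_migrate upgrades a database created by an older SQLCipher version to the format the tool expects; a result of 0 indicates success: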

$ sqlcipher foo.db3
SQLCipher version 3.26.0 2018-12-01 12:34:55
sqlite> .dbinfo
unable to read database header
sqlite> PRAGMA key='correct_horse_battery_staple_blah_other_random_words_here_foo_bar';
sqlite> PRAGMA cipher_migrate;
0
sqlite> .dbinfo
database page size:  4096
write format:        1
read format:         1
reserved bytes:      80
file change counter: 1098255264
database page count: 5853
freelist page count: 0
schema cookie:       112
schema format:       4
default cache size:  0
autovacuum top root: 0
incremental vacuum:  0
text encoding:       1 (utf8)
user version:        0
application id:      0
software version:    3026000
number of tables:    103
number of indexes:   10
number of triggers:  0
number of views:     0
schema size:         30350
data version         6
sqlite>

All that remains now is to decrypt the data to a plain SQLite database:

sqlite> ATTACH DATABASE 'plaintext.db' AS plaintext KEY '';
sqlite> SELECT sqlcipher_export('plaintext');
sqlite>
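
As a quick sanity check, the exported file should now open with any ordinary SQLite library. Here is a minimal sketch using Python's built-in sqlite3 module to list each table with its row count; the function name is hypothetical:

```python
import sqlite3
from contextlib import closing


def table_row_counts(path: str) -> dict:
    """Return {table name: row count} for a plaintext SQLite database."""
    with closing(sqlite3.connect(path)) as conn:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        counts = {}
        for name in names:
            # Identifiers cannot be bound as parameters, so quote them manually
            quoted = '"{}"'.format(name.replace('"', '""'))
            counts[name] = conn.execute(
                f"SELECT COUNT(*) FROM {quoted}"
            ).fetchone()[0]
        return counts
```

Running table_row_counts("plaintext.db") should list the same 103 tables that .dbinfo reported. Note that sqlite3.connect creates an empty file if the path does not exist, so an empty result may just mean a mistyped filename.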

Et voilà!