Investigating a proprietary Android 2FA system

This article concerns an Android app used as part of a proprietary two-factor authentication (2FA) system. Investigation of the app and 2FA protocol reveals some interesting design decisions.

Overview

The 2FA system is similar to well-known offerings such as Duo Security and Okta Verify. When a user initiates a request (e.g. a login request), a push notification is delivered to the 2FA device. The user can then approve or reject the request, and the outcome is transmitted directly to the server. The 2FA system is therefore an interactive online protocol requiring internet connectivity to function, rather than an offline protocol like TOTP or HOTP.

The developer of the 2FA system advertises that the system is superior to other 2FA protocols, due to the use of cryptographic features – described as a ‘signature’ – that enable the server to determine not only that the user has approved a request, but also to verify that the user has approved the specific details of that particular request. As we will see, there are some interesting hidden details behind this description.

QR code message signing

Like many similar 2FA apps, the app is initialised by the user scanning a QR code on their mobile device. However, unlike standard TOTP or HOTP implementations, this app's registration QR codes do not directly contain the secrets required. An example of the content of the QR code might be:

{
	"signature": "...",
	"tag": "https://example.com/2fa/defaultConfig?username=alice",
	"command": {
		"opcode": 184,
		"data": "..."
	}
}

Decompiling the Android app using jadx, we identify the following method which appears to act on this QR code data:

public boolean x0(boolean z2, boolean z3) {
	// ...
	Intent S = A1.S(App.z1, App.K == null ? null : new String(App.K));
	// ...
	int intExtra = S.getIntExtra("opcode", -1);
	String stringExtra = S.getStringExtra("data");
	App.n = S.getStringExtra("tag");
	// ...
	String str2 = "&username=";
	if (App.n.lastIndexOf("&username=") < 0) {
		str2 = "?username=";
		if (App.n.lastIndexOf("?username=") < 0) {
			// ...
		}
	}
	// ...
}

It appears, then, that the QR code data is parsed in the A1.S method, which takes 2 String arguments. As has been previously described (1, 2), we can use Smali patching to log these arguments and the call stack:

.method public S(Ljava/lang/String;Ljava/lang/String;)Landroid/content/Intent;
	.locals 21
	
	# These lines inserted to log the arguments:
	const-string v0, "FOOBAR: A1.S"
	move-object/from16 v1, p1
	invoke-static {v0, v1}, Landroid/util/Log;->v(Ljava/lang/String;Ljava/lang/String;)I
	move-object/from16 v1, p2
	invoke-static {v0, v1}, Landroid/util/Log;->v(Ljava/lang/String;Ljava/lang/String;)I
	invoke-static {}, Ljava/lang/Thread;->dumpStack()V
	
	# ...

This reveals that the arguments to A1.S are indeed the QR code data and an out-of-band ‘activation code’ provided to the user by SMS. Investigating the A1.S method decompilation, we find:

public Intent S(String str, String str2) {
	// ...
	JSONObject jSONObject3 = (JSONObject) new JSONTokener(str).nextValue();
	String string = jSONObject3.getString("signature");
	String string2 = jSONObject3.getString("tag");
	String string3 = jSONObject3.getString("command");
	JSONObject jSONObject4 = jSONObject3.getJSONObject("command");
	int i3 = jSONObject4.getInt("opcode");
	String string4 = jSONObject4.getString("data");
	if (!foobar.f.E(u("MASTER_SIG_KEY").e(this.f1987e, this.i), j(i3, string4), string)) {
		// ...
		return null;
	}
	// More interesting things ...
}

The main body of A1.S, then, is gated behind a call to foobar.f.E, which must succeed. Following the code further, u("MASTER_SIG_KEY").e(...) returns hex-encoded key data from a configuration file, and j(i3, string4) combines opcode and data into a JSON string in canonical form, which are then passed with the signature to the foobar.f.E method.

The foobar.f class contains a large number of static methods which appear to relate to cryptographic functions. The E method in particular reads:

public static boolean E(byte[] bArr, String str, String str2) {
	// ...
	PublicKey a2 = new foobar.r.a().a(bArr);
	Signature signature = Signature.getInstance("SHA256withRSA");
	signature.initVerify(a2);
	signature.update(y(str));
	return signature.verify(w3.k(str2.getBytes()));
}

Based on the magic string "SHA256withRSA", we can surmise that this code is verifying a signature using an RSA public key. A quick Google search also indicates that this signature scheme uses PKCS#1 v1.5 padding. We can re-implement this in Python and verify that the signature in the QR code data is valid:

from cryptography.hazmat.primitives.asymmetric.padding import PKCS1v15
from cryptography.hazmat.primitives.hashes import SHA256
from cryptography.hazmat.primitives.serialization import load_der_public_key

import json

MASTER_SIG_KEY = '...'

qr_json = json.loads('...')
qr_signature = qr_json['signature']
qr_opcode = qr_json['command']['opcode']
qr_data_str = qr_json['command']['data']

# Verify the signature
master_sig_key = load_der_public_key(bytes.fromhex(MASTER_SIG_KEY))
c_canonical = '{"opcode":' + str(qr_opcode) + ',"data":' + json.dumps(qr_data_str) + '}'  # Canonicalise the JSON
master_sig_key.verify(bytes.fromhex(qr_signature), c_canonical.encode('utf-8'), PKCS1v15(), SHA256())

QR code encrypted payload

Continuing with the decompilation of A1.S, we have:

public Intent S(String str, String str2) {
	// ...
	if (i3 == 184) {
		f.d l = l(string4);
		byte[] u = ((j0) l.v(0)).u();
		byte[] u2 = ((j0) l.v(1)).u();
		byte[] u3 = ((j0) l.v(2)).u();
		byte[] u4 = ((j0) l.v(3)).u();
		// ...
		byte[] n = foobar.f.n(u, str2.getBytes());
		byte[] h3 = foobar.f.h(u2, n, u);
		// ...
		Intent intent3 = new Intent();
		intent3.putExtra("opcode", i3);
		intent3.putExtra("data", foobar.e.b(h3));
		intent3.putExtra("tag", string2);
		return intent3;
	}
	// ...
}

The method l, to which the data value is passed, calls f.q4. The decompilation of this class contains a number of useful strings, such as "corrupted stream - invalid high tag number found". This is very helpful, because Googling that string identifies this class as an obfuscated version of ASN1InputStream from the Bouncy Castle API.

Using a tool such as Lapo Luchini's ASN.1 decoder, we can inspect the contents of this ASN.1 data from the QR code, which shows it to be a Sequence of four OctetStrings and a timestamp. These OctetStrings are presumably what is being read by the calls to ((j0) l.v(0)).u() and so on (close inspection of the decompilation of these methods would also enable this conclusion).
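For readers who prefer to inspect such a structure programmatically, the following is a minimal sketch of a DER tag-length-value walker using only the Python standard library. The function names and the sample bytes are made up for illustration (the sample is not taken from a real QR code), and only single-byte tags with definite lengths are handled.

```python
def read_tlv(buf: bytes, pos: int = 0):
    """Read one DER tag-length-value triple starting at pos.

    Returns (tag, value_bytes, next_pos). Handles single-byte tags and
    definite lengths only, which is enough for simple DER structures.
    """
    tag = buf[pos]
    length = buf[pos + 1]
    pos += 2
    if length & 0x80:  # long form: low 7 bits give the number of length bytes
        n = length & 0x7F
        length = int.from_bytes(buf[pos:pos + n], 'big')
        pos += n
    return tag, buf[pos:pos + length], pos + length


def sequence_items(der: bytes):
    """Return the (tag, value) pairs directly inside a top-level Sequence."""
    tag, body, _ = read_tlv(der)
    if tag != 0x30:  # 0x30 is the tag for a constructed Sequence
        raise ValueError('not a Sequence')
    items, pos = [], 0
    while pos < len(body):
        tag, value, pos = read_tlv(body, pos)
        items.append((tag, value))
    return items


# Made-up example: a Sequence holding two OctetStrings (tag 0x04)
sample = bytes.fromhex('3008' '0402aabb' '0402ccdd')
for tag, value in sequence_items(sample):
    print(hex(tag), value.hex())
```

Applied to the hex-decoded data value from the QR code, a walker like this should list the four OctetString entries followed by the timestamp.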

The first OctetString, then, is passed to foobar.f.n together with the ‘activation code’. foobar.f.n calls f.j2, a class which also contains a number of interesting constants – this time, integer constants:

public class j2 extends g2 {
	static final int[] f5395d = {1116352408, 1899447441, -1245643825, -373957723, /* ... */};
	// ...
}
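Constants like these can often be traced to an algorithm by recomputing a candidate definition. As one such check, the sketch below computes the first 32 fractional bits of the cube roots of the first four primes, which is how FIPS 180-4 defines the SHA-256 round constants, and reinterprets them as signed Java ints for comparison with the array above. The helper names are hypothetical.

```python
def icbrt(n: int) -> int:
    """Exact integer cube root (rounded down), found by binary search."""
    lo, hi = 0, 1 << (n.bit_length() // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


def cube_root_fraction_bits(prime: int) -> int:
    """First 32 bits of the fractional part of the cube root of prime."""
    # cbrt(prime * 2**96) == cbrt(prime) * 2**32, so the low 32 bits of the
    # integer cube root are exactly the leading fractional bits.
    return icbrt(prime << 96) & 0xFFFFFFFF


def to_java_int(u32: int) -> int:
    """Reinterpret an unsigned 32-bit value as a signed Java int."""
    return u32 - (1 << 32) if u32 & 0x80000000 else u32


constants = [to_java_int(cube_root_fraction_bits(p)) for p in (2, 3, 5, 7)]
print(constants)
```

The four values printed agree with the first four entries of f5395d, negative entries included, since Java has no unsigned 32-bit integer type.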

A Google search shows that these constants are associated with the SHA-256 hash function, so we presume that foobar.f.n hashes the given arguments using SHA-256. A call is then made to foobar.f.h:

public static byte[] h(byte[] bArr, byte[] bArr2, byte[] bArr3) {
	int i = 0;
	r2 r2Var = new r2(new l2());
	c3 c3Var = new c3(bArr2, 0, bArr2.length);
	r2Var.j(false, new d3(c3Var, bArr3));
	byte[] bArr4 = new byte[bArr.length];
	int g = r2Var.g();
	while (true) {
		int i2 = i + g;
		if (i2 > bArr.length) {
			return bArr4;
		}
		r2Var.i(bArr, i, bArr4, i);
		i = i2;
	}
}

The l2 class again contains a range of magic numbers and constant strings, including the helpfully specific "AES engine not initialised", which identifies it as AESEngine from Bouncy Castle. With some investigation, we can also deobfuscate the other classes referred to; cleaned up, the method looks something like:

public static byte[] h(byte[] ciphertext, byte[] key, byte[] iv) {
	CBCBlockCipher cipher = new CBCBlockCipher(new AESEngine());
	KeyParameter keyParam = new KeyParameter(key, 0, key.length);
	cipher.init(false /* decrypting */, new ParametersWithIV(keyParam, iv));
	byte[] plaintext = new byte[ciphertext.length];
	for (int offset = 0; offset < ciphertext.length; offset += cipher.getBlockSize()) {
		cipher.processBlock(ciphertext, offset, plaintext, offset);
	}
	return plaintext;
}

Putting this all together, the final payload is encrypted using AES in CBC mode. The ciphertext is the second OctetString in the ASN.1 data. The first OctetString gives the IV, and is SHA-256 hashed with the ‘activation code’ to give the encryption key. We can similarly implement this in Python:

from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
from cryptography.hazmat.primitives.hashes import Hash, SHA256

from pyasn1.codec.der.decoder import decode as der_decode

# Unpack the ASN.1 payload
qr_asn1 = der_decode(bytes.fromhex(qr_data_str))[0]
iv = bytes(qr_asn1[0])
ciphertext = bytes(qr_asn1[1])

# Decrypt the payload
activation_code = '12345678'
# Key = SHA-256(IV || activation code), matching the argument order of foobar.f.n
hashobj = Hash(SHA256())
hashobj.update(iv)
hashobj.update(activation_code.encode('utf-8'))
key = hashobj.finalize()

decryptor = Cipher(algorithms.AES(key), modes.CBC(iv)).decryptor()
final_payload = decryptor.update(ciphertext) + decryptor.finalize()

Following the code further reveals that the decrypted final payload is transmitted back to the server, which responds with further information used in the key exchange described in the next section. In this way, the server can ensure that the user operating the 2FA app also has access to the ‘activation code’. Data from the third and fourth OctetStrings is similarly processed through various combinations of SHA-256 and AES to yield other secondary encryption keys, which are saved for later use.

At this point, some basic device information (e.g. model, version, device identifiers) is also transmitted to the server. Although all client-server communications are secured by TLS, additional protection is applied here – a random AES key is generated, and the device data is encrypted in CBC mode with PKCS#7 padding. The random AES key is then encrypted with the server's RSA public key obtained from a configuration file, using RSA/ECB/PKCS1Padding.¹ This provides additional protection against man-in-the-middle attacks, but curiously is not used anywhere else in the protocol, which relies on symmetric encryption only.

Key exchange

Once the app has transmitted the decrypted payload back to the server, thereby demonstrating possession of the ‘activation code’, the server responds with hex-encoded data labelled keys. The data begins with the byte 0x30, which is the ASN.1 tag for Sequence, so it appears this is more ASN.1 data. Specifically, it is composed of a PrintableString (later used as a ‘key ID’), a 32-byte OctetString, and an 8-bit BitString (used as a flags value).

Examining the decompiled app, the OctetString is eventually passed to a function foobar.f.j, to which is also passed one of the secondary encryption keys obtained in the previous stage:

public static byte[] j(byte[] bArr, byte[] bArr2) {
	l2 l2Var = new l2();
	l2Var.j(false, new c3(bArr2));
	int g = l2Var.g();
	int length = bArr.length;
	byte[] bArr3 = new byte[length];
	for (int i = 0; i < length; i += g) {
		l2Var.i(bArr, i, bArr3, i);
	}
	return bArr3;
}

Deobfuscated, this reads:

public static byte[] j(byte[] ciphertext, byte[] key) {
	AESEngine engine = new AESEngine();
	engine.init(false /* decrypting */, new KeyParameter(key));
	byte[] plaintext = new byte[ciphertext.length];
	for (int offset = 0; offset < ciphertext.length; offset += engine.getBlockSize()) {
		engine.processBlock(ciphertext, offset, plaintext, offset);
	}
	return plaintext;
}

This is similar to the foobar.f.h method from earlier, but rather than using AES in CBC mode, it appears to directly apply AES block-by-block, i.e. in ECB mode! This is clearly not recommended practice; however, it might not be of practical significance if the secondary key is only ever used once, on one block only.

The decrypted result (call this the ‘device key’) is then passed to some more cryptographic operations:

public foobar.b t(/* ... */) {
	// ...
	byte[] bArr32 = new byte[8];
	String s32 = foobar.f.s(deviceKey, bArr32);
	bArr32[7] = 1;
	String s42 = foobar.f.s(deviceKey, bArr32);
	bArr32[7] = 2;
	String substring32 = (s32 + s42).substring(0, 8);
	edit.putString("counter", y(this.i, foobar.e.a(bArr32)));
	// ...
}

The value of substring32 is then transmitted to the server as a confirmation code, which completes the registration process. The foobar.f.s method reads:

public static String s(byte[] bArr, byte[] bArr2) {
	c3 c3Var = new c3(bArr);
	l2 l2Var = new l2();
	l2Var.j(true, c3Var);
	int g = l2Var.g();
	byte[] bArr3 = new byte[g];
	System.arraycopy(bArr2, 0, bArr3, 0, Math.min(g, bArr2.length));
	byte[] bArr4 = new byte[l2Var.g()];
	l2Var.i(bArr3, 0, bArr4, 0);
	char[] cArr = new char[8];
	for (int i2 = 0; i2 < 8; i2++) {
		cArr[i2] = (char) (((bArr4[i2] & 255) % 10) + '0');
	}
	return new String(cArr);
}

As we have previously seen, l2 is AESEngine from Bouncy Castle. This code, then, takes bArr2, truncates or pads it to the size of an AES block, performs AES encryption using bArr as the key, then generates an 8-digit numeric code using the first 8 bytes of the ciphertext, via a modulo operation. Interestingly, since 256 is not evenly divisible by 10, the digits of the code will be slightly biased towards the lower digits, though this is of little practical significance.
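The size of this bias is easy to quantify. The sketch below reproduces only the byte-to-digit mapping from foobar.f.s (the AES step is omitted) and counts how many of the 256 possible byte values land on each digit; the function name is hypothetical.

```python
from collections import Counter


def digits_from_block(block: bytes) -> str:
    """Map the first 8 bytes of a cipher block to decimal digits, as f.s does."""
    return ''.join(str(b % 10) for b in block[:8])


# Count how many of the 256 possible byte values map to each digit
bias = Counter(b % 10 for b in range(256))
for digit in range(10):
    print(digit, bias[digit], f'{bias[digit] / 256:.4f}')
```

Digits 0 to 5 are each produced by 26 byte values and digits 6 to 9 by 25, so the lower digits appear with probability 26/256 (about 10.2%) rather than exactly 10%.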

Returning to the t method, then, an internal 8-byte counter is initialised with all zeros, and passed to foobar.f.s with the device key. The counter is then incremented to 1, and again passed to that method. The resulting codes are combined, and truncated to 8 digits (interestingly, the second call to foobar.f.s is therefore unnecessary, as all required digits will be obtained from the first call). The counter is then incremented to 2. We will again meet the counter, and the foobar.f.s method, later.

Following this, the decompiled code proceeds to generate a number of random byte arrays and pass the device key into more cryptographic functions. Data is then written to shared preferences under such labels as ‘encrypted key’, ‘salt’ and ‘IV’. These values are doubly encrypted with an AES key from the Android hardware-backed keystore. A substantial amount of effort appears to be spent protecting the device key on disk; however, this is moot because we have already obtained the device key at this point, so we need not consider these measures further.

Intercepting request data

Recall from earlier that requests are actioned in this 2FA app through sending a notification to the app. In this case, messages are delivered through Firebase Cloud Messaging (FCM, previously Google Cloud Messaging). The FCM documentation tells us ‘To receive messages, use a service that extends FirebaseMessagingService’. We identify such a class, and can Smali patch the onMessageReceived method to log the notifications received:

.class public Lcom/example/foobar/FirebaseMessagingServiceImpl;
.super Lcom/google/firebase/messaging/FirebaseMessagingService;

# ...

.method public onMessageReceived(Lcom/google/firebase/messaging/RemoteMessage;)V
	.locals 2
	
	# Log the message
	const-string v0, "FOOBAR: FirebaseMessagingServiceImpl.onMessageReceived"
	invoke-virtual {p1}, Lcom/google/firebase/messaging/RemoteMessage;->toString()Ljava/lang/String;
	move-result-object v1
	invoke-static {v0, v1}, Landroid/util/Log;->v(Ljava/lang/String;Ljava/lang/String;)I
	invoke-virtual {p1}, Lcom/google/firebase/messaging/RemoteMessage;->getData()Ljava/util/Map;
	move-result-object v1
	invoke-interface {v1}, Ljava/util/Map;->toString()Ljava/lang/String;
	move-result-object v1
	invoke-static {v0, v1}, Landroid/util/Log;->v(Ljava/lang/String;Ljava/lang/String;)I
	
	# ...

We identify that the notification message is JSON data in a similar format to the registration QR code, similarly signed using RSA. Investigating the decompiled app code, and cross-referencing magic numbers and constant strings with Bouncy Castle, we can determine that the notification message is encrypted using one of the secondary keys previously obtained. This key is different from the AES key used earlier in ECB mode, so that key is not reused here, and the cipher is run in the more appropriate CBC mode. However, the decryption is initialised with a hardcoded all-zero IV! Again, this is not recommended practice, and is more problematic than the earlier use of ECB mode, because this secondary key is reused whenever a request is made – although in this case, the information leaked from IV reuse may not be of great interest.
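To illustrate what a fixed IV leaks, here is a minimal standard-library sketch. A toy Feistel permutation built from SHA-256 stands in for AES purely so the example runs without third-party packages; it is not secure and is not what the app uses. The messages, key and function names are likewise invented for illustration. The CBC chaining itself is the standard construction.

```python
import hashlib
import os

BLOCK = 16


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    """Toy 4-round Feistel permutation on 16-byte blocks. NOT AES, NOT secure."""
    left, right = block[:8], block[8:]
    for rnd in range(4):
        f = hashlib.sha256(key + bytes([rnd]) + right).digest()[:8]
        left, right = right, bytes(a ^ b for a, b in zip(left, f))
    return left + right


def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    """CBC mode over the toy cipher; plaintext must be a multiple of 16 bytes."""
    out, prev = b'', iv
    for i in range(0, len(plaintext), BLOCK):
        # XOR each plaintext block with the previous ciphertext block (or the IV)
        block = bytes(a ^ b for a, b in zip(plaintext[i:i + BLOCK], prev))
        prev = toy_encrypt_block(key, block)
        out += prev
    return out


key = os.urandom(16)
zero_iv = bytes(BLOCK)  # hardcoded all-zero IV, as observed in the app

# Two made-up requests that share their first 16 bytes
msg_a = b'<html><body>Pay ' + b'Alice 10 dollars'
msg_b = b'<html><body>Pay ' + b'Bob 9999 dollars'

ct_a = cbc_encrypt(key, zero_iv, msg_a)
ct_b = cbc_encrypt(key, zero_iv, msg_b)
print('first blocks equal: ', ct_a[:BLOCK] == ct_b[:BLOCK])
print('second blocks equal:', ct_a[BLOCK:] == ct_b[BLOCK:])
```

With the fixed IV, the first ciphertext blocks match while the second blocks differ, so an observer can tell that two requests share a common prefix and where they first diverge. Since every request here is presumably an HTML page with similar boilerplate at the start, that is likely all an observer would learn.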

The decrypted payload contains an HTML page describing the requested action, which is presented to the user to approve or reject.

Approving requests

Following the decompiled code, we observe that when the user approves or rejects a request on the 2FA app, a SHA-256 hash is computed of the internal 8-byte counter, the HTML describing the requested action, and a string indicating the selected outcome (approved or rejected). The SHA-256 hash is then passed to the same foobar.f.s method from earlier, to generate an 8-digit authorisation code. The authorisation code is then transmitted to the server. The internal counter is advanced using the following method:

public static void z(byte[] bArr) {
	for (int length = bArr.length - 1; length >= 0; length--) {
		if (bArr[length] != 9) {
			bArr[length] = (byte) (bArr[length] + 1);
			return;
		}
		bArr[length] = 0;
	}
}

Interestingly, this code demonstrates that the internal counter is stored as binary-coded decimal, i.e. each byte only takes values 0 through 9. The purpose of doing so, rather than simply storing the entire counter as an integer, is unclear.
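For comparison with the Python reimplementations earlier, the same counter logic can be sketched as follows. The function names are hypothetical; the behaviour mirrors the z method above, with a helper added to read the digit bytes back as an ordinary integer.

```python
def advance_counter(counter: bytearray) -> None:
    """Increment a one-decimal-digit-per-byte counter in place (mirrors z)."""
    for pos in range(len(counter) - 1, -1, -1):
        if counter[pos] != 9:
            counter[pos] += 1
            return
        counter[pos] = 0  # digit was 9: reset it and carry into the next digit


def counter_value(counter: bytes) -> int:
    """Interpret the digit bytes as an ordinary integer."""
    return int(''.join(str(d) for d in counter))


counter = bytearray(8)
counter[7] = 9
advance_counter(counter)
print(list(counter), counter_value(counter))
```

Starting from a final digit of 9, the increment carries into the next byte, giving a counter value of 10. A counter consisting entirely of 9s wraps around to all zeros.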

This system is similar to HOTP in that the 2FA app (and, presumably, the server) maintains an internal counter, and the protocol is dependent on the value of the counter. Presumably, this is intended to enable the server to detect and block replay attacks. The process also demonstrates the supposed advantage of the proprietary 2FA system compared with TOTP/HOTP, in that the authorisation code not only demonstrates knowledge of the secret device key, but is also associated with the specific request data and the outcome of the request – so a man-in-the-middle cannot intercept the authorisation code and then change the outcome from rejection to approval.

As before, the AES encryption in this step uses ECB mode – but because the plaintext is the result of a cryptographic hash operation which can be regarded as essentially random, and because the output is further mangled into an 8-digit code, this may not be of practical significance. What is more interesting is that AES, a symmetric scheme, is used at all. This protocol is not zero-knowledge – because the scheme used is symmetric, both the server and 2FA app have access to the device key as a shared secret. Crucially, this means that despite being described as providing a ‘signature’, the protocol lacks non-repudiation – while the server can be confident that the 2FA user has authorised a particular request, it cannot later prove this to any third party, or indeed to the user, since it may have forged the authorisation code using its own knowledge. This seems undesirable in high-stakes transactional contexts such as banking, where a bank (having presumably spent a lot of money on a proprietary 2FA system) would like to be able to rely on its logs to fend off any accusation from a disgruntled customer that it has acted without authorisation.
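The point about non-repudiation can be made concrete with a short sketch. HMAC-SHA256 from the Python standard library is swapped in for the app's AES step here, and the field order, encodings, key and request text are all assumptions for illustration. The only property that matters is that the code depends on a key held by both parties.

```python
import hashlib
import hmac


def authorisation_code(device_key: bytes, counter: bytes, html: bytes, outcome: bytes) -> str:
    """Hypothetical stand-in for the app's code generation.

    HMAC-SHA256 replaces the AES step, and the field order and encoding
    are assumptions; only the reliance on a shared key matters here.
    """
    digest = hashlib.sha256(counter + html + outcome).digest()
    mac = hmac.new(device_key, digest, hashlib.sha256).digest()
    return ''.join(str(b % 10) for b in mac[:8])


def server_verify(device_key: bytes, counter: bytes, html: bytes, outcome: bytes, code: str) -> bool:
    """The server verifies by recomputing the code, so it can also create it."""
    expected = authorisation_code(device_key, counter, html, outcome)
    return hmac.compare_digest(expected, code)


device_key = bytes(range(32))  # made-up key, known to BOTH the app and the server
counter = bytes([0, 0, 0, 0, 0, 0, 0, 2])
html = b'<p>Transfer $100 to Bob</p>'

# The app approves the request and sends its code
code_from_app = authorisation_code(device_key, counter, html, b'approved')
print(server_verify(device_key, counter, html, b'approved', code_from_app))

# The server could equally have produced a valid 'approved' code by itself
code_made_by_server = authorisation_code(device_key, counter, html, b'approved')
print(code_made_by_server == code_from_app)
```

Because verification is simply recomputation with the shared key, a valid code in the server's logs cannot show which party produced it. An asymmetric signature, where only the device holds the private key, would not have this limitation.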

Conclusion

The exploration of this 2FA system has not demonstrated any great fundamental unsoundness; however, it has revealed some interesting design choices and possible learning points.

It is not at all clear that the entire protocol was developed by one person or unified team. Although TLS is used for all client-server communication, additional protection is invariably added, but in different forms in different situations. In one context, RSA-wrapped AES (albeit with a suboptimal choice of padding) is used to protect sensitive private device identifiers, but no similar construction is used to protect other sensitive information. Symmetric AES encryption is usually used, but in different modes of operation. Only in some cases is AES used in an appropriate CBC mode with a random IV; in other cases, the IV is reused or ECB mode is used – although this might not be of practical importance in this particular app, it suggests there was not a unified approach to cryptography with best practices in mind. The use of AES, rather than an actual asymmetric signature scheme capable of providing non-repudiation, is also an unusual way to provide what is described as a ‘signature’.

As indicated, this app used Bouncy Castle as its cryptographic library. Bouncy Castle, and the Java Cryptography Extension API on which it builds, are low-level APIs and well known for making it easy to make suboptimal choices – to use Bouncy Castle for encryption requires the developer to immediately choose a cipher, key size, block cipher mode of operation, padding scheme, key derivation function, optional authentication scheme, and so on. A suboptimal choice in any of these areas – say, an insecure default preserved for API backwards compatibility, or an outdated tutorial – can lead to failures in ways that will not be obvious in ordinary usage. This is the prime motivation behind recent higher-level approaches to cryptography libraries, such as NaCl/libsodium, which abstract away implementation details with the intention of making appropriate choices the path of least resistance for developers.

Footnote

¹ This is a suboptimal choice, because PKCS#1 v1.5 padding in RSA encryption (as opposed to RSA signatures) may render the scheme vulnerable to a chosen-ciphertext attack. I cannot say whether the preconditions are met in this case for the attack to be viable; suffice to say it is a suboptimal choice because it makes an attack at least possible in some cases.