Contact tracing is an essential tool for health authorities trying to contain the COVID-19 pandemic. Many nations are looking to mobile apps to support and automate contact tracing, to help us unlock our lockdowns and to get our societies and economies moving again. Yet, we live in an age where data privacy, cyber security and fear of government surveillance are centre stage. Can we trust these apps to do the right thing and truly respect our privacy?
In this post I review the open-source code behind one such app to find out:
- what data you're sharing,
- where it goes,
- who could use it, and
- how it's protected.
I’ll conclude by discussing some potential improvements to the OpenTrace solution, and also look at some alternative contact tracing technologies on the horizon.
This is my personal review of an open-source code base. It is not an official privacy review or security test by Deloitte of any country's contact tracing application.
https://bluetrace.io/ is the homepage for the open-source version of the TraceTogether mobile app originally developed by Singapore's Ministry of Health. BlueTrace standardises the core Bluetooth proximity exchange protocol, potentially letting apps from anywhere in the world work together. As global travel picks back up, this interoperability will be vital.
The BlueTrace team have published an excellent whitepaper describing their solution and implementation choices. At a very high level the solution works like this:
- Users download the app and sign up using their mobile number.
- Short-lived temporary IDs are generated and sent to each app.
- Apps exchange temporary IDs when they detect each other via Bluetooth.
- Infected patients upload their contact history.
- Health authorities decrypt the temporary IDs to identify potential close contacts.
My review focuses on two repositories:
- The Android app https://github.com/opentrace-community/opentrace-android
- The server-side functions (written for Google Firebase) https://github.com/opentrace-community/opentrace-cloud-functions
(Note: BlueTrace is the Bluetooth protocol. OpenTrace is the reference implementation of BlueTrace)
The permissions a mobile app needs are the first line of defence in terms of identifying possible threats to your privacy. Interestingly, if we look at the AndroidManifest.xml we see a concerning permission - ACCESS_FINE_LOCATION.
...wait a minute, the whole selling point of this app is that it doesn't track your location, but the first permission it asks for is to track your location. This can't be right?!
If we dig a bit deeper in the Android developer docs, we see that ACCESS_FINE_LOCATION is needed before an app can scan for nearby Bluetooth devices. Technically speaking, Bluetooth can be used to determine your location through beacons etc. but only if someone has prior knowledge of where those other Bluetooth devices are. With phone-to-phone interactions in this use case, health authorities don't have that extra knowledge. A search through the code shows it doesn't actually invoke the Android Location APIs, so it's not checking your absolute location via GPS or WiFi scanning.
Significantly, the app doesn't need the ACCESS_BACKGROUND_LOCATION permission that would allow it to silently track your movements without you being aware - this is the government surveillance scenario people are rightly scared of.
The BlueTrace whitepaper describes the choice of Bluetooth over GPS tracking. Not only is Bluetooth better for privacy, it’s also technically superior for close-range proximity detection in urban areas as well as causing less battery drain.
In OnboardingActivity.kt we see the app is using Firebase phone authentication to capture a mobile phone number and bind it to the device. Note that you're not providing any personal information other than a phone number - the absolute bare minimum needed to contact you if you're potentially exposed to a confirmed case of COVID-19. If you're still concerned, you could use a dedicated device and/or SIM card for your contact tracing app.
The Firebase cloud service becomes the master mapping of phone number -> unique user IDs. These user IDs are used to authenticate all app calls to the server side (Google Cloud Functions plus Cloud Storage buckets). Tracking user activity through this stable, unique user ID might pose a privacy issue if someone had:
- access to the Firebase user registry, to find your mobile number, and
- some external dataset to link your mobile number with your overall identity.
Government agencies do have access to such data, so we must rely on legal protections to ensure invasive data matching doesn’t occur. The Australian government has set a strong precedent on this topic, refusing requests to add law-enforcement capabilities into the contact tracing app currently under development.
BluetoothMonitoringService.kt shows the data actually being stored about each encounter with another device:
- BlueTrace protocol version.
- Temporary ID broadcast by the tracing app on the other device.
- BlueTrace organisation ID - a unique code assigned to your health authority to support cross-border tracing.
- Your mobile device model.
- The other mobile device model.
- RSSI (Received Signal Strength Indication) - how strong the Bluetooth signal was from the other device.
- Bluetooth transmission power level from your device.
- Date and time.
That's it. No personal info. No absolute location data. Once again, the app is collecting only the bare minimum of data required to notify the otherwise-anonymous user of a mobile device that they might have been exposed to the virus.
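To make the shape of that record concrete, here is a minimal sketch of it as a data structure. The field names and example values are my own illustrations and are not taken from the OpenTrace code; only the list of fields follows the description above.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class EncounterRecord:
    """One Bluetooth encounter. Field names are hypothetical, not OpenTrace's."""
    protocol_version: int  # BlueTrace protocol version
    temp_id: str           # opaque temporary ID broadcast by the other device
    org: str               # organisation ID of the issuing health authority
    own_model: str         # your mobile device model
    peer_model: str        # the other mobile device model
    rssi: int              # received signal strength, in dBm
    tx_power: int          # Bluetooth transmission power level
    timestamp: int         # date and time, as seconds since the UNIX epoch


# Made-up example values, purely for illustration.
record = EncounterRecord(
    protocol_version=2,
    temp_id="b3BhcXVlLXRlbXAtaWQ=",
    org="EXAMPLE_HEALTH_AUTHORITY",
    own_model="ExamplePhone A",
    peer_model="ExamplePhone B",
    rssi=-62,
    tx_power=-8,
    timestamp=1588000000,
)

print(sorted(asdict(record)))
```

Nothing in the record names a person or a place; the only link back to an individual is the temporary ID, which the device itself cannot read.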
The signal strength values feed into BlueTrace’s proximity estimation algorithm, which applies the Inverse Square Law to determine how close two devices were based on how strong the detected signal was. The OpenTrace team calibrated the algorithm by measuring the signal strength from various mobile devices. Each device model reports a different signal strength at 2m distance, so the model information is vital for accuracy.
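As a rough illustration of that idea, the sketch below uses the standard log-distance path-loss model with an exponent of 2, which corresponds to the Inverse Square Law in free space. I haven't reproduced BlueTrace's actual algorithm here; the calibration values, model names and function name are all hypothetical, and the table is keyed on a single device model for simplicity.

```python
REFERENCE_DISTANCE_M = 2.0  # calibration distance mentioned above

# Hypothetical RSSI readings (dBm) measured at 2 m for two made-up models.
CALIBRATION_RSSI_AT_2M = {
    "ExamplePhone A": -60.0,
    "ExamplePhone B": -66.0,
}


def estimate_distance_m(rssi_dbm, model, path_loss_exponent=2.0):
    """Estimate distance from RSSI using the log-distance path-loss model.

    An exponent of 2 means received power falls off with the square of
    distance, so every 20 dB of extra loss multiplies the distance by 10.
    """
    reference_rssi = CALIBRATION_RSSI_AT_2M[model]
    loss_db = reference_rssi - rssi_dbm
    return REFERENCE_DISTANCE_M * 10 ** (loss_db / (10 * path_loss_exponent))


# The same reading implies different distances for different models.
for model in CALIBRATION_RSSI_AT_2M:
    print(model, round(estimate_distance_m(-66.0, model), 2), "m")
```

With these made-up numbers, a reading of -66 dBm works out to about 4 m for one model and exactly 2 m for the other, which is why the device model has to travel with every record. Real-world signals are also affected by walls, pockets and bodies, so any such estimate is approximate.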
Temporary ID Generation
Temporary IDs are central to BlueTrace's privacy protections. Rather than sharing a long-lived unique ID with every nearby app, BlueTrace swaps temporary IDs that are changed frequently. That way there's no easy way to 'join the dots' between devices and track a single person's interactions.
BlueTrace temporary IDs are much more than just a random number - they include the user's Firebase ID and a validity period, all encrypted. Encryption protects the data and makes sure it can't be tampered with or falsified.
Temporary IDs are generated on the server by the getTempIDs function. By default, each ID has a lifespan of only 15 minutes. They are generated on request from each device in batches of 100 - enough for 24 hours. The batch generation is helpful for users with poor mobile internet connections, but it has the nice side-effect of obscuring when BlueTrace interactions are occurring. If temporary IDs were requested individually when needed, someone could infer two devices were near each other based on them both requesting new IDs at the same time.
By encrypting on the server-side, OpenTrace can better protect the encryption key because it doesn't need to distribute the key to the apps. The apps just treat each temporary ID as opaque bytes.
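The sketch below shows how a batch of temporary IDs could be laid out before encryption. It is an assumption-laden illustration, not the getTempIDs code: the byte layout, function name and example user ID are hypothetical, and the encryption step is deliberately left out, so these payloads are the plaintext that the server would encrypt before sending anything to a device.

```python
import struct

TEMP_ID_LIFETIME_S = 15 * 60  # default 15-minute lifespan
BATCH_SIZE = 100              # temporary IDs issued per request


def build_batch_payloads(user_id, batch_start):
    """Return (plaintext_payload, start, expiry) for each ID in a batch.

    Hypothetical layout: the user ID bytes followed by the start and expiry
    times as big-endian 32-bit UNIX timestamps. The real server encrypts
    each payload; that step is omitted in this sketch.
    """
    batch = []
    for i in range(BATCH_SIZE):
        start = batch_start + i * TEMP_ID_LIFETIME_S
        expiry = start + TEMP_ID_LIFETIME_S
        payload = user_id + struct.pack(">II", start, expiry)
        batch.append((payload, start, expiry))
    return batch


batch = build_batch_payloads(b"example-firebase-uid", batch_start=1588000000)
coverage_hours = (batch[-1][2] - batch[0][1]) / 3600
print(len(batch), "IDs covering", coverage_hours, "hours")
```

One batch of 100 back-to-back 15-minute windows spans 25 hours, so a device that fetches once a day always has a small overlap to fall back on.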
If you test positive for COVID-19, health authorities will ask you to upload your contact history from your app to assist with contact tracing.
In OpenTrace this critical logic is kept (counterintuitively) in a UI handler action in EnterPinFragment.kt. The workflow is:
- The app calls the getHandshakePin cloud function to request a PIN to authenticate the upload.
- The app invokes the getUploadToken cloud function to generate a single-use upload token.
- writeToInternalStorageAndUpload prepares a JSON file containing the upload token and contact history stored locally on the device.
- uploadToCloudStorage uploads the JSON to a Google Cloud Storage bucket.
The uploaded file isn’t encrypted by the app or server code – protecting the uploaded files relies on proper configuration of the Google Cloud Storage bucket encryption.
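To show why that bucket configuration matters, here is a minimal sketch of what such an upload file might look like. The JSON field names and values are hypothetical and are not copied from writeToInternalStorageAndUpload; the point is only that the file is ordinary, readable JSON.

```python
import json


def build_upload_document(upload_token, records):
    """Serialise the upload token and local contact history as plain JSON."""
    return json.dumps({"token": upload_token, "records": records}, indent=2)


# Made-up contact history entries, purely for illustration.
records = [
    {"temp_id": "opaque-temp-id-1", "rssi": -62, "timestamp": 1588000000},
    {"temp_id": "opaque-temp-id-2", "rssi": -71, "timestamp": 1588000900},
]

document = build_upload_document("example-single-use-token", records)
print(document)
```

Anyone who can read the bucket can read a file like this. The temporary IDs inside remain encrypted, but the timing and signal-strength data do not.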
Once your tracing information is uploaded, health authorities decrypt the contact records to reveal the Firebase user IDs of other users you have been near. They can then get in contact with those users (via their supplied mobile number) to recommend the next steps, which may include testing or quarantine.
There is no automatic notification. Contact tracing experts combine the app’s trace history with interviews and health policies to narrow down the list of high-risk contacts. The TraceTogether team made an explicit decision not to automate notifications. This is understandable for a small country with limited availability of testing kits early in the pandemic, but might need to be revisited in larger populations with greater testing capacity.
Room for Improvement
The BlueTrace protocol is fundamentally sound and goes to great lengths to protect your privacy. The OpenTrace implementation is more than adequate (despite a deplorable lack of code comments), especially given its extremely rapid development and rollout as the first wave of the pandemic hit Singapore (hence the deplorable lack of code comments…).
However, there are a few areas that I feel could be improved as the code is evolved and used in other larger countries with more complex health environments. Several similar improvements were also suggested by a research team from Melbourne and Macquarie Universities.
Google is the Single Point of Failure
The proper configuration, security hardening, and monitoring of the Google Cloud services behind OpenTrace are critical to protecting your privacy. Between the various services involved, Google:
- stores your data,
- manages the encryption keys,
- links the mobile number and unique ID of every user,
- executes the serverless functions,
- provides the cryptographic libraries as part of the Cloud Functions runtimes, and
- controls the Android app distribution channels.
With (potentially) complete access to the app and our data, we must simply trust these services and the teams behind them. This trust is well-founded given Google Cloud’s impressive array of compliance certifications (including the Australian Privacy Principles and IRAP (Information Security Registered Assessors Program) security framework). Still, having another party involved would be beneficial at least from a political perspective - perhaps as custodians of the data encryption keys.
Global, symmetric encryption
OpenTrace uses symmetric encryption, with a single encryption key for all its cryptography. This is simple and performant, but leaves all data exposed if that key were compromised. A move to asymmetric cryptography would benefit data security and help support nations with more complex health jurisdictions.
In asymmetric (or public key) cryptography, the ‘public’ key used to encrypt data is different to the ‘private’ key needed to decrypt it. This split matches the contact tracing app use case well.
- The mobile apps and back-end services only need the public key. Crucially, this means the app administrators can’t decrypt the stored data.
- Health authorities only need the private key. Only they need the ability to decrypt data to support their contact tracing efforts.
A move to asymmetric encryption also helps a single app support multiple health authorities. In many countries, such as Australia, the Federal Government is best placed to roll out the mobile app nationwide, while the individual states and territories run the health services. The single national app could support public keys from every state, while ensuring only the relevant state health authorities could decrypt the data.
Coming Soon - Tracing 2.0
BlueTrace is not the only game in town when it comes to contact tracing tech. Apple and Google have announced a suite of jointly-developed tools to include Bluetooth-based contact tracing directly in their mobile operating systems. These enhancements will allow the full power of Bluetooth-based contact exchange without the compromises the TraceTogether team were forced to make (such as requiring the iOS app to remain in the foreground continuously).
The Apple/Google Contact Tracing Cryptography Specification is more sophisticated than BlueTrace. The addition of a Daily Tracing Key provides a valuable limitation of the ‘blast radius’ should a user’s app or keys be compromised.
The keys are generated locally on each device, trading better decentralisation and privacy for increased complexity and battery consumption.
The handling of contact correlation is also quite different. The Rolling Proximity Identifier, equivalent to the BlueTrace Temporary ID, is derived from the Daily Tracing Key and a time interval number within the day; the Daily Tracing Key is in turn derived from a device-held Tracing Key and the ‘Day Number’ (days since UNIX epoch). When an infected patient uploads their tracing history, they really upload their own Daily Tracing Keys, which are then made public as Diagnosis Keys. Other apps regularly fetch the list of new Diagnosis Keys, re-calculate the Rolling Proximity IDs and check if they came into contact with any of those IDs.
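Here is a sketch of that derive-and-match flow, based on my reading of the April 2020 draft of the specification. I'm assuming the daily key comes from a device-held Tracing Key via HKDF, that each identifier comes from an HMAC over a 10-minute interval number (144 per day), and the specific labels and integer encodings shown; all of these should be checked against the spec itself. The function names and example keys are my own.

```python
import hashlib
import hmac
import struct

INTERVALS_PER_DAY = 144  # assumed: one identifier per 10-minute interval


def hkdf_sha256(ikm, info, length):
    """HKDF (RFC 5869) with SHA-256 and no salt (treated as 32 zero bytes)."""
    prk = hmac.new(bytes(32), ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def daily_tracing_key(tracing_key, day_number):
    """Derive a 16-byte daily key from the device's long-lived Tracing Key."""
    info = b"CT-DTK" + struct.pack("<I", day_number)  # assumed encoding
    return hkdf_sha256(tracing_key, info, 16)


def rolling_proximity_id(daily_key, interval):
    """Derive the 16-byte identifier broadcast during one interval."""
    message = b"CT-RPI" + bytes([interval])  # assumed encoding
    return hmac.new(daily_key, message, hashlib.sha256).digest()[:16]


def was_exposed(observed_ids, diagnosis_key):
    """Re-derive every identifier for a published key and look for a match."""
    return any(
        rolling_proximity_id(diagnosis_key, j) in observed_ids
        for j in range(INTERVALS_PER_DAY)
    )


# A patient's device broadcasts an identifier; a nearby phone records it.
patient_daily_key = daily_tracing_key(bytes(range(32)), day_number=18375)
observed_ids = {rolling_proximity_id(patient_daily_key, interval=57)}

# Later the patient's daily key is published and the phone checks locally.
print(was_exposed(observed_ids, patient_daily_key))
```

The matching happens entirely on the phone that did the observing, so no central party learns who met whom. That is the key difference from the server-side decryption used by BlueTrace.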
As a 2nd generation tracing technology (yep, we’re on 2nd gen already!), the Apple/Google approach has learned from its predecessors and has significantly better scaling, privacy and cryptographic qualities. It also poses new risks. The framework is built into the OS and its permissions model, making it easier for it to fall out of mind and become just another function of your phone. Once the pandemic has passed there might be a temptation for developers to misuse such a feature for commercial purposes e.g. real-time, proximity-based promotion - ‘viral’ marketing indeed!
Contact tracing apps will provide a crucial tool in our fight against the COVID-19 pandemic. It is inspiring to see how technologists all over the world are responding to the challenge - producing solutions that can help give us back our lives, our societies and our economies, while still showing such obvious care for our privacy.
I hope this post has given you greater insight into the technology behind contact tracing. I hope that you, like myself, can have confidence to embrace these applications safely in the knowledge that the lives saved will be well worth the small piece of privacy we give away.