WhatsApp shares your contacts’ status with you.

TL;DR: You can protect yourself from this hack by changing your account privacy settings. By default, WhatsApp shares your status with others.
Since almost nobody changes the default settings, this hack works almost all the time.

DISCLAIMER: This is a proof of concept to raise awareness, and a bit of a technical challenge before anything else. Don't use the source code I provide to track someone, don't be a dick ❤️.


WhatsApp in Android

Exploit the feature

I want to exploit this feature to track users (for science). My first question is: how does this feature work?

To make things easier, I am using https://web.whatsapp.com/ in my laptop’s web browser instead of my Android smartphone. That way, I only have to deal with regular web reverse engineering to get this exploit done. I’ll leave Android app reverse engineering for another time.

I pick up a friend’s phone and look at how his status behaves on my side.

Initially, the status is offline, and in this case, WhatsApp gives you an absolute date, last seen 16/03/2020 at 15:40.

I unlock the friend’s phone and open an app other than WhatsApp. I use it for a minute: nothing changes on my side.
OK, now I switch to WhatsApp. The status changes to online about 10 seconds later. On the friend’s phone, I deliberately did not open the conversation I share with this contact, to verify that the status is transmitted even without that condition.

The online status remains until you leave WhatsApp or shut off the screen on the targeted phone.
And then, it reverts to a new last seen date once offline again.

So to summarize the thing:

  • We can’t track someone’s overall phone usage (thankfully!)
  • But we can track the WhatsApp usage of anyone we have in our contacts
  • The leaked information is the last seen date and the live online status, per contact
  • We can expect the last seen date to be accurate to within a minute
  • And the online status shows up once WhatsApp has been in the foreground for about 5-10 seconds

Technical analysis

I open the Firefox debugger (proudly using Firefox again!) to see how the WhatsApp Web front end fetches the coveted data.

The front end uses a web socket connection to gather the data in real time, roughly every 10-15 seconds.

If we look carefully, the front end seems to poke the server every ~15 seconds with ?,, and most of the time a reply of !{timestamp} follows. It’s a kind of keep-alive. Not interesting for us.

The server pushes another kind of message to the front end when the status of the contact changes.


The id value, which I partially blacked out, is the phone number; type is the available/unavailable flag; t is the timestamp of the last seen date. The whole payload is encapsulated in a Presence object, which is easy to recognize.
The timestamp matches what we read in the UI.
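
To make the payload more concrete, here is a minimal sketch that turns an already decoded presence payload into something readable. The field names id, type and t come from the message described above. Everything else is an assumption for illustration: the sample values, the exact "available"/"unavailable" strings, t being a Unix timestamp in seconds, and the describePresence helper itself.

```javascript
// Hypothetical helper: summarizes a decoded presence payload.
// Assumptions: `t` is a Unix timestamp in seconds, `type` is the string
// "available" or "unavailable", and the sample values below are made up.
function describePresence(presence) {
  const online = presence.type === "available";
  // `t` is only used as a last seen date when the contact is offline.
  const lastSeen =
    !online && Number.isFinite(presence.t) ? new Date(presence.t * 1000) : null;
  return { id: presence.id, online, lastSeen };
}

const sample = { id: "0000000000", type: "unavailable", t: 1584373200 };
const summary = describePresence(sample);
console.log(summary.online, summary.lastSeen.toISOString());
```

Multiplying by 1000 is needed because JavaScript dates count milliseconds, which is the same conversion the epoch converter does for us.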


using https://www.epochconverter.com/

Limitations

To receive the presence events from the server via the web socket connection, we (the front end) subscribe to a specific phone number (id). The subscription is triggered when we select another conversation/contact in the web interface.

So, with this design, we only receive the active contact’s presence events. In other words, we can only track one contact at a time over the web socket connection. Too bad for us!

WhatsApp also prevents us from opening several concurrent instances (with the same cookies). So we can’t open two web socket channels at the same time. It would have been too easy!

And finally, this one-WhatsApp-web-session-at-a-time behavior still applies when trying two independent sessions (not the same cookies). A new session will cause the older one to close, including at the web socket layer.

Another expected limitation: the validity of our session is limited in time. Mine will expire on 22/10/2020, more than 6 months from now. It’s odd to be able to retrieve this info on the front-end side like this, so I might be misinterpreting it.

Naïve implementation

Now that we’ve defined what the status feature of WhatsApp is and how it could be misused to track users, it is time to code something. We also looked at the technical implementation in search of an easy security flaw.

We could re-implement the web socket exchange to retrieve the status data, but that would be complex. Too complex if we can only track one contact at a time anyway. I will start with high-level tooling, accept the known limitations, and see where it goes.

My idea is to see how far we can go with cheap hacking work before doing advanced things.

I’ll decompose the proof of concept into three steps:

  • Gather the data
  • Store the data (easy)
  • Visualize the data (easy, but challenging for me)

I’ll scrape the data using Node.js and Puppeteer. Puppeteer allows us to control a browser and interact with it the same way a user would with the mouse and keyboard. It avoids doing complex reverse engineering at the web socket level, and that’s why I picked it. I’m more used to Selenium + C#. This is my first Puppeteer experiment, so be kind to me.

We got the core stuff in 38 lines of code.

Next, we need to parse strings such as last seen today at 13:15 into a proper date. To do that, I’m using the so-wonderful chrono-node npm package.
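
To give an idea of what this parsing step has to do, here is a dependency-free sketch. It is not chrono-node and does not use its API: parseLastSeen is a hypothetical stand-in that only understands the two formats quoted in this post, plus a "yesterday" variant that I am assuming exists. Absolute dates are assumed to be day/month/year.

```javascript
// Hypothetical stand-in for the parsing step: handles only the status
// formats quoted in this post, relative to a reference date `now`.
// Returns a Date in local time, or null when the text is not recognized.
function parseLastSeen(text, now = new Date()) {
  const pattern =
    /last seen (today|yesterday|(\d{1,2})\/(\d{1,2})\/(\d{4})) at (\d{1,2}):(\d{2})/i;
  const m = pattern.exec(text);
  if (!m) return null;
  const [, dayWord, dd, mm, yyyy, hh, min] = m;
  let year = now.getFullYear();
  let month = now.getMonth();
  let day = now.getDate();
  if (dd !== undefined) {
    // Absolute date, assumed to be day/month/year as in "16/03/2020".
    year = Number(yyyy);
    month = Number(mm) - 1;
    day = Number(dd);
  } else if (dayWord.toLowerCase() === "yesterday") {
    day -= 1; // the Date constructor rolls day 0 back to the previous month
  }
  return new Date(year, month, day, Number(hh), Number(min));
}

const referenceDate = new Date(2020, 2, 16, 18, 0);
console.log(parseLastSeen("last seen today at 13:15", referenceDate).toString());
console.log(parseLastSeen("online", referenceDate)); // not a last seen string
```

A real natural-language date parser copes with far more variants (locales, weekday names, and so on), which is why a library is the better choice in practice.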

Finally, I implement a loop in the code to scan the status constantly and store it into InfluxDB 2.0.
InfluxDB is a time-series database. That’s perfect for our use case.

From the last seen date, I will derive an offline since value, stored as an unsigned integer. It is the number of seconds elapsed since the last seen date.
offline since is 0 when the status is online.
Deriving the data this way turns our event-based data into time-series data. This design fits InfluxDB better, and especially Grafana, which will display our data. And it’s stateless; I like that.
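
The derivation itself is a one-liner, sketched below under a few assumptions: the offlineSince helper name, its signature, and the null return for a missing status are mine, not taken from the original source code.

```javascript
// Hypothetical helper: derives the "offline since" counter in whole seconds.
// `online` is a boolean, `lastSeen` is a Date or null, `now` is the scan time.
function offlineSince(online, lastSeen, now = new Date()) {
  if (online) return 0;
  if (!lastSeen) return null; // no status shown, nothing to derive
  const seconds = Math.floor((now.getTime() - lastSeen.getTime()) / 1000);
  return Math.max(0, seconds); // never negative, even with slight clock skew
}

const lastSeenAt = new Date(Date.UTC(2020, 2, 16, 15, 40, 0));
const scanTime = new Date(Date.UTC(2020, 2, 16, 17, 57, 55));
console.log(offlineSince(false, lastSeenAt, scanTime)); // 8275
console.log(offlineSince(true, lastSeenAt, scanTime)); // 0
```

Because each scan recomputes the value from the current time, no state has to be kept between two scans.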

To store the data into InfluxDB 2.0, I’m using the Node.js client with the line protocol format of InfluxDB.

measurementName,tagKey=tagValue fieldKey="fieldValue" 1465839830100400200
--------------- --------------- --------------------- -------------------
       |               |                  |                    |
  Measurement       Tag set           Field set            Timestamp

The data stored looks like this:

status,contactName=Toto offlineSince=8275u 1465839830100400200
status,contactName=Toto offlineSince=8280u 1465839830100400200
status,contactName=Toto offlineSince=0u 1465839830100400200
status,contactName=Tata offlineSince=0u 1465839830100400200
------ ---------------  ----------------- -------------------
  |            |                |                  |
Measurement Tag set         Field set          Timestamp
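
As an illustration of the format above, here is a small sketch that builds one such line by hand. The escapeTag and offlineSinceLine helpers are hypothetical names of mine; the sketch only produces the string and makes no assumption about how the Node.js client sends it.

```javascript
// Hypothetical helper: in line protocol, commas, equals signs and spaces
// inside a tag value must be escaped with a backslash.
function escapeTag(value) {
  return String(value).replace(/([,= ])/g, "\\$1");
}

// Hypothetical helper: formats one offlineSince point for a contact.
function offlineSinceLine(contactName, seconds, date) {
  // Line protocol timestamps default to nanoseconds. BigInt avoids the
  // precision loss a regular Number would suffer at this magnitude.
  const ns = BigInt(date.getTime()) * 1000000n;
  // The trailing "u" marks the field value as an unsigned integer.
  return `status,contactName=${escapeTag(contactName)} offlineSince=${seconds}u ${ns}`;
}

console.log(offlineSinceLine("Toto", 8275, new Date(1465839830100)));
console.log(offlineSinceLine("Jean Paul", 0, new Date(1465839830100)));
```

Escaping matters here because contact names come straight from the address book and often contain spaces.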

The code implementation:

There is an edge case I want to handle: sometimes, the status does not display at all in WhatsApp.
In this case, we won’t write an offlineSince value to the database because we don’t have one. Instead, we will log a statusAvailable value (either 0 or 1) each time we scan the status.
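
A minimal sketch of that rule, assuming a hypothetical fieldsForScan helper and a scan object whose offlineSince property is null when no status was displayed:

```javascript
// Hypothetical sketch: decides which fields to record for one scan.
// `scan.offlineSince` is a number of seconds, or null/undefined when
// WhatsApp showed no status at all for the contact.
function fieldsForScan(scan) {
  const available = scan.offlineSince !== null && scan.offlineSince !== undefined;
  // statusAvailable is always recorded, so gaps stay visible in the dashboard.
  const fields = { statusAvailable: available ? 1 : 0 };
  // offlineSince is only recorded when there is an actual value (0 included).
  if (available) fields.offlineSince = scan.offlineSince;
  return fields;
}

console.log(fieldsForScan({ offlineSince: 8275 }));
console.log(fieldsForScan({ offlineSince: null }));
```

Note that 0 is a valid value (the contact is online), so the check compares against null and undefined rather than relying on truthiness.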

We now connect Grafana to InfluxDB and create a dashboard to monitor our acquisition. And voilà, let it run.

You can find the source code of this proof of concept here.
We will try to improve this hack later, for another blog entry someday!

Update

You can find the next episode here, in which we scale this work to track 5000 smartphones.