Making messaging interoperability with third parties safe for users in Europe

To comply with a new EU law, the Digital Markets Act (DMA), which comes into force on March 7th, we’ve made major changes to WhatsApp and Messenger to enable interoperability with third-party messaging services. 
We’re sharing how we enabled third-party interoperability (interop) while maintaining end-to-end encryption (E2EE) and other privacy guarantees in our services as far as possible.

On March 7th, a new EU law, the Digital Markets Act (DMA), comes into force. One of its requirements is that designated messaging services must let third-party messaging services become interoperable, provided the third party meets a series of eligibility requirements, including technical and security requirements.
This allows users of third-party providers who choose to enable interoperability (interop) to send and receive messages with opted-in users of either Messenger or WhatsApp – both designated by the European Commission (EC) as being required to independently provide interoperability to third-party messaging services.  
For nearly two years our team has been working with the EC to implement interop in a way that meets the requirements of the law and maximizes the security, privacy and safety of users. Interoperability is a technical challenge – even when focused on the basic functionalities as required by the DMA. In year one, the requirement is for 1:1 text messaging between individual users and the sharing of images, voice messages, videos, and other attached files between individual end users. In the future, requirements expand to group functionality and calling. 
To interoperate, third-party providers will sign an agreement with Messenger and/or WhatsApp and we’ll work together to enable interoperability. Today we’ll publish the WhatsApp Reference Offer for third-party providers which will outline what will be required to interoperate with the service. The Reference Offer for Messenger will follow in due course. 
While Meta must be ready to enable interoperability with other services within three months of receiving a request, it may take longer before the functionality is ready for public use. We wanted to take this opportunity to set out the technical infrastructure and thinking that sits behind our interop solution.
A privacy-centric approach to building interoperable messaging services
Our approach to compliance with the DMA is centered around preserving privacy and security for users as far as is possible. The DMA quite rightly makes it a legal requirement that we should not weaken security provided to Meta’s own users. 
The approach we have taken in terms of implementing interoperability is the best way of meeting DMA requirements, whilst also creating a viable approach for the third-party providers interested in becoming interoperable with Meta and maximizing user security and privacy.
Implementing an end-to-end encrypted protocol
First, we need to protect the underlying security that keeps communication on Meta E2EE messaging apps secure: the encryption protocol. WhatsApp and Messenger both use the tried and tested Signal Protocol as a foundational piece for their encryption. 
Messenger is still rolling out E2EE by default for personal communication, but on WhatsApp, this default has been the case since 2016. In both cases, we are using the Signal Protocol as the foundation for these E2EE communications, as it represents the current gold standard for E2EE chats.
In order to maximize user security, we would prefer third-party providers to use the Signal Protocol. However, since this has to work for everyone, we will allow third-party providers to use a compatible protocol if they are able to demonstrate it offers the same security guarantees as Signal.
To send messages, the third-party providers have to construct message protobuf structures which are then encrypted using the Signal Protocol and then packaged into message stanzas in eXtensible Markup Language (XML). 
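As a rough illustration of the layering described above, the sketch below takes serialized message bytes, encrypts them, and wraps the ciphertext in an XML stanza. Everything in it is an assumption made for illustration: the element and attribute names (`message`, `enc`, `to`, `type`), the `build_stanza` helper, and the `toy_encrypt` placeholder are hypothetical and are not taken from the WhatsApp protocol or the Reference Offer. A real client would serialize a protobuf message and encrypt it with a Signal Protocol session rather than the stand-ins used here.

```python
import base64
import xml.etree.ElementTree as ET
from typing import Callable


def toy_encrypt(plaintext: bytes) -> bytes:
    """Placeholder only: this is NOT encryption. It stands in for a Signal Protocol session."""
    return plaintext[::-1]


def build_stanza(recipient: str, serialized_message: bytes,
                 encrypt: Callable[[bytes], bytes]) -> str:
    """Encrypt serialized message bytes and wrap them in an XML stanza."""
    ciphertext = encrypt(serialized_message)
    # Hypothetical stanza shape; the real element names may differ.
    message = ET.Element("message", {"to": recipient})
    enc = ET.SubElement(message, "enc", {"type": "msg"})
    # XML is text, so the binary ciphertext is base64-encoded in this sketch.
    enc.text = base64.b64encode(ciphertext).decode("ascii")
    return ET.tostring(message, encoding="unicode")


# b"hello" stands in for protobuf-serialized message bytes.
stanza = build_stanza("user-123", b"hello", toy_encrypt)
```

Passing the encryption step in as a callable keeps the three layers (serialize, encrypt, package) visibly separate, which mirrors the order described in the text.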
Meta servers push messages to connected clients over a persistent connection. Third-party servers are responsible for hosting any media files their client applications send to Meta clients (such as image or video files). After receiving a media message, Meta clients will subsequently download the encrypted media from the third-party messaging servers using a Meta proxy service.
It’s important to note that the E2EE promise Meta provides to users of our messaging services requires us to control both the sending and receiving clients. This allows us to ensure that only the sender and the intended recipient(s) can see what has been sent, and that no one can listen to your conversation without both parties knowing. 
While we have built a secure solution for interop that uses the Signal Protocol encryption to protect messages in transit, without ownership of both clients (endpoints) we cannot guarantee what a third-party provider does with sent or received messages, and we therefore cannot make the same promise.
Our technical solution builds on Meta’s existing client / server architecture 
We think the best way to deliver interoperability is through a solution which builds on Meta’s existing client / server architecture [Figure 1]. In particular, the requirement that clients connect to Meta infrastructure has the following benefits. It:

Enables Meta to maximize the level of security and safety for all users by carrying out many of the same integrity checks as it does for existing Meta users
Constitutes a “plug-and-play” model for third-party providers, lowering the barriers for potential new entrants and costs for third-party providers
Helps maximize protection of user privacy by limiting the exposure of their personal data to Meta servers only
Improves overall reliability of the interoperable service as it benefits from Meta’s infrastructure, which is already globally scaled to handle over 100 billion messages each day

Figure 1: A simplified illustration of WhatsApp’s technical architecture.
Taking the example of WhatsApp, third-party clients will connect to WhatsApp servers using our protocol (based on the Extensible Messaging and Presence Protocol – XMPP). The WhatsApp server will interface with a third-party server over HTTP in order to facilitate a variety of things including authenticating third-party users and push notifications.
WhatsApp exposes an Enlistment API that third-party clients must execute when opting in to the WhatsApp network. When a third-party user registers on WhatsApp or Messenger, they keep their existing user-visible identifier, and are also assigned a unique, WhatsApp-internal identifier that is used at the infrastructure level (for protocols, data storage, etc.).
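To make the two kinds of identifier concrete, here is a minimal sketch of a registry that keeps a user's existing user-visible identifier and assigns a separate internal one on enlistment. The `EnlistmentRegistry` class, its `enlist` method, the use of a random UUID, and the choice to return the same internal identifier on repeat enlistment are all assumptions for illustration; the actual Enlistment API is not described in this post.

```python
import uuid
from typing import Dict, Tuple


class EnlistmentRegistry:
    """Maps (provider, user-visible identifier) pairs to internal identifiers."""

    def __init__(self) -> None:
        self._internal_ids: Dict[Tuple[str, str], str] = {}

    def enlist(self, provider: str, visible_id: str) -> str:
        """Return the internal identifier, creating one on first enlistment."""
        key = (provider, visible_id)
        if key not in self._internal_ids:
            # Assumption: a random UUID stands in for however internal IDs are allocated.
            self._internal_ids[key] = uuid.uuid4().hex
        return self._internal_ids[key]


registry = EnlistmentRegistry()
internal_id = registry.enlist("example-provider", "alice@example.org")
```

The user-visible identifier is only ever a lookup key here; anything at the infrastructure level would refer to `internal_id` instead.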
WhatsApp requires third-party clients to provide “proof” of their ownership of the third-party user-visible identifier when connecting or enlisting. The proof is constructed by the third-party service cryptographically signing an authentication token. WhatsApp uses the standard OpenID protocol (with some minor modifications) alongside a JSON Web Token (JWT) to verify the user-visible identifier through public keys periodically fetched from the third-party server.
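The verification flow described above can be sketched roughly as follows: cache the provider's public keys, refetch them after a time-to-live, and check a token's signature against the key named in its header. This is a sketch under stated assumptions, not the actual implementation. The `KeyCache` and `verify_token` names, the TTL-based refresh, and the use of the standard JWT `kid` header to select a key are assumptions, and the signature check itself is passed in as `verify_sig` because it should come from a vetted cryptography library. Checks on claims such as expiry are omitted for brevity.

```python
import base64
import json
import time
from typing import Callable, Dict


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the padding that JWTs strip."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


class KeyCache:
    """Caches a provider's public keys and refetches them after `ttl` seconds."""

    def __init__(self, fetch: Callable[[], Dict[str, bytes]], ttl: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._fetch, self._ttl, self._clock = fetch, ttl, clock
        self._keys: Dict[str, bytes] = {}
        self._expires_at = float("-inf")  # forces a fetch on first use

    def get(self, key_id: str) -> bytes:
        if self._clock() >= self._expires_at:
            self._keys = self._fetch()
            self._expires_at = self._clock() + self._ttl
        return self._keys[key_id]


def verify_token(token: str, keys: KeyCache,
                 verify_sig: Callable[[bytes, bytes, bytes], bool]) -> dict:
    """Return the token's claims if the signature checks out, else raise ValueError."""
    header_b64, payload_b64, sig_b64 = token.split(".")
    header = json.loads(b64url_decode(header_b64))
    public_key = keys.get(header["kid"])  # assumption: keys are selected by `kid`
    # A JWT signature covers the encoded header and payload joined by a dot.
    signed = f"{header_b64}.{payload_b64}".encode("ascii")
    if not verify_sig(public_key, signed, b64url_decode(sig_b64)):
        raise ValueError("invalid token signature")
    return json.loads(b64url_decode(payload_b64))
```

Injecting `fetch`, `clock`, and `verify_sig` keeps the sketch free of network and cryptographic dependencies, so the caching and parsing logic can be exercised on its own.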
WhatsApp uses the Noise Protocol Framework to encrypt all data traveling between the client and the WhatsApp server. As part of the Noise Protocol, the third-party client must perform a “Noise Handshake” every time the client connects to the WhatsApp server. Part of this Handshake is providing a payload to the server which also contains the JWT.
Once the client has successfully connected to the WhatsApp server, the client must use WhatsApp’s chat protocol to communicate with the WhatsApp server. WhatsApp’s chat protocol uses optimized XML stanzas to communicate with our servers. 
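The ordering of these steps (connect, complete the handshake with a token-bearing payload, then exchange chat stanzas) can be pictured as a small state machine. The sketch below tracks only that ordering and performs no networking or cryptography. The `Session` class, its method names, and its states are hypothetical and are not part of WhatsApp's protocol.

```python
from enum import Enum, auto
from typing import List


class State(Enum):
    DISCONNECTED = auto()
    HANDSHAKING = auto()
    READY = auto()


class Session:
    """Tracks the order of connection steps only; no networking or cryptography."""

    def __init__(self) -> None:
        self.state = State.DISCONNECTED
        self.sent: List[str] = []

    def connect(self) -> None:
        # Every new connection starts with a fresh handshake.
        self.state = State.HANDSHAKING

    def complete_handshake(self, jwt: str) -> None:
        if self.state is not State.HANDSHAKING:
            raise RuntimeError("handshake is only valid right after connecting")
        if not jwt:
            raise ValueError("handshake payload must carry a token")
        self.state = State.READY

    def send_stanza(self, stanza: str) -> None:
        if self.state is not State.READY:
            raise RuntimeError("handshake must complete before sending stanzas")
        self.sent.append(stanza)
```

Because `connect` always returns the session to `HANDSHAKING`, a reconnecting client cannot send stanzas until it has repeated the handshake, which matches the "every time the client connects" requirement.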
As we continue to discuss this architecture with third-party providers, we think there is also an approach to implementing interop where we could give third-party providers the option to add a proxy or an “intermediary” between their client and the WhatsApp server. A proxy could potentially give third-party providers more flexibility and control over what their client can receive from the WhatsApp server. It would also remove the requirement that third-party clients implement WhatsApp’s client-to-server protocol, allowing providers to maintain their existing “chat channel” on their clients.
The challenge here is that WhatsApp would no longer have a direct connection to both clients and, as a result, would lose connection-level signals, such as TCP fingerprints, that are important for keeping users safe from spam and scams. We would therefore anticipate implementing additional requirements for third-party providers who take up this option under our Reference Offer. This approach also exposes all the chat metadata to the proxy server, which increases the likelihood that this data could be accidentally or intentionally leaked.
Clearly explaining how interop works to users
We believe it is essential that we give users transparent information about how interop works and how it differs from their chats with other WhatsApp or Messenger users. This will be the first time that users have been part of an interoperable network on our services, so giving them clear and straightforward information about what to expect will be paramount. For example, users need to know that our security and privacy promise, as well as the feature set, won’t exactly match what we offer in WhatsApp chats. 
Privacy and security is a shared responsibility
As is hopefully clear from this post, preserving privacy and security in an interoperable system is a shared responsibility, and not something that Meta is able to do on its own. We will therefore need to continue collaborating with third-party providers in order to provide the safest and best experience for our users.