About a year ago, we began working on a project that required us to develop a social media app in which all user interaction happens exclusively through voice. Users were to be matched by their current mood, which they selected before entering a call.
After consulting with the client, we decided to take a serverless approach to the application’s architecture to reduce development time, increase scalability, and allow for a faster release. With this approach, users interact only with services in the cloud, where cloud functions and third-party services handle all the custom logic needed to fulfill the project’s requirements.
So far, so good.
But here comes our first main challenge:
As previously stated, the application is supposed to create a voice connection between two users who have selected the same mood at the same time. The main problem was that, because of the serverless architecture we had chosen, we could not rely on an always-running backend that simply waits for connections to happen. We therefore had to establish a voice connection between the two users without a dedicated server.
So how did we make it work?
To implement the system, we employed these three services offered by the cloud platform:
- A document-based database – We used our document-based database as a queue where users can submit their intents. By creating a document, a user expresses an intent to connect.
- Cloud functions – We delegated all the app logic to cloud functions, which trigger whenever a user submits a new intent to connect. Each function then tries to match that user with another.
- A cloud messaging service – The messaging service notifies each user of a new match. Because the search process might take a long time, we use this mechanism to tell users when they are ready to connect.
For the audio calls themselves, we relied on third-party services that ensured a high-quality, stutter-free interactive voice chat experience.
A document would be created in our database for each client who wanted to connect. The document represented the client’s intent to connect and contained the connection details, such as the mood they had selected before making the request. The document was then stored in a collection alongside other intents, forming a queue.
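As a rough sketch, such an intent document might look like the following. The field names here are our illustration of the idea, not the app’s actual schema:

```typescript
// Sketch of an "intent to connect" document; field names are hypothetical.
interface ConnectIntent {
  userId: string;    // who wants to connect
  mood: string;      // the mood selected before making the request
  createdAt: number; // epoch millis; lets the queue be processed in order
}

// Builds the document a client would write into the intents collection.
function createIntent(userId: string, mood: string): ConnectIntent {
  return { userId, mood, createdAt: Date.now() };
}
```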
From there, all we had to do was search the queue for documents with a matching intent.
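The matching step can be sketched as a pure function over the queue. This is a simplified in-memory version, assuming the intent fields shown earlier; the real cloud function would query the database instead:

```typescript
interface Intent {
  userId: string;
  mood: string;
  createdAt: number; // epoch millis
}

// Returns the oldest intent in the queue with the same mood as the
// incoming one, submitted by a different user; null when nobody matches.
function findMatch(incoming: Intent, queue: Intent[]): Intent | null {
  let best: Intent | null = null;
  for (const candidate of queue) {
    if (candidate.userId === incoming.userId) continue;
    if (candidate.mood !== incoming.mood) continue;
    if (best === null || candidate.createdAt < best.createdAt) best = candidate;
  }
  return best;
}
```

Picking the oldest compatible intent keeps the queue fair: whoever has waited the longest with that mood gets matched first.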
Whenever we found two intents that matched, we called on Agora to provide a virtual room for the two users. The room’s ID and access token were then returned to our function.
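In terms of our sketch, the room-creation step can be modeled like this. The placeholder below only illustrates the shape of the result; a real deployment would obtain the token from Agora’s server-side token builder rather than generating one locally:

```typescript
import { randomUUID } from "crypto";

// What the cloud function gets back after asking for a room: a channel
// identifier plus an access token. Field names are hypothetical.
interface Room {
  channelName: string;
  token: string;
}

// Placeholder for the Agora call: a real deployment would generate the
// token with Agora's server-side token builder, not locally like this.
function createRoom(): Room {
  const channelName = `room-${randomUUID()}`;
  const token = Buffer.from(channelName).toString("base64"); // stand-in token
  return { channelName, token };
}
```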
Once a room was created, we sent individual messages to notify the devices waiting to connect. These messages contained the room token and the other data required for the connection.
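The per-device message might carry a payload along these lines. Again, the field names are our assumption, not the actual schema:

```typescript
// Hypothetical shape of the data message delivered to each matched device.
interface MatchMessage {
  deviceToken: string; // messaging-service token of the recipient device
  data: {
    channelName: string; // which room to join
    rtcToken: string;    // access token for that room
    peerUserId: string;  // who the user was matched with
  };
}

// Builds one message per device; both matched devices receive the same
// room details so they can join the same channel.
function buildMatchMessage(
  deviceToken: string,
  channelName: string,
  rtcToken: string,
  peerUserId: string,
): MatchMessage {
  return { deviceToken, data: { channelName, rtcToken, peerUserId } };
}
```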
Once the notification reached the devices, they contacted Agora’s services again to establish the connection. The Agora software processed the two requests and connected the devices.
As soon as the connection was established, the two users could communicate directly from one device to the other, solving the voice communication problem.