Our team started this year with an interesting problem on its hands. As with any unusual project, we had to do quite a bit of research.
Real-time collaboration is a topic that has gained traction in the last decade – and you can see this everywhere. Companies such as Figma, Adobe, and Microsoft have based their entire products on real-time collaboration.
Naturally, we were bound to encounter a client with a creative idea that revolved around real-time collaboration.
The problem
As we quickly found out, building an excellent real-time product is complicated. To be able to launch this tool, the product had to:
- Work in real-time
- Display coherent information
- Reliably connect and communicate data
- Store data for long-term persistence
- Efficiently use resources for cost-effectiveness
- Scale to a large pool of users
Some of these points may seem obvious, but we had to figure out a solution for each requirement. In our case, the real-time system had to power a collaborative text editor.
The mirage
It is very common for teams to experience the Dunning-Kruger effect while doing research. Just as we started, we were excited to use the “magic” solutions that we had found.
We had CRDTs, which, simply put, are data structures and algorithms that solve the problem of conflicts (e.g. two users writing on the same line). The name is self-explanatory, as it stands for what they do – “Conflict-free Replicated Data Types”.
Besides these, we found out about managed cloud solutions, which took away the problem of scaling and were easy to deploy.
Since we had these solutions available, we concluded that this project would be easy to implement. This is how the other 90% of the research started.
The struggle
Initially, we had issues across the board, but the most important one was scaling. How do we actually make this scalable? What if a collaborative room gets “split” across containers? Well, according to the literature, that can’t happen. In practice, it did.
The technical details of how this edge case happens are a rabbit hole to go into, but in short, you might open up the application, try to collaborate with your friend, and simply not see what they’re doing. That’s problematic since it breaks our very first requirement – work in real-time.
The enlightenment
One of the mantras architects have used since the dawn of containers is:
Containers should be stateless.
This, of course, does not mean that containers should hold no data. We could rephrase it to a more accurate version:
Your foundation of truth and persistence in your system should not lie in ephemeral and mutually exclusive containers.
Humour aside, this means that containers should first and foremost communicate changes. Technically, they still hold state, but their source of truth has changed. They receive updates from an external system rather than imperatively dictating the true state of the system.
For that external system, we used Redis and its publish/subscribe capabilities.
The CRDT algorithm we mentioned before works on an input/output basis. You input your changes, then it outputs the calculated state. We can take advantage of this and pipe the outputs of each container into the inputs of all the others.
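To make the piping idea concrete, here is a minimal sketch rather than our production code. It assumes an in-memory bus as a stand-in for Redis publish/subscribe, and a plain Python set as a toy replacement for the real CRDT document. The names InMemoryBus, Container, local_edit, and on_remote are hypothetical and exist only for this illustration.

```python
from collections import defaultdict


class InMemoryBus:
    """Hypothetical stand-in for a pub/sub broker such as Redis."""

    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, channel, callback):
        self.subscribers[channel].append(callback)

    def publish(self, channel, message):
        # Every subscriber of the channel gets the message, publisher included.
        for callback in list(self.subscribers[channel]):
            callback(message)


class Container:
    """One server instance hosting (part of) a collaborative room."""

    def __init__(self, name, bus, room):
        self.name = name
        self.bus = bus
        self.room = room
        self.doc = set()  # toy state; a real system would hold a CRDT document
        bus.subscribe(room, self.on_remote)

    def local_edit(self, update):
        # Apply the change locally, then pipe it out to the other containers.
        self.doc |= update
        self.bus.publish(self.room, {"origin": self.name, "update": update})

    def on_remote(self, message):
        # Skip our own messages: they were already applied in local_edit.
        if message["origin"] == self.name:
            return
        self.doc |= message["update"]


# The same room ends up "split" across two containers.
bus = InMemoryBus()
a = Container("container-a", bus, "room-42")
b = Container("container-b", bus, "room-42")
a.local_edit({"hello"})
b.local_edit({"world"})
```

After both edits, a.doc and b.doc hold the same content even though each user was connected to a different container. The origin field is an assumption of this sketch; it is one simple way to stop a container from re-applying its own changes.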
One guarantee of CRDT algorithms is that every replica ends up with the same, conflict-free output, even if the inputs arrive in a different order.
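The order-independence guarantee can be checked on a very small example. The sketch below uses a grow-only set, one of the simplest CRDTs, purely as an illustration. A text editor needs a far richer sequence CRDT, and the names GSet and replay are hypothetical.

```python
from itertools import permutations


class GSet:
    """Grow-only set: a deliberately tiny CRDT used only for illustration."""

    def __init__(self):
        self.items = frozenset()

    def apply(self, update):
        # Merging is a set union, which is commutative, associative
        # and idempotent, so the order of updates cannot matter.
        self.items = self.items | frozenset(update)

    def state(self):
        return sorted(self.items)


def replay(updates):
    """Feed the updates to a fresh replica in the given order."""
    replica = GSet()
    for update in updates:
        replica.apply(update)
    return replica.state()


updates = [{"a"}, {"b", "c"}, {"a", "d"}]

# Replay the same updates in every possible order and collect the outcomes.
results = {tuple(replay(order)) for order in permutations(updates)}
```

Here results contains a single outcome, because every ordering of the three updates converges to the same state. The duplicate "a" also shows that receiving the same change twice is harmless.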
Piping input into outputs, and vice-versa.
This solution works, but one of the requirements of the application also states that we have to “store data for long-term persistence”. Yet again, we have to think of containers as ambient entities.
Containers should be stateless.
Solutions for distributed problems should be distributed themselves. So, we can make each container persist changes in the database, but only the changes that belong to it, meaning the ones it did not receive through Redis.
If changes come from somewhere else, it must mean that whoever sent those changes through Redis already took care of persistence. It’s a “not my business” approach to containers.
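The “not my business” rule can be sketched in a few lines. This is an illustration under assumptions, not our actual implementation: FakeDatabase stands in for the real database, a Python set stands in for the CRDT document, and the handle method with its origin argument is a hypothetical way of telling local changes apart from the ones relayed through Redis.

```python
class FakeDatabase:
    """Hypothetical stand-in for the long-term store."""

    def __init__(self):
        self.rows = []

    def save(self, room, update):
        self.rows.append((room, sorted(update)))


class Container:
    def __init__(self, name, database):
        self.name = name
        self.database = database
        self.doc = set()  # toy state instead of a real CRDT document

    def handle(self, room, origin, update):
        # Every container applies every change to its in-memory state...
        self.doc |= update
        # ...but only the container where the change originated persists it.
        if origin == self.name:
            self.database.save(room, update)


db = FakeDatabase()
a = Container("container-a", db)
b = Container("container-b", db)

# A user connected to container-a types something; both containers see it.
change = {"hello"}
a.handle("room-42", origin="container-a", update=change)
b.handle("room-42", origin="container-a", update=change)
```

Both containers end up with the same in-memory state, while the database receives exactly one write for the change.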
Conclusion
Collaborative software is complicated and sometimes hard to decentralize. Thankfully, the literature on this is continuously expanding.
The industry provides a plethora of solutions, but each one probably requires more reading than you might expect. If you’re planning to undertake such a project, be prepared to spend months on research.