Some interesting notes on the Matrix protocol, its limitations and comparison with IRC.

A few crucial quotes, as the article itself is voluminous (but very exhaustive!):

Compare this to Matrix: when you send a message to a Matrix homeserver, that server first stores it in its internal SQL database. Then it will transmit that message to all clients connected to that server and room, and to all other servers that have clients connected to that room. Those remote servers, in turn, will keep a copy of that message and all its metadata in their own database, by default forever. In encrypted rooms, those messages are encrypted, but their metadata is not.
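To make the fan-out described in that quote concrete, here is a toy model in Python. It is only a sketch: the `Homeserver` class, the `send` function and the `*.example` server names are all made up for illustration, and nothing here reflects how Synapse or the federation API is actually implemented. It just shows that every server with a member in the room ends up holding its own copy of the event, with the sender visible even when the body is ciphertext.

```python
class Homeserver:
    """Hypothetical stand-in for a homeserver; `events` plays the role of its SQL database."""

    def __init__(self, name):
        self.name = name
        self.events = []


def send(room_members, sender, body, servers):
    """Store one event on every homeserver that has a member in the room.

    Returns the sorted list of server names that now hold a copy.
    """
    # The body may be ciphertext, but the sender stays readable metadata.
    event = {"sender": sender, "body": body}
    # A user ID looks like @localpart:server, so the server is after the first colon.
    involved = {user.split(":", 1)[1] for user in room_members}
    for name in involved:
        servers[name].events.append(event)
    return sorted(involved)


servers = {n: Homeserver(n) for n in ("a.example", "b.example", "c.example")}
members = ["@alice:a.example", "@bob:b.example"]
print(send(members, "@alice:a.example", "<ciphertext>", servers))
```

In this toy run, `a.example` and `b.example` each keep a copy, while `c.example`, which has no member in the room, keeps nothing.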

In a federated network, one has to wonder whether GDPR enforcement is even possible at all. But in Matrix in particular, if you want to enforce your right to be forgotten in a given room, you would have to:

  1. Enumerate all the users that ever joined the room while you were there
  2. Discover all their home servers
  3. Start a GDPR procedure against all those servers
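Steps 1 and 2 of that list can be sketched in a few lines of Python, assuming you have already collected the member list somehow (the list below is invented, and `homeservers_to_contact` is a hypothetical helper, not part of any Matrix library). The only thing it relies on is that a Matrix user ID has the form `@localpart:server_name`.

```python
def homeservers_to_contact(member_ids):
    """Return the sorted, de-duplicated server names found in Matrix user IDs."""
    servers = set()
    for user_id in member_ids:
        if not user_id.startswith("@") or ":" not in user_id:
            raise ValueError(f"not a Matrix user ID: {user_id!r}")
        # The localpart cannot contain ":", so the first colon ends it.
        # The server name may contain another colon (an explicit port).
        _localpart, _, server = user_id[1:].partition(":")
        servers.add(server)
    return sorted(servers)


# Invented example data: everyone who was ever in the room with you.
members = [
    "@alice:example.org",
    "@bob:matrix.example.com",
    "@carol:example.org",
    "@dave:example.net:8448",
]
print(homeservers_to_contact(members))
# ['example.net:8448', 'example.org', 'matrix.example.com']
```

Each name in the result is a separate server, possibly in a separate jurisdiction, that you would have to approach for step 3.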

Overall, privacy protections in Matrix mostly concern message contents, not metadata. In other words, who’s talking with whom, when and from where is not well protected. Compared to a tool like Signal, which goes to great lengths to anonymize that data with features like private contact discovery, disappearing messages, sealed senders, and private groups, Matrix is definitely behind.

This is a known issue (opened in 2019) in Synapse, but this is not just an implementation issue, it’s a flaw in the protocol itself. Home servers keep the join/leave events of all rooms, which gives clear-text information about who is talking to whom. Synapse logs may also contain personally identifiable information that home server admins might not be aware of in the first place. Those log rotation policies are separate from the server-level retention policy, which may be confusing for a novice sysadmin.

Combine this with the federation: even if you trust your home server to do the right thing, the second you join a public room with third-party home servers, those ideas kind of get thrown out because those servers can do whatever they want with that information. Again, a problem that is hard to solve in any federation.

So while you can work around a home server going down at the room level, there’s no such thing at the home server level, for user identities. So if you want those identities to be stable in the long term, you need to think about high availability. One limitation is that the domain name (e.g. matrix.example.com) must never change in the future, as renaming home servers is not supported.

As a developer, I find Matrix kind of intimidating. The specification is huge. The official specification itself looks somewhat digestible: it’s only 6 APIs so that looks, at first, kind of reasonable. But whenever you start asking complicated questions about Matrix, you quickly fall into the Matrix Spec Change specification (which, yes, is a separate specification). And there are literally hundreds of MSCs flying around. It’s hard to tell what’s been adopted and what hasn’t, and even harder to figure out if your specific client has implemented it.

Just taking the latest weekly Matrix report, you find that three new MSCs were proposed, just last week! There’s even a graph that shows the number of MSCs is progressing steadily, at 600+ proposals total, with the majority (300+) “new”. I would guess the “merged” ones are at about 150.

I’m also worried that we are repeating the errors of the past. The history of federated services is really fascinating: IRC, FTP, HTTP, and SMTP were all created in the early days of the internet, and are all still around (except, arguably, FTP, which was removed from major browsers recently). All of them had to face serious challenges in growing their federation.

IRC had numerous conflicts and forks, both at the technical level and at the political level. The history of IRC is really something that anyone working on a federated system should study in detail, because they are bound to make the same mistakes if they are not familiar with it.

  • cstine@lemmy.uncomfortable.business
    1 year ago

    IRC is extremely federated: building a network of linked servers sharing the same channels was done pretty early in its existence.

    If anything, IRC is more decentralized than ActivityPub-based services, because there’s no ‘home’ server for a given IRC channel, and thus if a server goes down, you don’t lose all the channels that were created on it.

    • drkt@feddit.dk
      1 year ago

      I had no idea IRC channels could live on multiple servers. That’s cool

      • poVoq@slrpnk.net
        1 year ago

        IRC used to be fully federated in the early days, but for various reasons, which are eerily similar to some of the much more recent discussions around AP, this deteriorated over time, and these days IRCds have various incompatible s2s protocols that are used more or less only for load-balancing.

        I like IRC, but this is a bit of a cautionary tale of what not to do.