From 52a947dc4603e1bd14916efb8822c4fe58f0d200 Mon Sep 17 00:00:00 2001
From: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
Date: Thu, 10 Mar 2022 15:18:31 +0000
Subject: [PATCH] Updates to the Room DAG concepts development document (#12179)

Some stuff that came up while we were talking about #12173.
---
 changelog.d/12179.doc                 |  1 +
 docs/development/room-dag-concepts.md | 71 ++++++++++++++++++++-------
 2 files changed, 54 insertions(+), 18 deletions(-)
 create mode 100644 changelog.d/12179.doc

diff --git a/changelog.d/12179.doc b/changelog.d/12179.doc
new file mode 100644
index 0000000000..55d8caa45a
--- /dev/null
+++ b/changelog.d/12179.doc
@@ -0,0 +1 @@
+Updates to the Room DAG concepts development document.
diff --git a/docs/development/room-dag-concepts.md b/docs/development/room-dag-concepts.md
index cbc7cf2949..3eb4d5acc4 100644
--- a/docs/development/room-dag-concepts.md
+++ b/docs/development/room-dag-concepts.md
@@ -30,13 +30,57 @@ rather than skipping any that arrived late; whereas if you're looking at a
 historical section of timeline (i.e. `/messages`), you want to see the best
 representation of the state of the room as others were seeing it at the time.
 
+## Outliers
+
+We mark an event as an `outlier` when we haven't figured out the state for the
+room at that point in the DAG yet. They are "floating" events that we haven't
+yet correlated to the DAG.
+
+Outliers typically arise when we fetch the auth chain or state for a given
+event. When that happens, we just grab the events in the state/auth chain,
+without calculating the state at those events, or backfilling their
+`prev_events`.
+
+So, typically, we won't have the `prev_events` of an `outlier` in the database,
+(though it's entirely possible that we *might* have them for some other
+reason). Other things that make outliers different from regular events:
+
+ * We don't have state for them, so there should be no entry in
+   `event_to_state_groups` for an outlier. (In practice this isn't always
+   the case, though I'm not sure why: see https://github.com/matrix-org/synapse/issues/12201).
+
+ * We don't record entries for them in the `event_edges`,
+   `event_forward_extremeties` or `event_backward_extremities` tables.
+
+Since outliers are not tied into the DAG, they do not normally form part of the
+timeline sent down to clients via `/sync` or `/messages`; however there is an
+exception:
+
+### Out-of-band membership events
+
+A special case of outlier events are some membership events for federated rooms
+that we aren't full members of. For example:
+
+ * invites received over federation, before we join the room
+ * *rejections* for said invites
+ * knock events for rooms that we would like to join but have not yet joined.
+
+In all the above cases, we don't have the state for the room, which is why they
+are treated as outliers. They are a bit special though, in that they are
+proactively sent to clients via `/sync`.
 
 ## Forward extremity
 
-Most-recent-in-time events in the DAG which are not referenced by any other events' `prev_events` yet.
+Most-recent-in-time events in the DAG which are not referenced by any other
+events' `prev_events` yet. (In this definition, outliers, rejected events, and
+soft-failed events don't count.)
 
-The forward extremities of a room are used as the `prev_events` when the next event is sent.
+The forward extremities of a room (or at least, a subset of them, if there are
+more than ten) are used as the `prev_events` when the next event is sent.
 
+The "current state" of a room (ie: the state which would be used if we
+generated a new event) is, therefore, the resolution of the room states
+at each of the forward extremities.
 
 ## Backward extremity
 
@@ -44,23 +88,14 @@ The current marker of where we have backfilled up to and will generally be the
 `prev_events` of the oldest-in-time events we have in the DAG. This gives a starting
 point when backfilling history.
 
-When we persist a non-outlier event, we clear it as a backward extremity and set
-all of its `prev_events` as the new backward extremities if they aren't already
-persisted in the `events` table.
-
-
-## Outliers
-
-We mark an event as an `outlier` when we haven't figured out the state for the
-room at that point in the DAG yet.
-
-We won't *necessarily* have the `prev_events` of an `outlier` in the database,
-but it's entirely possible that we *might*.
-
-For example, when we fetch the event auth chain or state for a given event, we
-mark all of those claimed auth events as outliers because we haven't done the
-state calculation ourself.
+Note that, unlike forward extremities, we typically don't have any backward
+extremity events themselves in the database - or, if we do, they will be "outliers" (see
+above). Either way, we don't expect to have the room state at a backward extremity.
 
+When we persist a non-outlier event, if it was previously a backward extremity,
+we clear it as a backward extremity and set all of its `prev_events` as the new
+backward extremities if they aren't already persisted as non-outliers. This
+therefore keeps the backward extremities up-to-date.
 
 ## State groups
 