Add documentation on background updates. (#16420)

Patrick Cloke 2023-10-06 07:23:20 -04:00 committed by GitHub
parent 26b960b08b
commit 694802eecd
2 changed files with 62 additions and 0 deletions

changelog.d/16420.doc Normal file

@ -0,0 +1 @@
Document internal background update mechanism.


@ -150,6 +150,67 @@ def run_upgrade(
...
```
## Background updates
It is sometimes appropriate to perform database migrations as part of a background
process (instead of blocking Synapse until the migration is done). In particular,
this is useful for migrating data when adding new columns or tables.

Pending background updates are stored in the `background_updates` table and are identified
by a unique name, the current status (stored in JSON), and some dependency information:

* Whether the update requires a previous update to be complete.
* A rough ordering in which to complete updates.

A new background update needs to be added to the `background_updates` table:

```sql
INSERT INTO background_updates (ordering, update_name, depends_on, progress_json) VALUES
    (7706, 'my_background_update', 'a_previous_background_update', '{}');
```
And then needs an associated handler in the appropriate datastore:
```python
self.db_pool.updates.register_background_update_handler(
    "my_background_update",
    update_handler=self._my_background_update,
)
```
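
The roles of the `ordering` and `depends_on` columns can be illustrated with a small, self-contained sketch against an in-memory SQLite database. The table definition here is a simplified stand-in, and `next_runnable_update` and `finish_update` are hypothetical helpers invented for this example; none of this is Synapse's actual scheduling code. It assumes that an update stops being "pending" once its row is removed from the table.

```python
import sqlite3
from typing import Optional

# Simplified stand-in for the real table (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE background_updates (
        ordering INT NOT NULL,
        update_name TEXT PRIMARY KEY,
        depends_on TEXT,
        progress_json TEXT NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO background_updates"
    " (ordering, update_name, depends_on, progress_json) VALUES (?, ?, ?, ?)",
    [
        (7706, "my_background_update", "a_previous_background_update", "{}"),
        (7705, "a_previous_background_update", None, "{}"),
    ],
)


def next_runnable_update(conn: sqlite3.Connection) -> Optional[str]:
    """Hypothetical helper: the pending update with the lowest ordering
    whose dependency (if any) is no longer pending."""
    row = conn.execute(
        """
        SELECT update_name FROM background_updates
        WHERE depends_on IS NULL
           OR depends_on NOT IN (SELECT update_name FROM background_updates)
        ORDER BY ordering, update_name
        LIMIT 1
        """
    ).fetchone()
    return row[0] if row else None


def finish_update(conn: sqlite3.Connection, update_name: str) -> None:
    """Hypothetical helper: mark an update as complete by removing its row."""
    conn.execute(
        "DELETE FROM background_updates WHERE update_name = ?", (update_name,)
    )


first = next_runnable_update(conn)
finish_update(conn, first)
second = next_runnable_update(conn)
print(first, second)
```

In this toy model `my_background_update` only becomes runnable after `a_previous_background_update` has been finished, which is the behaviour the dependency column is meant to express.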
There are a few types of updates that can be registered; see the `BackgroundUpdater` class:
* `register_background_update_handler`: A generic handler for custom SQL
* `register_background_index_update`: Create an index in the background
* `register_background_validate_constraint`: Validate a constraint in the background
  (PostgreSQL-only)
* `register_background_validate_constraint_and_delete_rows`: Similar to
  `register_background_validate_constraint`, but deletes rows which don't fit
  the constraint.

For `register_background_update_handler`, the generic handler must track progress
and then finalize the background update:
```python
async def _my_background_update(self, progress: JsonDict, batch_size: int) -> int:
    def _do_something(txn: LoggingTransaction) -> int:
        ...
        # Persist progress so the next batch resumes where this one stopped.
        self.db_pool.updates._background_update_progress_txn(
            txn, "my_background_update", {"last_processed": last_processed}
        )
        return last_processed - prev_last_processed

    num_processed = await self.db_pool.runInteraction("_do_something", _do_something)

    # Finalize only once a batch finds nothing left to process; the handler
    # is called repeatedly until the update has been ended.
    if not num_processed:
        await self.db_pool.updates._end_background_update("my_background_update")

    return num_processed
```
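
The "track progress, then finalize" contract can be seen end to end in a toy model that runs without Synapse. This is only a sketch: `ROWS`, `STATE`, and `run_until_done` are hypothetical in-memory stand-ins for the table being migrated, the stored progress, and the code that drives the handler. They are not part of Synapse's API.

```python
import asyncio
from typing import Any, Dict, List

# Hypothetical stand-ins: row IDs to migrate and the persisted update state.
ROWS: List[int] = list(range(1, 11))
STATE: Dict[str, Any] = {"progress": {}, "done": False}


async def my_background_update(progress: Dict[str, Any], batch_size: int) -> int:
    """Process up to `batch_size` rows after the last processed ID."""
    last_processed = progress.get("last_processed", 0)
    batch = [row for row in ROWS if row > last_processed][:batch_size]
    if not batch:
        # Nothing left to do: finalize the update (stand-in for ending it).
        STATE["done"] = True
        return 0
    # ... the actual migration of `batch` would happen here ...
    # Record progress so the next call (or a restart) resumes from this point.
    STATE["progress"] = {"last_processed": batch[-1]}
    return len(batch)


async def run_until_done(batch_size: int) -> List[int]:
    """Hypothetical driver: call the handler until it finalizes itself."""
    batch_counts = []
    while not STATE["done"]:
        batch_counts.append(
            await my_background_update(STATE["progress"], batch_size)
        )
    return batch_counts


counts = asyncio.run(run_until_done(batch_size=4))
print(counts)  # [4, 4, 2, 0]
```

The final call returns `0` because it found no rows, which is the point at which the toy handler marks the update as done.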
Synapse will attempt to rate-limit how often background updates are run via the
given batch-size and the returned number of processed entries (and how long the
function took to run). See
[background update controller callbacks](../modules/background_update_controller_callbacks.md).
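
As a rough illustration of how the returned item count and the run time can feed back into the next batch size, consider the sketch below. The function name, the target duration, and the default and minimum batch sizes are all assumptions made up for this example; Synapse's real calculation may differ and can be overridden through the callbacks linked above.

```python
def next_batch_size(
    items_processed: int,
    duration_ms: float,
    target_duration_ms: float = 100.0,  # hypothetical target per batch
    default_batch_size: int = 100,  # hypothetical fallback
    min_batch_size: int = 1,  # hypothetical lower bound
) -> int:
    """Pick a batch size so the next run takes roughly `target_duration_ms`."""
    if items_processed <= 0 or duration_ms <= 0:
        # No throughput measurement yet, so fall back to the default.
        return default_batch_size
    items_per_ms = items_processed / duration_ms
    # Scale the measured throughput to the target duration, never below the minimum.
    return max(min_batch_size, int(items_per_ms * target_duration_ms))


fast = next_batch_size(items_processed=200, duration_ms=50.0)
slow = next_batch_size(items_processed=1, duration_ms=1000.0)
print(fast, slow)  # 400 1
```

A handler that reports an accurate count therefore gets larger batches when it is fast and smaller ones when it is slow, which is why the return value matters.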
## Boolean columns
Boolean columns require special treatment, since SQLite treats booleans the