Commit graph

721 commits

Author SHA1 Message Date
Erik Johnston 554a92601a
Reintroduce "Reduce device lists replication traffic."" (#17361)
Reintroduces https://github.com/element-hq/synapse/pull/17333


Turns out the reason for revert was down two master instances running
2024-06-25 10:34:34 +01:00
Erik Johnston a98cb87bee
Revert "Reduce device lists replication traffic." (#17360)
Reverts element-hq/synapse#17333

It looks like master was still sending out replication RDATA with the
old format... somehow
2024-06-25 09:57:34 +01:00
Erik Johnston 930a64b6c1
Reintroduce #17291. (#17338)
This is #17291 (which got reverted), with some added fixups, and change
so that tests actually pick up the error.

The problem was that we were not calculating any new chain IDs due to a
missing `not` in a condition.
2024-06-24 14:40:28 +00:00
Erik Johnston cf711ac03c
Reduce device lists replication traffic. (#17333)
Reduce the replication traffic of device lists, by not sending every
destination that needs to be sent the device list update over
replication. Instead a "hosts to send to have been calculated"
notification over replication, and then federation senders read the
destinations from the DB.

For non federation senders this should heavily reduce the impact of a
user in many large rooms changing a device.
2024-06-24 14:15:13 +01:00
Erik Johnston 4243c1f074
Revert "Handle large chain calc better (#17291)" (#17334)
This reverts commit bdf82efea5  (#17291)

This seems to have stopped persisting auth chains for new events, and so
is causing state res to fall back to the slow methods
2024-06-19 17:39:33 +01:00
Erik Johnston bdf82efea5
Handle large chain calc better (#17291)
We calculate the auth chain links outside of the main persist event
transaction to ensure that we do not block other event sending during
the calculation.
2024-06-19 10:33:53 +01:00
Eric Eastwood e5b8a3e37f
Add stream_ordering sort to Sliding Sync /sync (#17293)
Sort is no longer configurable and we always sort rooms by the `stream_ordering` of the last event in the room or the point where the user can see up to in cases of leave/ban/invite/knock.
2024-06-17 11:27:14 -05:00
Quentin Gliech e88332b5f4
Merge branch 'release-v1.109' into develop 2024-06-17 15:51:16 +02:00
Quentin Gliech f983a77ab0
Set our own stream position from the current sequence value on startup (#17309) 2024-06-17 11:50:00 +00:00
Erik Johnston a3cb244755
Automatically apply SQL for inconsistent sequence (#17305)
Rather than forcing the server operator to apply the SQL manually.

This should be safe, as there should be only one writer for these
sequences.
2024-06-14 16:40:29 +01:00
Eric Eastwood 8c58eb7f17
Add event.internal_metadata.instance_name (#17300)
Add `event.internal_metadata.instance_name` (the worker instance that persisted the event) to go alongside the existing `event.internal_metadata.stream_ordering`.

`instance_name` is useful to properly compare and query for events with a token since you need to compare both the `stream_ordering` and `instance_name` against the vector clock/`instance_map` in the `RoomStreamToken`.

This is pre-requisite work and may be used in https://github.com/element-hq/synapse/pull/17293

Adding `event.internal_metadata.instance_name` was first mentioned in the initial Sliding Sync PR while pairing with @erikjohnston, see 09609cb0db (diff-5cd773fb307aa754bd3948871ba118b1ef0303f4d72d42a2d21e38242bf4e096R405-R410)
2024-06-13 11:32:50 -05:00
Eric Eastwood ebdce69f6a
Fix get_last_event_in_room_before_stream_ordering(...) finding the wrong last event (#17295)
PR where this was introduced: https://github.com/matrix-org/synapse/pull/14817

### What does this affect?

`get_last_event_in_room_before_stream_ordering(...)` is used in Sync v2 in a lot of different state calculations.

`get_last_event_in_room_before_stream_ordering(...)`  is also used in `/rooms/{roomId}/members`
2024-06-13 11:00:52 -05:00
Erik Johnston aabf577166
Handle hyphens in user dir search porperly (#17254)
c.f. #16675
2024-06-05 10:40:34 +01:00
Erik Johnston d16910ca02
Replaces all usages of StreamIdGenerator with MultiWriterIdGenerator (#17229)
Replaces all usages of `StreamIdGenerator` with `MultiWriterIdGenerator`, which is safer.
2024-05-30 11:07:32 +00:00
Erik Johnston 466f344547
Move towards using MultiWriterIdGenerator everywhere (#17226)
There is a problem with `StreamIdGenerator` where it can go backwards
over restarts when a stream ID is requested but then not inserted into
the DB. This is problematic if we want to land #17215, and is generally
a potential cause for all sorts of nastiness.

Instead of trying to fix `StreamIdGenerator`, we may as well move to
`MultiWriterIdGenerator` that does not suffer from this problem (the
latest positions are stored in `stream_positions` table). This involves
adding SQLite support to the class.

This only changes id generators that were already using
`MultiWriterIdGenerator` under postgres, a separate PR will move the
rest of the uses of `StreamIdGenerator` over.
2024-05-29 12:19:10 +00:00
Shay 37558d5e4c
Add support for MSC3823 - Account Suspension (#17051) 2024-05-01 17:45:17 +01:00
Melvyn Laïly 59710437e4
Return the search terms as search highlights for SQLite instead of nothing (#17000)
Fixes https://github.com/element-hq/synapse/issues/16999 and
https://github.com/element-hq/element-android/pull/8729 by returning the
search terms as search highlights.
2024-04-26 09:43:52 +01:00
Erik Johnston 55b0aa847a Fix GHSA-3h7q-rfh9-xm4v
Weakness in auth chain indexing allows DoS from remote room members
through disk fill and high CPU usage.

A remote Matrix user with malicious intent, sharing a room with Synapse
instances before 1.104.1, can dispatch specially crafted events to
exploit a weakness in how the auth chain cover index is calculated. This
can induce high CPU consumption and accumulate excessive data in the
database of such instances, resulting in a denial of service.

Servers in private federations, or those that do not federate, are not
affected.
2024-04-23 15:25:49 +01:00
dependabot[bot] 1e68b56a62
Bump black from 23.10.1 to 24.2.0 (#16936) 2024-03-13 16:46:44 +00:00
Erik Johnston 23740eaa3d
Correctly mention previous copyright (#16820)
During the migration the automated script to update the copyright
headers accidentally got rid of some of the existing copyright lines.
Reinstate them.
2024-01-23 11:26:48 +00:00
Erik Johnston 5d3850b038
Port EventInternalMetadata class to Rust (#16782)
There are a couple of things we need to be careful of here:

1. The current python code does no validation when loading from the DB,
so we need to be careful to ignore such errors (at least on jki.re there
are some old events with internal metadata fields of the wrong type).
2. We want to be memory efficient, as we often have many hundreds of
thousands of events in the cache at a time.

---------

Co-authored-by: Quentin Gliech <quenting@element.io>
2024-01-08 14:06:48 +00:00
Patrick Cloke 8e1e62c9e0 Update license headers 2023-11-21 15:29:58 -05:00
Erik Johnston ef5329a9f9
Revert "Add a Postgres REPLICA IDENTITY to tables that do not have an implicit one. This should allow use of Postgres logical replication. (#16456)" (#16651)
This reverts commit 69afe3f7a0.
2023-11-16 16:48:48 +00:00
reivilibre 830988ae72
Fix test not detecting tables with missing primary keys and missing replica identities, then add more replica identities. (#16647)
* Fix the CI query that did not detect all cases of missing primary keys

* Add more missing REPLICA IDENTITY entries

* Newsfile

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>

---------

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>
2023-11-16 12:26:27 +00:00
David Robertson 43d1aa75e8
Add an Admin API to temporarily grant the ability to update an existing cross-signing key without UIA (#16634) 2023-11-15 17:28:10 +00:00
Patrick Cloke f2f2c7c1f0
Use full GitHub links instead of bare issue numbers. (#16637) 2023-11-15 08:02:11 -05:00
reivilibre 69afe3f7a0
Add a Postgres REPLICA IDENTITY to tables that do not have an implicit one. This should allow use of Postgres logical replication. (#16456)
* Add Postgres replica identities to tables that don't have an implicit one

Fixes #16224

* Newsfile

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>

* Move the delta to version 83 as we missed the boat for 82

* Add a test that all tables have a REPLICA IDENTITY

* Extend the test to include when indices are deleted

* isort

* black

* Fully qualify `oid` as it is a 'hidden attribute' in Postgres 11

* Update tests/storage/test_database.py

Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>

* Add missed tables

---------

Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
2023-11-13 16:03:22 +00:00
Patrick Cloke ab3f1b3b53
Convert simple_select_one_txn and simple_select_one to return tuples. (#16612) 2023-11-09 11:13:31 -05:00
David Robertson 91587d4cf9
Bulk-invalidate e2e cached queries after claiming keys (#16613)
Co-authored-by: Patrick Cloke <patrickc@matrix.org>
2023-11-09 15:57:09 +00:00
Patrick Cloke 455ef04187
Avoid updating the same rows multiple times with simple_update_many_txn. (#16609)
simple_update_many_txn had a bug in it which would cause each
update to be applied twice.
2023-11-07 14:02:09 -05:00
Patrick Cloke 9738b1c497
Avoid executing no-op queries. (#16583)
If simple_{insert,upsert,update}_many_txn is called without any data
to modify then return instead of executing the query.

This matches the behavior of simple_{select,delete}_many_txn.
2023-11-07 14:00:25 -05:00
Patrick Cloke ec9ff389f4
More tests for the simple_* methods. (#16596)
Expand tests for the simple_* database methods, additionally
test against both PostgreSQL and SQLite variants.
2023-11-07 09:34:23 -05:00
Patrick Cloke cfb6d38c47
Remove remaining usage of cursor_to_dict. (#16564) 2023-10-31 13:13:28 -04:00
Patrick Cloke 679c691f6f
Remove more usages of cursor_to_dict. (#16551)
Mostly to improve type safety.
2023-10-26 15:12:28 -04:00
Patrick Cloke 9407d5ba78
Convert simple_select_list and simple_select_list_txn to return lists of tuples (#16505)
This should use fewer allocations and improves type hints.
2023-10-26 13:01:36 -04:00
Erik Johnston 8f35f8148e
Fix bug where a new writer advances their token too quickly (#16473)
* Fix bug where a new writer advances their token too quickly

When starting a new writer (for e.g. persisting events), the
`MultiWriterIdGenerator` doesn't have a minimum token for it as there
are no rows matching that new writer in the DB.

This results in the the first stream ID it acquired being announced as
persisted *before* it actually finishes persisting, if another writer
gets and persists a subsequent stream ID. This is due to the logic of
setting the minimum persisted position to the minimum known position of
across all writers, and the new writer starts off not being considered.

* Fix sending out POSITIONs when our token advances without update

Broke in #14820

* For replication HTTP requests, only wait for minimal position
2023-10-23 16:57:30 +01:00
Patrick Cloke 6ad1f9eac2
Convert DeviceLastConnectionInfo to attrs. (#16507)
To improve type safety & memory usage.
2023-10-17 12:47:42 +00:00
Patrick Cloke a4904dcb04
Convert simple_select_many_batch, simple_select_many_txn to tuples. (#16444) 2023-10-11 13:24:56 -04:00
Patrick Cloke 85bfd4735e
Return an immutable value from get_latest_event_ids_in_room. (#16326) 2023-09-18 09:29:05 -04:00
Erik Johnston 954921736b
Refactor get_user_by_id (#16316) 2023-09-14 12:46:30 +01:00
Erik Johnston 2b35626b6b
Refactor storing of server keys (#16261) 2023-09-12 11:08:04 +01:00
Patrick Cloke aa483cb4c9
Update ruff config (#16283)
Enable additional checks & clean-up unneeded configuration.
2023-09-08 11:24:36 -04:00
Mathieu Velten dcb2778341
Add last_seen_ts to the admin users API (#16218) 2023-09-04 18:13:28 +02:00
David Robertson 6525fd65ee
Log the details of background update failures (#16212) 2023-09-01 12:41:56 +01:00
Erik Johnston a2e0d4cd60
Fix rare bug that broke looping calls (#16210)
* Fix rare bug that broke looping calls

We can't interact with the reactor from the main thread via looping
call.

Introduced in v1.90.0 / #15791.

* Newsfile
2023-08-30 14:18:42 +01:00
Patrick Cloke 9ec3da06da
Bump mypy-zope & mypy. (#16188) 2023-08-29 10:38:56 -04:00
V02460 84f441f88f
Prepare unit tests for Python 3.12 (#16099) 2023-08-25 15:05:10 -04:00
Patrick Cloke a8a46b1336
Replace simple_async_mock with AsyncMock (#16180)
Python 3.8 has a native AsyncMock, use it instead of a custom
implementation.
2023-08-25 09:27:21 -04:00
Patrick Cloke daf11e26ef
Replace make_awaitable with AsyncMock (#16179)
Python 3.8 provides a native AsyncMock, we can replace the
homegrown version we have.
2023-08-24 19:38:46 -04:00
Neil Johnson ec662bbe41
Filter out unwanted user_agents from udv. (#16124) 2023-08-23 14:00:34 +01:00