Commit graph

21131 commits

Author SHA1 Message Date
Eric Eastwood
da396a2538 Add test for what happens when side by side spans in with statement 2022-08-02 13:43:06 -05:00
Eric Eastwood
b09651a00a Always return config path for config error 2022-08-02 13:31:23 -05:00
Eric Eastwood
fb0e8203ca More clear method names 2022-08-02 13:27:45 -05:00
Eric Eastwood
36d6648fad Remove type ignore comments
See https://github.com/matrix-org/synapse/pull/13400#discussion_r935887649
2022-08-02 13:22:58 -05:00
Eric Eastwood
0f93ec8d59 Fix lints 2022-08-02 12:49:57 -05:00
Eric Eastwood
dbd9005cd1 Revert crazy custom sampler and span process to try force tracing for users 2022-08-02 11:56:51 -05:00
Eric Eastwood
6bb7cb7166 Revert "Non-working try baggage to inherit force tracing/sampling"
This reverts commit d15fa457c9.
2022-08-02 11:43:28 -05:00
Eric Eastwood
d15fa457c9 Non-working try baggage to inherit force tracing/sampling 2022-08-02 11:43:17 -05:00
Eric Eastwood
b3cdbad985 PoC force tracing
Doesn't force tracing for the child spans yet
2022-08-02 02:36:56 -05:00
Eric Eastwood
6255a1a622 Fix tests and some lints 2022-08-01 19:07:11 -05:00
Eric Eastwood
00be06cfd9 Try to align read from edu content 2022-08-01 17:54:14 -05:00
Eric Eastwood
8e902b858d Remove what's left of scopemanager 2022-08-01 17:38:07 -05:00
Eric Eastwood
a9fb504dcd Implement start_active_span_from_edu for OTEL
AFAICT, this never worked before because everything was serialized into `content["org.matrix.opentracing_context"]`
but `start_active_span_from_edu` read from `content["opentracing"]`.
See https://github.com/matrix-org/synapse/pull/5852#discussion_r934960586

Do we even still want this?
2022-08-01 17:23:19 -05:00
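The key mismatch described in the commit above (written under one key, read from another) can be avoided by sharing a single constant between the writer and the reader. A minimal sketch: `inject_context` and `extract_context` are hypothetical helpers, not Synapse functions, and storing the carrier as a plain dict is an assumption. Only the key string is taken from the commit message.

```python
from typing import Any, Dict

# Single shared key so the writer and the reader cannot drift apart.
# The key string is the one quoted in the commit message above.
CONTEXT_KEY = "org.matrix.opentracing_context"


def inject_context(content: Dict[str, Any], carrier: Dict[str, str]) -> None:
    """Hypothetical writer: store the tracing carrier in the EDU content."""
    content[CONTEXT_KEY] = dict(carrier)


def extract_context(content: Dict[str, Any]) -> Dict[str, str]:
    """Hypothetical reader: read from the same key the writer used."""
    return dict(content.get(CONTEXT_KEY, {}))
```

With both sides going through `CONTEXT_KEY`, content that only carries the old `opentracing` key yields an empty carrier instead of silently diverging.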
Eric Eastwood
33fd24e48c todos 2022-08-01 16:21:40 -05:00
Eric Eastwood
322da5137f Fix some lints 2022-08-01 14:42:13 -05:00
Eric Eastwood
7772f50e60 Use HTTP_HOST attribute 2022-07-30 02:07:46 -05:00
Eric Eastwood
070195afee Use correct type for what start_as_current_span returns
See:

 - https://github.com/open-telemetry/opentelemetry-python/pull/198#discussion_r333399436
 - https://github.com/open-telemetry/opentelemetry-python/issues/219
2022-07-29 22:49:34 -05:00
Eric Eastwood
d84815663e Passing tests and context manager doesn't seem to be needed 2022-07-29 22:44:21 -05:00
Eric Eastwood
041acdf985 Working second test although it's a bit pointless testing whether opentelemetry works 2022-07-29 22:18:59 -05:00
Eric Eastwood
d29a4af916 Move to start_active_span 2022-07-29 22:08:11 -05:00
Eric Eastwood
7c135b93bd Easier to follow local vs remote span tracing
The `incoming-federation-request` vs `process-federation_request` was first introduced in
https://github.com/matrix-org/synapse/pull/11870

 - Span for remote trace: `incoming-federation-request`
    - `child_of` reference: `origin_span_context`
    - `follows_from` reference: `servlet_span`
 - Span for local trace: `process-federation-request`
    - `child_of` reference: `servlet_span` (by the nature of it being active)
    - `follows_from` reference: `incoming-federation-request`
2022-07-29 21:49:47 -05:00
Eric Eastwood
786dd9b4b1 Explain weird function 2022-07-29 17:06:43 -05:00
Eric Eastwood
19d20b50e8 Record exception 2022-07-29 16:54:26 -05:00
Eric Eastwood
2011ac2100 Fix using wrong type of context (Context vs SpanContext)
Fix error:
```
AttributeError: 'SpanContext' object has no attribute 'get'
```

`Context`:
```
{'current-span-1a226c96-a5db-4412-bcaa-1fdd34213c5c': _Span(name="sendToDevice", context=SpanContext(trace_id=0x5d2dcc3fdc8205046d60a5cd18672ac6, span_id=0x715c736ff5f4d208, trace_flags=0x01, trace_state=[], is_remote=False))}
```

`SpanContext`:
```
SpanContext(trace_id=0xf7cd9d058b7b76f364bdd649c4ba7b8a, span_id=0x287ce71bac31bfc4, trace_flags=0x01, trace_state=[], is_remote=False)
```
2022-07-29 00:50:37 -05:00
Eric Eastwood
1d208fa17e Fix invalid attribute type
```
Invalid type StreamToken for attribute value. Expected one of ['bool', 'str', 'bytes', 'int', 'float'] or a sequence of those types
```

Had to add a few more logs to find this instance since the warning doesn't give much info about where I am setting this invalid attribute.
This was good enough to find it in the code.
```
BoundedAttributes __setitem__ key=since_token value=StreamToken(room_key=RoomStreamToken(topological=None, stream=1787, instance_map=frozendict.frozendict({})), presence_key=481272, typing_key=0, receipt_key=340, account_data_key=1233, push_rules_key=8, to_device_key=57, device_list_key=199, groups_key=0)

BoundedAttributes __setitem__ key=now_token value=StreamToken(room_key=RoomStreamToken(topological=None, stream=1787, instance_map=frozendict.frozendict({})), presence_key=481287, typing_key=0, receipt_key=340, account_data_key=1233, push_rules_key=8, to_device_key=57, device_list_key=199, groups_key=0)

BoundedAttributes __setitem__ key=token value=StreamToken(room_key=RoomStreamToken(topological=None, stream=1787, instance_map=frozendict.frozendict({})), presence_key=481291, typing_key=0, receipt_key=340, account_data_key=1237, push_rules_key=8, to_device_key=57, device_list_key=199, groups_key=0)
```
2022-07-29 00:25:03 -05:00
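One way to avoid the invalid-attribute warning quoted in the commit above is to coerce values before setting them on a span. A minimal sketch: `to_attribute_value` is a hypothetical helper, not part of Synapse or OpenTelemetry, and falling back to `str()` is an assumption about what is acceptable for debugging. The list of allowed types comes from the warning text.

```python
from typing import Any, Sequence, Union

# Primitive types named in the warning message quoted above.
_PRIMITIVES = (bool, str, bytes, int, float)

AttributeValue = Union[bool, str, bytes, int, float, Sequence[Any]]


def to_attribute_value(value: Any) -> AttributeValue:
    """Hypothetical helper: pass allowed values through, stringify the rest."""
    if isinstance(value, _PRIMITIVES):
        return value
    if isinstance(value, (list, tuple)) and all(
        isinstance(v, _PRIMITIVES) for v in value
    ):
        return list(value)
    # Anything else (e.g. a StreamToken-like object) is stored as its string form.
    return str(value)
```

A call site would then wrap the value, for example `to_attribute_value(since_token)`, before handing it to the span.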
Eric Eastwood
1b0840e3aa Fix some lints 2022-07-28 19:43:43 -05:00
Eric Eastwood
3a259960fb Fixup some todos 2022-07-28 00:18:08 -05:00
Eric Eastwood
f6c3b22a21 Fix some lints 2022-07-27 16:49:00 -05:00
Eric Eastwood
9e1de8696c We use the config for the Jaeger exporter now 2022-07-27 12:52:51 -05:00
Eric Eastwood
0d7a2b93cf Revert changes to Sentry scopes (not OTEL)
See https://github.com/matrix-org/synapse/pull/13400#discussion_r931325627
2022-07-27 12:52:10 -05:00
Eric Eastwood
242817213e Export to Jaeger (things are showing up) 2022-07-27 02:36:10 -05:00
Eric Eastwood
6406fd5d84 Server running 2022-07-27 01:12:48 -05:00
Eric Eastwood
6984cefa79 Progress towards OTEL 2022-07-27 00:55:43 -05:00
Eric Eastwood
2fe6911957 Some shim and some new 2022-07-26 21:53:11 -05:00
Eric Eastwood
0cc610ecbe Migrate to OpenTelemetry tracing
See https://github.com/matrix-org/synapse/issues/11850
2022-07-26 18:44:21 -05:00
Eric Eastwood
357561c1a2
Backfill remote event fetched by MSC3030 so we can paginate from it later (#13205)
Depends on https://github.com/matrix-org/synapse/pull/13320

Complement tests: https://github.com/matrix-org/complement/pull/406

We could use the same method to backfill for `/context` as well in the future, see https://github.com/matrix-org/synapse/issues/3848
2022-07-22 16:00:11 -05:00
Richard van der Hoff
c7c84b81e3
Update config_documentation.md (#13364)
"changed in" goes before the example
2022-07-22 13:50:20 +01:00
Sean Quah
0fa41a7b17
Update locked frozendict version to 2.3.3 (#13352)
frozendict 2.3.3 includes fixes for memory leaks that get triggered during `/sync`.
2022-07-22 10:26:09 +01:00
Sean Quah
158782c3ce
Skip soft fail checks for rooms with partial state (#13354)
When a room has the partial state flag, we may not have an accurate
`m.room.member` event for event senders in the room's current state, and
so cannot perform soft fail checks correctly. Skip the soft fail check
entirely in this case.

As an alternative, we could block until we have full state, but that
would prevent us from receiving incoming events over federation, which
is undesirable.

Signed-off-by: Sean Quah <seanq@matrix.org>
2022-07-22 10:13:01 +01:00
Nick Mills-Barrett
86e366a46e
Remove old empty/redundant slaved stores. (#13349) 2022-07-21 17:56:45 +00:00
Erik Johnston
0b87eb8e0c
Make DictionaryCache have better expiry properties (#13292) 2022-07-21 17:13:44 +01:00
Erik Johnston
13341dde5a
Don't hold onto full state in state cache (#13324) 2022-07-21 16:02:02 +01:00
Brendan Abolivier
10e4093839
Call out buildkit is required when building test docker images (#13338)
Co-authored-by: David Robertson <davidr@element.io>
2022-07-21 14:29:58 +02:00
David Robertson
34949ead1f
Track DB txn times w/ two counters, not histogram (#13342) 2022-07-21 13:23:05 +01:00
Patrick Cloke
50122754c8
Add missing types to opentracing. (#13345)
After this change `synapse.logging` is fully typed.
2022-07-21 12:01:52 +00:00
Nick Mills-Barrett
190f49d8ab
Use cache store remove base slaved (#13329)
This comes from two identical definitions in each of the base stores, and means the base slaved store is now empty and can be removed.
2022-07-21 11:51:30 +01:00
David Robertson
4f57ef0b18
Merge branch 'master' into develop 2022-07-21 11:27:08 +01:00
David Teller
b909d5327b
Document rc_invites.per_issuer, added in v1.63.
Resolves #13330.
Missed in #13125.

Signed-off-by: David Teller <davidt@element.io>
2022-07-21 11:26:34 +01:00
Eric Eastwood
0f971ca68e
Update get_pdu to return the original, pristine EventBase (#13320)
Update `get_pdu` to return the untouched, pristine `EventBase` as it was originally seen over federation (no metadata added). Previously, we returned the same `event` reference that we stored in the cache, which downstream code modified in place (adding metadata such as marking it as an `outlier`), essentially poisoning our cache. Now we always return a copy of the `event` so the original stays pristine in our cache and can be reused for the next cache call.

Split out from https://github.com/matrix-org/synapse/pull/13205

As discussed at:

 - https://github.com/matrix-org/synapse/pull/13205#discussion_r918365746
 - https://github.com/matrix-org/synapse/pull/13205#discussion_r918366125

Related to https://github.com/matrix-org/synapse/issues/12584. This PR doesn't fix that issue because it hits [`get_event` which exists from the local database before it tries to `get_pdu`](7864f33e28/synapse/federation/federation_client.py (L581-L594)).
2022-07-20 15:58:51 -05:00
Shay
a1b62af2af
Validate federation destinations and log an error if server name is invalid. (#13318) 2022-07-20 11:17:26 -07:00