Commit graph

35 commits

Author SHA1 Message Date
Patrick Cloke
682151a464
Do not fail completely if oEmbed autodiscovery fails. (#15092)
Previously if an autodiscovered oEmbed request failed (e.g. the
oEmbed endpoint is down or does not exist) then the entire URL
preview would fail. Instead we now return everything we can, even
if this additional request fails.
2023-02-23 16:08:53 -05:00
dependabot[bot]
9bb2eac719
Bump black from 22.12.0 to 23.1.0 (#15103) 2023-02-22 15:29:09 -05:00
Patrick Cloke
148fe58a24
Do not break URL previews if an image is unreachable. (#12950)
Avoid breaking a URL preview completely if the chosen image 404s
or is unreachable for some other reason (e.g. DNS).
2022-06-06 07:46:04 -04:00
Brendan Abolivier
f96b85eca8
Ensure the type of URL attributes is always str when matching against preview blacklist (#12333) 2022-03-31 11:49:49 +02:00
Dirk Klimpel
32c828d0f7
Add type hints to tests/rest. (#12208)
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
2022-03-11 12:42:22 +00:00
Denis Kasak
337f38cac3
Implement a content type allow list for URL previews (#11936)
This implements an allow list for content types for which Synapse will attempt URL preview. If a URL resolves to a resource with a content type which isn't in the list, the download will terminate immediately.

This makes sense given that Synapse would never successfully generate a URL preview for such files in the first place, and helps prevent issues with streaming media servers, such as #8302.

Signed-off-by: Denis Kasak dkasak@termina.org.uk
2022-02-10 15:43:01 +00:00
Patrick Cloke
807efd26ae
Support rendering previews with data: URLs in them (#11767)
Images which are data URLs will no longer break URL
previews and will properly be "downloaded" and
thumbnailed.
2022-01-24 08:58:18 -05:00
Patrick Cloke
eb39da6782
Move HTML parsing to a separate file for URL previews. (#11566)
* Splits the logic for parsing HTML from the resource handling code.
* Fix a circular import in the oEmbed code (which uses the HTML parsing code).
* Renames some of the HTML parsing methods to:
  * Make it clear which methods are "internal" to the module.
  * Clarify what the methods do.
2021-12-13 17:55:07 +00:00
Patrick Cloke
1b112840d2
Autodiscover oEmbed endpoint from returned HTML (#10822)
Searches the returned HTML for an oEmbed endpoint using the
autodiscovery mechanism (`<link rel=...>`), and will request it
to generate the preview.
2021-10-08 14:14:42 -04:00
Sean Quah
2be0fde3d6
Fix empty url_cache_thumbnails/yyyy-mm-dd/ directories being left behind (#10924) 2021-09-29 10:24:37 +01:00
Sean Quah
f7768f62cb
Avoid storing URL cache files in storage providers (#10911)
URL cache files are short-lived and it does not make sense to offload
them (eg. to the cloud) or back them up.
2021-09-27 12:55:27 +01:00
Patrick Cloke
6fc8be9a1b
Include more information in oEmbed previews. (#10819)
* Improved titles (fall back to the author name if there's not title) and include the site name.
* Handle photo/video payloads.
* Include the original URL in the Open Graph response.
* Fix the expiration time (by properly converting from seconds to milliseconds).
2021-09-22 09:45:20 -04:00
Patrick Cloke
ba7a91aea5
Refactor oEmbed previews (#10814)
The major change is moving the decision of whether to use oEmbed
further up the call-stack. This reverts the _download_url method to
being a "dumb" functionwhich takes a single URL and downloads it
(as it was before #7920).

This also makes more minor refactorings:

* Renames internal variables for clarity.
* Factors out shared code between the HTML and rich oEmbed
  previews.
* Fixes tests to preview an oEmbed image.
2021-09-21 16:09:57 +00:00
Patrick Cloke
580a15e039
Request JSON for oEmbed requests (and ignore XML only providers). (#10759)
This adds the format to the request arguments / URL to
ensure that JSON data is returned (which is all that
Synapse supports).

This also adds additional error checking / filtering to the
configuration file to ignore XML-only providers.
2021-09-08 07:17:52 -04:00
Patrick Cloke
e2481dbe93
Allow configuration of the oEmbed URLs. (#10714)
This adds configuration options (under an `oembed` section) to
configure which URLs are matched to use oEmbed for URL
previews.
2021-08-31 18:37:07 -04:00
Jonathan de Jong
4b965c862d
Remove redundant "coding: utf-8" lines (#9786)
Part of #9744

Removes all redundant `# -*- coding: utf-8 -*-` lines from files, as python 3 automatically reads source code as utf-8 now.

`Signed-off-by: Jonathan de Jong <jonathan@automatia.nl>`
2021-04-14 15:34:27 +01:00
Patrick Cloke
0b3112123d
Use mock from the stdlib. (#9772) 2021-04-09 13:44:38 -04:00
Richard van der Hoff
8d3d264052
Skip unit tests which require optional dependencies (#9031)
If we are lacking an optional dependency, skip the tests that rely on it.
2021-01-07 11:41:28 +00:00
Richard van der Hoff
394516ad1b Remove spurious "SynapseRequest" result from `make_request"
This was never used, so let's get rid of it.
2020-12-15 22:35:40 +00:00
Richard van der Hoff
f347f0cd58
remove unused FakeResponse (#8864) 2020-12-02 18:58:25 +00:00
Richard van der Hoff
129ae841e5 Make make_request actually render the request
remove the stubbing out of `request.process`, so that `requestReceived` also renders the request via the appropriate resource.

Replace render() with a stub for now.
2020-11-16 18:24:00 +00:00
Richard van der Hoff
1f41422c98 Fix the URL in the URL preview tests
the preview resource is mointed at preview_url, not url_preview
2020-11-16 18:24:00 +00:00
Patrick Cloke
c619253db8
Stop sub-classing object (#8249) 2020-09-04 06:54:56 -04:00
Patrick Cloke
3fc8fdd150
Support oEmbed for media previews. (#7920)
Fixes previews of Twitter URLs by using their oEmbed endpoint to grab content.
2020-07-27 07:50:44 -04:00
Andrew Morgan
a48138784e
Allow specifying the value of Accept-Language header for URL previews (#7265) 2020-04-15 13:35:29 +01:00
Richard van der Hoff
e78167c94b
Apply suggestions from code review
Co-Authored-By: Brendan Abolivier <babolivier@matrix.org>
Co-Authored-By: Erik Johnston <erik@matrix.org>
2019-11-05 16:46:39 +00:00
Richard van der Hoff
e9bfe719ba Strip overlong OpenGraph data from url preview
... to stop people causing DoSes with malicious web pages
2019-11-05 15:51:18 +00:00
Amber Brown
0ee9076ffe Fix media repo breaking (#5593) 2019-07-02 19:01:28 +01:00
Amber Brown
32e7c9e7f2
Run Black. (#5482) 2019-06-20 19:32:02 +10:00
Amber Brown
df2ebd75d3
Migrate all tests to use the dict-based config format instead of hanging items off HomeserverConfig (#5171) 2019-05-13 15:01:14 -05:00
Andrew Morgan
2f48c4e1ae
URL preview blacklisting fixes (#5155)
Prevents a SynapseError being raised inside of a IResolutionReceiver and instead opts to just return 0 results. This thus means that we have to lump a failed lookup and a blacklisted lookup together with the same error message, but the substitute should be generic enough to cover both cases.
2019-05-10 10:32:44 -07:00
Amber Brown
ea6abf6724
Fix IP URL previews on Python 3 (#4215) 2018-12-22 01:56:13 +11:00
Richard van der Hoff
828f18bd8b Fix logcontext leak in test_url_preview 2018-11-19 17:07:01 +00:00
Amber Brown
df758e155d
Use <meta> tags to discover the per-page encoding of html previews (#4183) 2018-11-15 11:05:08 -06:00
Amber Brown
b3708830b8
Fix URL preview bugs (type error when loading cache from db, content-type including quotes) (#4157) 2018-11-08 01:37:43 +11:00